E-CAS Project to Explore Clouds for Acceleration of Science

Can the Cloud power ground-breaking research? A new NSF-funded research project aims to provide a deeper understanding of the use of cloud computing in accelerating scientific discoveries.

First announced in 2018, the Exploring Clouds for Acceleration of Science (E-CAS) project has now selected six research proposals that will explore how scientific workflows can leverage advances in real-time analytics, artificial intelligence, machine learning, accelerated processing hardware, automated deployment and scaling, and serverless application management across a wider range of science.

“More recent advancements in cloud offerings allow researchers to explore unique ways of processing huge amounts of data with highly complex inter-relationships using high-throughput computational methods and machine learning systems,” said Howard Pfeffer, president and CEO of Internet2 and principal investigator on the E-CAS project. “We’re excited to support these important and timely scientific research projects as we collectively explore how advancements in commercial clouds can better support the work of researchers and the higher education community.”

The E-CAS project will engage researchers from George Washington University, the Massachusetts Institute of Technology, Purdue University, the San Diego Supercomputer Center, the State University of New York, and the University of Wisconsin, supported by resources from Amazon Web Services and Google Cloud.

The successful proposals for the year-long first phase of the E-CAS project are:

  • Development of BioCompute Objects for Integration into Galaxy in a Cloud Computing Environment, Raja Mazumder, George Washington University. BioCompute objects allow researchers to describe bioinformatic analyses comprising any number of algorithmic steps and variables, making computational results clearly understandable and easier to reproduce (a simplified sketch of such an object appears after this list). This project will create a library of BioCompute objects describing bioinformatic workflows on AWS, which users of the widely used bioinformatics platform Galaxy can access and contribute to.
  • Investigating Heterogeneous Computing at the Large Hadron Collider, Philip Harris, Massachusetts Institute of Technology. Only a small fraction of the 40 million collisions per second at the Large Hadron Collider is stored and analyzed, because of the huge volumes of data and the compute power required to process them. This project proposes redesigning the selection algorithms with modern machine learning techniques that can run on heterogeneous computing systems, allowing more data to be processed and thus a larger physics output, with the potential for foundational discoveries in the field (a toy event-filtering sketch follows this list).
  • Building Clouds: Worldwide building typology modelling from images, Daniel Aliaga and Dev Niyogi, Purdue University. This project will use cloud computational power and network connectivity to provide a world-scalable solution for generating building-level information for urban canopy parameters, and for improving the information used to estimate local climate zones; both are critical to high-resolution urban meteorological and environmental models.
  • Accelerating Science by Integrating Commercial Cloud Resources in the CIPRES Science Gateway, Mark Miller, San Diego Supercomputer Center. CIPRES is a web portal that lets scientists around the world analyze DNA and protein sequence data by providing access to parallel phylogenetics codes run on large high-performance computing (HPC) clusters provided by the NSF-funded eXtreme Science and Engineering Discovery Environment (XSEDE) program; it currently runs analyses for about 12,000 scientists per year. This project will develop the infrastructure needed to cloudburst CIPRES jobs to newer, faster V100 GPUs at AWS (the overflow pattern is sketched after this list). As a result, individual jobs will run up to 1.8-fold faster, and users will have access to twice as many GPU nodes as they did in the previous year.
  • Deciphering the Brain’s Neural Code Through Large-Scale Detailed Simulation of Motor Cortex Circuits, William Lytton, State University of New York. This project aims to help decipher the brain’s neural coding mechanisms, with far-reaching applications including treatments for brain disorders, brain-machine interfaces for people with paralysis, and novel artificial intelligence algorithms. Using a software tool for brain modeling, researchers will run thousands of parallelized simulations exploring different conditions and inputs to the system (a minimal parameter-sweep sketch follows this list).
  • IceCube computing in the cloud, Benedikt Riedel, University of Wisconsin. The IceCube Neutrino Observatory, located at the South Pole, supports science across a number of disciplines including astrophysics, particle physics, and the geographical sciences; it operates continuously and is simultaneously sensitive to the whole sky. This project aims to burst into the cloud to support follow-up computations of observed events, as well as alerts to and from the research community, such as other telescopes and LIGO.
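
To make the BioCompute idea in the first item concrete, the sketch below shows what such a machine-readable workflow description might look like. This is a heavily simplified illustration in Python, not the official BioCompute schema (which defines more required domains and stricter fields); every identifier and field value here is hypothetical.

```python
import json

# Illustrative only: a heavily simplified record in the spirit of a
# BioCompute object. The real specification has more required domains
# and stricter field definitions than shown here.
bco = {
    "object_id": "https://example.org/bco/hypothetical-0001",  # hypothetical ID
    "provenance_domain": {
        "name": "Example variant-calling workflow",
        "version": "1.0",
        "contributors": [{"name": "Jane Doe", "contribution": ["authoredBy"]}],
    },
    "description_domain": {
        # Each step records the tool and version, so another researcher
        # can re-run the analysis with the same parameters.
        "pipeline_steps": [
            {"step_number": 1, "name": "bwa mem", "version": "0.7.17"},
            {"step_number": 2, "name": "gatk HaplotypeCaller", "version": "4.1"},
        ],
    },
    "io_domain": {
        "input_subdomain": [{"uri": "s3://example-bucket/reads.fastq"}],
        "output_subdomain": [{"uri": "s3://example-bucket/variants.vcf"}],
    },
}

print(json.dumps(bco, indent=2))
```

Because the object is plain structured data, it can be published, versioned, and re-executed independently of the platform that produced it, which is what makes a shared library of them useful in Galaxy.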
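The event-selection problem in the MIT item can be illustrated with a toy example. The following sketch is purely illustrative and is not LHC trigger software: a made-up logistic classifier scores fake collision events, and only high-scoring events are kept, mimicking how an ML-based selection reduces the data rate before storage.

```python
# Toy event-filtering sketch (not LHC software): an ML-style classifier
# scores each simulated "collision" and only high-scoring events are
# kept. The classifier is a made-up logistic score over two fake features.
import math
import random

random.seed(0)

def classifier_score(pt: float, n_jets: int) -> float:
    """Stand-in for a trained model running on accelerator hardware."""
    z = 0.02 * pt + 0.8 * n_jets - 6.0   # made-up weights
    return 1.0 / (1.0 + math.exp(-z))    # logistic output in (0, 1)

events = [
    {"pt": random.uniform(0, 300), "n_jets": random.randint(0, 6)}
    for _ in range(10_000)
]

# Keep only events the classifier flags as interesting.
kept = [e for e in events if classifier_score(e["pt"], e["n_jets"]) > 0.9]
print(f"kept {len(kept)} of {len(events)} events "
      f"({100 * len(kept) / len(events):.2f}%)")
```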
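The cloudbursting pattern in the CIPRES item amounts to a simple overflow policy. Here is a minimal Python sketch under stated assumptions: the Job class, local_queue_depth, and the submit helpers are all hypothetical stand-ins, not CIPRES or XSEDE APIs.

```python
# Minimal cloudbursting sketch (not CIPRES code): overflow jobs from a
# local HPC queue to a cloud GPU pool when the local queue gets deep.
# All names here (Job, local_queue_depth, submit_*) are hypothetical.
from dataclasses import dataclass

LOCAL_QUEUE_LIMIT = 50  # assumed threshold; a real gateway would tune this

@dataclass
class Job:
    job_id: str
    needs_gpu: bool

def local_queue_depth() -> int:
    """Stand-in for querying the HPC scheduler's queue."""
    return 62  # pretend the local queue is backed up

def submit_local(job: Job) -> None:
    print(f"{job.job_id}: submitted to local HPC cluster")

def submit_cloud(job: Job) -> None:
    print(f"{job.job_id}: burst to cloud GPU instance")

def route(job: Job) -> None:
    # Burst GPU jobs to the cloud only when the local queue is saturated,
    # so the cheaper local resource is always preferred.
    if job.needs_gpu and local_queue_depth() > LOCAL_QUEUE_LIMIT:
        submit_cloud(job)
    else:
        submit_local(job)

route(Job("raxml-001", needs_gpu=True))
```

The design choice worth noting is that the gateway, not the user, decides where a job lands, so cloud capacity appears to users simply as shorter queues.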
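The thousands of parallelized simulations in the motor-cortex item follow a familiar parameter-sweep pattern: each combination of condition and input is an independent run. A minimal sketch, with a toy run_simulation standing in for the real neural model:

```python
# Minimal parameter-sweep sketch: run many independent simulations in
# parallel, one per (stimulus, synaptic weight) combination. The
# run_simulation function is a toy stand-in for a real neural model.
import itertools
from multiprocessing import Pool

def run_simulation(params):
    stimulus, weight = params
    # A real model would integrate membrane equations here;
    # we just return a fake firing rate for illustration.
    firing_rate = stimulus * weight * 10.0
    return (stimulus, weight, firing_rate)

if __name__ == "__main__":
    stimuli = [0.5, 1.0, 1.5]
    weights = [0.01, 0.02, 0.04]
    grid = list(itertools.product(stimuli, weights))

    # Each combination is independent, so the sweep parallelizes trivially,
    # whether on a laptop with Pool or across thousands of cloud cores.
    with Pool() as pool:
        for stim, w, rate in pool.map(run_simulation, grid):
            print(f"stimulus={stim:.1f} weight={w:.2f} -> rate={rate:.2f} Hz")
```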

“NSF is thrilled to see the scientific diversity and potential among the selected projects. We look forward to the progress of these six projects over the next year as they investigate the viability and effectiveness of commercial clouds as an option for leading-edge research computing and computational science in a range of areas,” said Manish Parashar, Director of NSF’s Office of Advanced Cyberinfrastructure (OAC).

Following the completion of the first phase of these six research projects, two final projects will be selected in July 2020 for another year of support, with a focus on delivering scientific results. Each phase of the project will be followed by a community-led workshop to assess lessons learned and to define leading practices.
