ALCF Data Science Program Awards 4 Projects


The Argonne Leadership Computing Facility (ALCF) recently awarded computing time and resources to three new projects and one renewed project for 2021-2022 through its ALCF Data Science Program (ADSP).

Launched in 2016, the ADSP enables big data and artificial intelligence (AI) research that requires DOE’s leadership-class computing resources. The forward-looking allocation program is designed to explore and improve computational methods for data-driven discoveries across scientific disciplines. It also focuses on scaling the underlying data science technologies to fully utilize DOE supercomputers. The ALCF is a U.S. Department of Energy (DOE) Office of Science User Facility at DOE’s Argonne National Laboratory.

The new projects — which aim to accelerate autonomous molecular design, data analysis in neutrino experiments, and sky survey discovery — extract science from a range of unique data sources. The project selected for renewal will address challenges in fast, high-resolution X-ray imaging at the Advanced Photon Source (APS), a DOE Office of Science User Facility located at Argonne. Each project will employ leadership-class systems and infrastructure to develop and advance data science techniques, with novel approaches to machine learning, deep learning, and other cutting-edge AI methods.

“This year’s ADSP awards advance the use of artificial intelligence on ALCF systems beyond standalone networks to multi-network workflows integrated in scientific analysis chains,” said Taylor Childers, ALCF research scientist and co-lead of the ADSP this year. “In addition, unsupervised techniques are targeting our upcoming system Polaris, which is ideal for deep learning applications and will serve as a testbed for our future exascale supercomputer, Aurora.”

ADSP awards run for two years and are renewed annually.

New ADSP Projects

Autonomous Molecular Design for Redox Flow Batteries
PI: Logan Ward, Argonne National Laboratory

Redox flow batteries can easily be scaled up to store large amounts of energy, making them a promising technology for electrical grid storage. The batteries work by storing energy in large tanks of electrolyte solutions, but they are currently limited by the performance of available electrolyte materials. With tens of millions of potential candidate molecules, scientists need an improved method to speed the discovery of optimal materials for redox flow batteries. The goal of this project is to build an autonomous AI application for supercomputers that can select and perform the simulation and machine learning tasks needed to identify better-performing molecules. Achieving this goal will require scaling individual tasks, such as computing material properties and training AI models, and then combining them into a cohesive application that will remove humans from the materials design process.

Machine Learning for Data Reconstruction to Accelerate Physics Discoveries in Accelerator-Based Neutrino Oscillation Experiments
PI: Marco Del Tutto, Fermi National Accelerator Laboratory (Fermilab)

The liquid argon time projection chamber (LArTPC) is an imaging detector that can record charged particle trajectories at sub-millimeter spatial resolution. It allows scientists to measure neutrino interactions with high precision, making it the detector of choice for current and future accelerator neutrino experiments, including Fermilab’s Short-Baseline Neutrino Program and Deep Underground Neutrino Experiment. A major goal of this project is to accelerate the analysis workflow in LArTPC experiments by orders of magnitude by deploying the first machine learning-based full reconstruction chain on a high-performance computing (HPC) system. The optimization of a traditional data reconstruction pipeline in these experiments is done “by hand,” and can take months to years each time researchers need to reprocess the whole dataset. The team’s goal is to reduce this process to hours using the ALCF’s upcoming Polaris system. This effort will accelerate the analysis pipeline, perhaps even enabling a full physics analysis online, allowing for more frequent and deeper inference of physics insights from experimental data.

Learning Optimal Image Representations for Current and Future Sky Surveys
PI: George Stein, Lawrence Berkeley National Laboratory

Sky surveys are the largest data generators in astronomy, imaging vast numbers of galaxies at high resolutions. To date, machine learning investigations of sky-survey data have produced many high-impact results, including the detection of numerous strongly gravitationally lensed systems and the classification of millions of galaxies. However, existing methods used in the field of astrophysics suffer from the standard limitations of supervised learning: they require extensive compute resources and development time to target singular objectives, and their performance is limited by the small amount of labeled data available for training. With this ADSP project, the team will use their recently developed self-supervised learning framework to extract meaningful representations from galaxy images in the Dark Energy Camera Legacy Survey dataset, providing a scalable data-driven approach capable of learning from unlabeled data. The team’s work aims to serve the broader community by accelerating sky survey discoveries following the release of image representations, trained models, and software. Researchers will be able to simply download the low-dimensional representations of galaxies to perform scientific analysis, or use the team’s pre-trained model and quickly fine-tune it to carry out a specific task.

Renewed ADSP Project

Dynamic Compressed Sensing for Real-Time Tomographic Reconstruction
PI: Robert Hovden, University of Michigan

Using electron and X-ray tomography to perform 3D characterization of materials at the nano- and mesoscale is important to the development of a wide range of applications, including solar cells and semiconductor devices. To overcome experimental limitations and improve image quality in materials characterization research, researchers are leveraging recent advancements in tomographic reconstruction algorithms, such as compressed sensing methods, to provide superior 3D resolution. In the first year of this ADSP project, researchers developed a dynamic tomography framework that uses compressed sensing algorithms to perform in-situ reconstruction while new data is being collected. In year two, the team will continue to conduct comprehensive simulations for real-time electron tomography and develop reconstruction methods for through-focal tomography, an approach that enhances resolution by combining images captured at different levels of focus. They will experimentally demonstrate the reconstruction workflow and methods on commercial scanning transmission electron microscopes and the ptychographic tomography instruments at the APS. By integrating their tool with an open-source 3D visualization and tomography software package, the team’s techniques will be accessible to a wide range of researchers and enable new material characterizations in academia and industry.

source: ALCF