
GRETA is an instrument for nuclear physics that sends data to supercomputers for quick analysis. credit: Jason Smith/ORNL, Berkeley Lab, U.S. Dept of Energy
The U.S. Department of Energy’s next-generation leadership computing strategy features the Integrated Research Infrastructure (IRI), which aims to link the DOE’s experimental and HPC resources.
A potential core capability of the IRI, and the focus of an article posted by Lawrence Berkeley National Laboratory, is the DELERIA (Distributed Event Level Experiment Readout and Analysis) software platform for streaming large amounts of data at high speeds from one lab to another.
Researchers at Berkeley and Oak Ridge labs report that DELERIA is successfully sending data from data acquisition to a high-performance computing facility and back, for analysis on near-real-time, interactive timescales.
DELERIA is a testbed project that marks a departure from DOE’s traditional supercomputing resource planning. In the past, computing clusters were required to be on-premises and near the science instruments, but collaborative efforts in the last five years among U.S. supercomputing facilities and ESnet have laid the groundwork to connect and deliver information at unprecedented rates. Another collaborative effort to connect scientific instruments to computing clusters in real time is the ESnet-JLab FPGA Accelerated Transport device (EJFAT), a networking hardware prototype developed by ESnet and Thomas Jefferson National Accelerator Facility (JLab).
“One of the problems we face is how to process the data quickly so that we can directly give feedback to the experimenter with what’s going on in their experiment,” said Mario Cromaz, a staff scientist in the Nuclear Science Division at Berkeley Lab. “Instead of running an experiment, collecting and storing data, and carrying out complex data analysis later on, the results can be obtained in real time.”
The project’s current focus is to send live data from the gamma-ray detectors of the Gamma-Ray Energy Tracking Array (GRETA). This data flows from the instrument at Berkeley Lab in California through the Energy Sciences Network (ESnet), the DOE’s dedicated scientific research network, to the Advanced Computing Ecosystem (ACE) testbed at ORNL’s Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science user facility in Tennessee. The ACE testbed provides developers with an HPC platform to test existing and emerging technologies without interfering with ongoing research.
Initial tests involving streaming data from 12 GRETA detectors have been successful, and the team is now scaling up. Processed data is returned within seconds, providing near-real-time results. “The amount of data coming from detectors like this is absolutely mind-boggling, and it will only increase going forward,” said DELERIA co-developer Gustav Jansen, a computational nuclear physicist at ORNL. “The only viable solution in the long term is to connect experimental and computational facilities, and this collaboration is creating a solution that helps us understand what is required at OLCF to support it.”
According to Jansen, the big problem with moving data across the country is latency. Even when traveling at the speed of light, it still takes almost 10 times longer to transmit a single data event than it does to process it. “Because we can’t bypass the speed of light, the solution is to trick it. Instead of running one event at a time, we run events in parallel, so that we can process one while others are being transferred. This provides a 10x speedup and ensures the computing cluster is always busy,” Jansen said.
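The latency-hiding idea Jansen describes can be illustrated with a toy model. The Python sketch below is not DELERIA code. It assumes a single remote worker, a fixed network delay of 10 arbitrary time units per event, a processing cost of 1 unit per event, and a hypothetical `in_flight` parameter that caps how many events may be outstanding at once. All names and numbers are illustrative assumptions.

```python
def total_time(n_events, transfer, process, in_flight):
    """Time to finish n_events on one remote worker.

    transfer  : network delay before an event reaches the worker (assumed fixed)
    process   : time the worker spends on one event
    in_flight : how many events may be outstanding at once (hypothetical knob)
    """
    done = []            # completion time of each event
    worker_free = 0.0    # when the worker can start its next event
    for i in range(n_events):
        # A new event is sent only once an in-flight slot has been freed.
        send = 0.0 if i < in_flight else done[i - in_flight]
        arrive = send + transfer
        start = max(arrive, worker_free)
        finish = start + process
        worker_free = finish
        done.append(finish)
    return done[-1]


if __name__ == "__main__":
    one_at_a_time = total_time(1000, transfer=10, process=1, in_flight=1)
    overlapped = total_time(1000, transfer=10, process=1, in_flight=11)
    print(one_at_a_time, overlapped, one_at_a_time / overlapped)
```

In this model, sending one event at a time leaves the worker idle during every transfer, while keeping 11 events in flight means a new event is always waiting when the worker finishes the previous one. With these assumed numbers the overlapped run is roughly an order of magnitude faster, consistent in spirit with the speedup Jansen cites.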
Testing 1, 2, 3
Heather Crawford, a senior staff scientist in the Nuclear Science Division and deputy project director of GRETA, has spent nearly a decade fine-tuning the instrument for experiments in the fast lane to supercomputing analysis. Like a microscope, GRETA will enable scientists like Crawford to look deep inside the nuclei of atoms to understand their structure and how they react to different external stimuli. This information yields insights into the fundamental building blocks of our world and the forces that make matter exist.
Scheduled for phase-I completion this summer, GRETA is on track to be the world’s most powerful gamma-ray reading instrument, so the current experiments with DELERIA are a test that will guide experimental setups in the future. Once complete, GRETA will operate at the Facility for Rare Isotope Beams (FRIB) in Michigan, and Crawford and the team are excited to leverage the power of the DELERIA project to accelerate their experiments. Prior to DELERIA, the team used an onsite computing cluster to perform data analysis. “But this is a pretty computationally intensive procedure,” Crawford said.
Now, with DELERIA, the team can choose to use a remote high-performance computing facility where more complex analysis can be carried out in real time. This means that GRETA scientists will be able to send individual physics events at very fast rates and process them on interactive timescales across multiple sites.
High-Performance Capabilities
The innovation behind DELERIA lies in its unique approach and architecture, co-designed by ESnet engineers. Specifically, forward buffers allow information to be easily collected from the experiment’s electronics and transmitted across ESnet to an offsite computing facility using a messaging protocol that allows high-speed data transfer without losing information. Software containers, in turn, package the software so that it can be deployed consistently, quickly, and at scale across multiple systems.
Ultimately, these innovations enable scientists to run their experiments and computing functions faster while delivering results in real time. “DELERIA now lets us take advantage of the extensive and significant computational advantage at the remote facilities,” Crawford said. “Now we get to think about, ‘How can we do this even better?’” With DELERIA, data moves at about 40 gigabits every second across the ESnet testbed network. To put that speed in perspective, it’s like moving a two-hour HD Netflix movie every second.
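To make the quoted rate concrete, the short Python calculation below converts 40 gigabits per second into bytes. It uses decimal units (1 GB = 1000 MB) and assumes, purely for illustration, that the rate is sustained around the clock, which the article does not claim. The function names are made up for this example.

```python
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 24 * 60 * 60


def gigabytes_per_second(gigabits_per_second):
    """Convert a link rate from gigabits/s to gigabytes/s."""
    return gigabits_per_second / BITS_PER_BYTE


def terabytes_per_day(gigabits_per_second):
    """Data volume per day if the rate were sustained (an assumption)."""
    return gigabytes_per_second(gigabits_per_second) * SECONDS_PER_DAY / 1000


def implied_movie_gb_per_hour(gigabits_per_second, movie_hours=2):
    """Size per hour of film implied by 'one movie per second'."""
    return gigabytes_per_second(gigabits_per_second) / movie_hours


if __name__ == "__main__":
    rate = 40  # gigabits per second, as quoted in the article
    print(gigabytes_per_second(rate))       # 5.0 GB every second
    print(terabytes_per_day(rate))          # 432.0 TB per day if sustained
    print(implied_movie_gb_per_hour(rate))  # 2.5 GB per hour of film
```

So the movie comparison corresponds to a two-hour film of about 5 GB, or 2.5 GB per hour of video.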
Connecting GRETA to the distant supercomputing testbed is ESnet, a high-performance, high-speed network used by tens of thousands of scientific researchers across the DOE complex. After the many data points are analyzed by the OLCF supercomputers, the results are sent back in a condensed readout to the scientists as they iterate on their work in real time. “Science is no longer something that is done only in the lab, it is all connected together,” said Eric Pouyoul, ESnet’s software engineer for DELERIA. “It doesn’t matter where the data is, because with ESnet, it can go anywhere.”
Science Applications
The teams involved are keeping an eye on a bigger perspective beyond the GRETA project. “The goal is to make this available to other sciences,” said Pouyoul. “The way that the DELERIA computing system has been designed, architected, and implemented leads us to believe this is possible.”
The teams have already identified many potential applications within and beyond nuclear science. Equipped with real-time feedback on a fusion shot, scientists could update diagnostic parameters for the next shot. For light source users, real-time analysis could guide the experimental settings used to illuminate a sample, enabling optimal use of time on the beamline.
source: Ashleigh Papp, science writer, Lawrence Berkeley National Laboratory