This summer, the ALCF hosted over 40 college students to work on real-world computing projects in areas ranging from scientific visualization to materials science to digital twins.
Every summer, the Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science user facility at DOE’s Argonne National Laboratory, hosts a new group of students to explore high-performance computing (HPC) tools and topics and work on some of the world’s most powerful supercomputing resources available for open scientific research. The ALCF’s annual summer student program creates unique opportunities for young researchers, and possibly a pathway to future careers in DOE labs.
“These opportunities advance their skills beyond what they traditionally develop in the classroom, yet are essential for a successful career in science,” said Michael E. Papka, the director of ALCF and Argonne’s deputy associate laboratory director for computing, environment, and life sciences.
Through programs like DOE’s Science Undergraduate Laboratory Internship (SULI) and Argonne Research Aide Appointments, the students come from colleges and universities across the country to work alongside Argonne’s scientific and engineering staff on various research projects. This year, several students joined the ALCF and other Argonne divisions through the Sustainable Research Pathways program, a workforce development initiative designed to connect students and faculty from underrepresented groups with scientists from DOE national laboratories.
Some 40 undergraduate and graduate students tackled ALCF projects aimed at exploring scientific data in virtual reality, advancing X-ray imaging of brain tissue, improving application memory performance, and more. Some of the students’ work resulted in workshop papers and posters that will be presented at the prestigious SC23 conference.
We spoke to five students about what they worked on, and where they think the experience will take them next.
Improving Scheduler Algorithms
Rohan Basu Roy, a Ph.D. student in computer engineering at Northeastern University, worked on scheduling and resource management tools used in HPC and cloud computing systems like serverless platforms. He focused on schedulers used in ALCF systems, including Polaris and Theta, and his project aimed to reduce job queue wait times, improve job throughput and system utilization, and achieve fairness in job scheduling.
To do that, Roy designed a simulator that uses the system accounting logs to optimize the schedulers’ job sort formula. His tool was deployed on the ALCF’s Cooley visualization cluster, and several researchers are currently using it to run simulations that optimize several job scheduling metrics.
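As a rough illustration of the idea, and not a description of Roy’s tool, the Python sketch below replays a job log on a fixed-size cluster and reports the mean queue wait under a given sort formula. Everything in it is an assumption made for the example: the names `Job`, `simulate`, `fcfs`, and `shortest_first` are hypothetical, the three-job log is made up, and the model ignores backfilling, reservations, and requested walltimes that a production scheduler would account for.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    submit: float   # submission time, in hours (hypothetical log field)
    nodes: int      # number of nodes requested
    runtime: float  # run time in hours; stands in for requested walltime


def simulate(jobs, total_nodes, score):
    """Replay jobs on a cluster and return the mean queue wait in hours.

    score(job, now) is the sort formula: the highest score starts first.
    No backfilling: the top-ranked job blocks everything behind it.
    """
    pending = sorted(jobs, key=lambda j: j.submit)
    queue, running = [], []  # running is a heap of (end_time, nodes)
    free, now, i = total_nodes, 0.0, 0
    waits = []
    while i < len(pending) or queue:
        # Admit every job submitted by the current time.
        while i < len(pending) and pending[i].submit <= now:
            queue.append(pending[i])
            i += 1
        # Start jobs in score order until the top-ranked job does not fit.
        queue.sort(key=lambda j: score(j, now), reverse=True)
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            free -= job.nodes
            waits.append(now - job.submit)
            heapq.heappush(running, (now + job.runtime, job.nodes))
        # Advance the clock to the next submission or completion.
        next_times = []
        if i < len(pending):
            next_times.append(pending[i].submit)
        if running:
            next_times.append(running[0][0])
        if not next_times:
            break  # remaining jobs can never fit on this cluster
        now = max(now, min(next_times))
        # Release the nodes of every job that has finished by now.
        while running and running[0][0] <= now:
            _, nodes = heapq.heappop(running)
            free += nodes
    return sum(waits) / len(waits) if waits else 0.0


def fcfs(job, now):
    return -job.submit   # earliest submission first


def shortest_first(job, now):
    return -job.runtime  # shortest job first


# Made-up log: three full-machine jobs on a 4-node cluster.
jobs = [Job(0.0, 4, 10.0), Job(1.0, 4, 10.0), Job(2.0, 4, 1.0)]
print(simulate(jobs, 4, fcfs))            # 9.0
print(simulate(jobs, 4, shortest_first))  # 6.0
```

In this toy log, running the one-hour job ahead of the second ten-hour job cuts the mean wait from 9 to 6 hours. Comparing candidate formulas against real accounting logs in this way is the kind of trade-off such a simulator is meant to expose.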
“The ALCF has a long history of producing sophisticated HPC scheduler designs for heterogeneous computing systems and for various kinds of workloads,” said Roy. “Working at the ALCF is an ideal opportunity for anyone working with schedulers to test and experiment with their designed techniques.”
Roy realized that the techniques he designed while at Northeastern were extremely complicated, and that an easy-to-understand and scalable solution was the option that researchers most preferred in practice. Roy said this experience will help him to design and deploy better solutions in the future, and he hopes to continue to work with the ALCF, collaborating on papers and developing solutions to improve user experience.
Scientific Data Exploration in Virtual Reality
Sustainable Research Pathways program participant and undergraduate student Idunnuoluwa Adeniji spent her time at ALCF exploring various pipelines for integrating scientific data into virtual reality. Adeniji, who studies computational science and engineering at Kean University, came across a poster presentation given by Joseph Insley, ALCF visualization and data analytics team lead, during the pre-match process for the Sustainable Research Pathways program.
Adeniji worked with Insley and others to develop a virtual reality environment that displays particle data in a time-lapse format. “This will benefit ALCF users who want to use VR to discern patterns and clusters within their data,” said Adeniji.
Adeniji said that her time at the ALCF introduced her to the vast field of scientific visualization and helped hone her problem-solving skills. “I realized the importance of being a part of a like-minded community,” said Adeniji. She plans to undertake an independent study on virtual reality next semester and possibly pursue a Ph.D. in scientific visualization.
Memory Performance Analysis on ALCF Systems
Yumeng Liu, a Ph.D. student at Rice University, ran HPC benchmarks on ALCF systems to analyze the memory performance of applications running on GPU hardware. Liu worked with various application teams that are preparing to use the ALCF’s future exascale supercomputer, Aurora.
“Working with ALCF’s performance engineering team, I had the chance to help researchers make their applications more efficient, which is interesting to me,” said Liu. “This summer I learned the importance of not being afraid to ask for help.”
Liu created a microbenchmark that she ran on the Sunspot testbed, a testing and development system made from Aurora hardware. During her testing, she discovered some unexpected behaviors. To determine the root causes, Liu and the team used various tools to compare the theoretical performance metrics with their actual results. They then experimented with different configurations to validate their assumptions.
“Understanding these behaviors can help application teams avoid some previously unknown performance traps,” said Liu. “Furthermore, we can formalize the problems and build technologies to automatically detect such issues, and then apply them to a more general and broader community.”
Virtual Reality Digital Twins
Iowa State University mechanical engineering student James Morrissette collaborated with ALCF computer scientist Victor Mateevitsi, his former project mentor from Argonne’s ACT-SO (Afro-Academic, Cultural, Technological & Scientific Olympics) program.
Morrissette was initially interested in robotics research, and Mateevitsi introduced him to the concept of digital twins, which are advanced cyber-physical systems that link simulations to the physical world through an array of sensors. Digital twins allow researchers to explore experiments that are typically avoided because of their intensive time requirements or potential hazards, and to determine whether those experiments could be completed remotely and automatically.
Morrissette’s work centered on controlling a robotic arm in virtual reality using NVIDIA’s Omniverse application, Isaac Sim, and aimed to make the process intuitive and user-friendly. Virtual reality was chosen as a viable method of operation due to its commercial accessibility and utility.
Morrissette said the biggest obstacle he faced this summer was his limited proficiency with the Python programming language. Although he had attended introductory camps and online classes, he struggled to read other people’s code and navigate documentation. Despite those challenges, he said the experience sparked his interest in the field of cyber-physical systems. “Before this, I did not give much thought to pursuing research, but now I feel compelled to return and work on this project again,” he said.
Building Computational Tools to Study Nanomaterial Systems More Effectively
Luis Rangel DaCosta, a doctoral student in materials science at the University of California, Berkeley, develops new methodologies for calculating the optoelectronic properties of molecules and solids. He resumed work he started as an ALCF summer student last year to improve an algorithm for calculating spectral densities, which measure the way a material responds to light and help determine how it may function in a technological device.
“My initial approach was to excite electrons from their molecular orbitals independently and then see how the system responds,” said Rangel DaCosta. “However, we have all the information to figure out how the system responds when you excite all the orbitals together, which allows their responses to be coupled and more accurately reflects what happens in experiments.”
This year Rangel DaCosta implemented and optimized his algorithm and ran various case studies. He found that having test cases to validate his approach was incredibly useful in making progress quickly and troubleshooting problems along the way.
“Down the line, I’ll be returning to ALCF to re-implement this in an exascale version of the software, which will let me calculate spectral properties of realistic systems on much larger scales,” said Rangel DaCosta. “Through that, and so far, the technical experience has been valuable in making me feel more confident in my ability to write HPC-ready code and in my ability to understand physics.”