Computing is one of the least diverse science, technology, engineering, and mathematics (STEM) fields, with an underrepresentation of women and minorities, including African Americans and Hispanics. Leveraging this largely untapped talent pool will help address our nation’s growing demand for data scientists. Computational approaches for extracting insights from big data require the creativity, innovation, and collaboration of a diverse workforce.
As part of its efforts to train the next generation of computational and computer scientists, this past summer, the Computational Science Initiative (CSI) at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory hosted a diverse group of high school, undergraduate, and graduate students. This group included students from Jackson State University and Lincoln University, both historically black colleges and universities. The Lincoln University students were supported through the National Science Foundation’s Louis Stokes Alliances for Minority Participation program, which provides research and other academic opportunities for minority students to advance in STEM. Two of the students are recipients of prestigious fellowship programs: the Graduate Education for Minorities (GEM) Fellowship, through which qualified students from underrepresented minorities receive funding to pursue STEM graduate education; and the DOE Computational Science Graduate Fellowship (CSGF), which supports doctoral research using mathematics and computers to solve problems in many scientific fields of study, including astrophysics, environmental science, and nuclear engineering.
“To address challenges in science, we need to bring together the best minds available,” said CSI Director Kerstin Kleese van Dam. “Great talents are rare but can be found among all groups, so we reach out to the broadest talent pools in search of our top researchers at every education level and career stage. In return, we offer them the opportunity to work on some of the most exciting problems with experts who are pushing the state of the art in computer science and applied mathematics.”
Pursuing diverse research topics
The students’ research spanned many areas, including visualization and machine learning techniques for big data analysis, modeling and simulation applications, and automated approaches to data validation and verification.
Quentarius Moore, who graduated this past spring from Jackson State University with a master’s degree in chemistry, spent five weeks implementing an electron correlation model in a computational chemistry code called NWChem for an ongoing DOE Exascale Computing Project, NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era. In the fall, he will begin his doctoral studies in chemistry at Texas A&M University through DOE’s CSGF. Unlike most other students, Moore did not come to Brookhaven through a formal internship program—he was connected with computational chemist Hubertus van Dam after reaching out to Robert Harrison and Barbara Chapman, both experts in high-performance computing who hold leadership positions at Brookhaven Lab and teach at nearby Stony Brook University.
“I was born and raised in Jackson, Mississippi, and opportunities like conducting world-class research are scarce among the people I know and underrepresented groups in general,” said Moore. “I had never heard about Brookhaven or the national lab system, but now I hope to help minority students seek similar learning experiences.”
Stony Brook University undergraduate student Raffaele Miceli, a Science Undergraduate Laboratory Internships (SULI) program intern sponsored by the DOE Office of Science’s Office of Workforce Development for Teachers and Scientists (WDTS), applied computer graphics to high-energy physics, including visualizing the potential energy of the Higgs field in dark matter models and in models that go beyond the Standard Model of particle physics. He was subsequently hired as a student assistant.
Four students joined a CSI team that is investigating methods and devices to perform computations on streaming data while they are in transit. Shilpi Bhattacharyya, a doctoral student in computer science at Stony Brook University, was hired as a student assistant to continue building a virtual environment for this “analysis on the wire” project.
“I think CSI is an awesome place for computer scientists,” said Bhattacharyya, who will continue contributing to the project as a research assistant. “I am more confident, disciplined, focused, and motivated because I got the real feel of a research environment here. Talent and hard work is valued at Brookhaven Lab. I never felt any different as a woman pursuing computer science. Gender does not come into the picture at all.”
Undergraduate interns Alya Boumiza, a mathematics major at City University of New York Borough of Manhattan Community College; Cole Lewis, a computer engineering major at South Plains College; and Adam Martin, a computer science major at South Plains College, had coordinated assignments to address the main challenge of analysis on the wire: efficiently plugging in and running a streaming algorithm. They collaborated to select and modify a suitable algorithm and examined ways to use hardware accelerators.
Joining the big data conversation
In addition to carrying out their research projects and presenting them during a closing ceremony at Brookhaven, all of the students had the opportunity to attend the CSI-led New York Scientific Data Summit (NYSDS), held at New York University from Aug. 7 through 9. This annual conference brings together data experts, scientists, application developers, and end users from national labs, universities, technology companies, utilities, and federal and state governments to share ideas for unlocking insights from scientific big data.
The students submitted papers to the conference and discussed their research with U.S. data science leaders during a poster session. Three students also presented their research in a talk: Ziqiao Guan, a doctoral student in computer science at Stony Brook University; Ronald Lashley, who graduated in May 2017 from Lincoln University with an undergraduate degree in computer science and a minor in visual arts; and Nicole Meister, a high-school student and participant in the Simons Summer Research Program at Stony Brook University. These students were part of a multi-organizational team involving Brookhaven Lab, Lincoln University, and the New Jersey Institute of Technology (NJIT) that designed deep learning–based image classification software for analyzing the x-ray scattering images produced by scientists at the National Synchrotron Light Source II (NSLS-II), a DOE Office of Science User Facility at Brookhaven. NSLS-II generates up to four terabytes of images each day; printing out a single terabyte of data would take paper made from approximately 50,000 trees. Deep learning is a type of machine learning in which the features important to classification, say symmetry or orientation, are automatically extracted from raw data. Classifying the images this way helps scientists recognize patterns in their samples, infer materials’ physical properties, and make decisions for follow-on experiments.
“The students more than held their own at such an in-depth scientific event,” said Kleese van Dam.
For the students, NYSDS not only provided the opportunity to present their research and network with the larger data science community but also exposed them to current research topics. This year’s conference focused on streaming data analysis, autonomous experimental design, interactive exploration of petascale data, and performance for big data.
“While it is challenging to pursue computer science in a male-dominated environment, I was extremely lucky to work with colleagues who were very responsive to my questions,” said Meister. “Machine learning was a fairly new concept to me, so I had to overcome a steep learning curve. Presenting my research at the NYSDS was a surreal experience, and it was fascinating to see what other people in the field were working on. This research opportunity has sparked my interest in machine learning and inspired me to continue working in this area of computer science.”
In November, CSI will again be participating in the international SuperComputing Conference for high-performance computing, networking, storage, and analysis. Staff from CSI will have a table at the student and postdoc job fair, where they will discuss internships, fellowships, assistantships, and permanent employment opportunities.
“Unfortunately, our applicant pool is not as diverse as we would like, and so we are always looking for ways to reach out to members of underrepresented groups and encourage them to consider a career with CSI,” said Kleese van Dam. “By raising our visibility among all talented groups through these various outreach efforts, we hope to increase diversity within CSI so that we are fully equipped to solve the scientific data challenges of today and tomorrow.”
The Office of Educational Programs (OEP) manages Brookhaven Lab’s student and teacher programs, which are primarily funded by DOE’s Office of Science, other DOE offices, Brookhaven Science Associates, and other federal and nonfederal agencies.
Source: Brookhaven Lab