Video: Doug Kothe Looks Ahead at The Exascale Computing Project

Doug Kothe is Project Director for the Exascale Computing Project.

In this video, Doug Kothe from ORNL provides an update on the Exascale Computing Project.

With respect to progress, marrying high-risk exploratory and high-return R&D with formal project management is a formidable challenge. In January, through what is called DOE’s Independent Project Review, or IPR, process, we learned that we can indeed meet that challenge in a way that allows us to drive hard with a sense of urgency and still deliver on the essential products and solutions. In short, we passed the review with flying colors—and what’s especially encouraging is that the feedback we received tells us what we can do to improve.

Kothe also highlights successes in three ECP research focus areas: Application Development (AD), Software Technology (ST), and Hardware and Integration (HI).

We’ve made significant headway in identifying key AD and ST products. AD has demonstrated effectiveness by releasing a number of applications over the last several months while also developing deep-dive algorithms and models. The ST effort, with relatively new leadership, has been moving from R&D to product development and deployment. ST has a good plan for packaging our various-size components into bite-size chunks of software that the DOE laboratories will consume, integrate, and test.

The scope of Hardware and Integration (HI) includes support for US vendor research and development focused on innovative architectures for competitive exascale system designs (PathForward), hardware evaluation, an integrated and continuously tested exascale software ecosystem deployed at DOE facilities, accelerated application readiness on targeted exascale architectures, and training on key ECP technologies to accelerate the software development cycle and optimize productivity of application and software developers. In the HI PathForward activity, which funds US vendor R&D for nodes that are tuned for our applications and system designs, the vendors have been hitting milestones on schedule. We are feeling very optimistic that the vendor R&D will appear in key products in the exascale systems when they’re procured.

Developing a Diverse Portfolio of Applications

ECP supports all of the key program offices in DOE (Office of Science, applied offices, NNSA Defense Programs), and so our incredible teams are engaged in several main categories of applications research. Examples of some of those categories are national security, energy, fundamental materials and chemistry, scientific discovery, and data analytics.

For national security, we’re developing next-generation applications in support of the NNSA’s stockpile stewardship program, namely reliability testing and maintenance of U.S. nuclear weapons without the use of nuclear testing.

In energy, ECP work is centered on fission and fusion reactors, wind plants, combustion in internal combustion engines and land-based gas turbines, advanced particle accelerator design, and chemical looping reactors for the clean combustion of fossil fuels. The chemistry and materials category is looking at everything from strongly correlated quantum materials to atomistic design of materials for extreme environments to advanced additive manufacturing process design.

Other areas of interest include:

  • Our researchers in additive manufacturing are endeavoring to understand that process essentially to allow the printing of qualified metal alloys for defense and aerospace. On the chemical side, a great example of what we’re doing is catalyst design. We’re also addressing the very foundations of matter via the study of the strong nuclear force and the associated Standard Model, which is among the most fundamental focus areas of nuclear and high-energy physics.
  • Our earth and space science applications include astrophysics and cosmology (e.g., understanding the origin of elements in the universe, understanding the evolution of the universe, and trying to explain dark matter and dark energy). Other key applications include accurate modeling of the geologic subsurface for fossil fuel extraction, waste disposal, and carbon capture technologies; developing a cloud-resolving Earth system model to enable regional climate change impact assessments; and addressing the risks and response of the nation’s infrastructure to large earthquakes.
  • Within the data analytics category, we have artificial intelligence and machine-learning applications focused on the Cancer Moonshot, which is basically precision medicine for oncology. We’re also investigating metagenomics data sets for new products and life forms. We are also focused on optimization of the US power grid for the efficient use of new technologies in support of new consumer products and on a multiscale, multisector urban simulation framework that supports the design, planning, policies, and optimized operation of cities. Another facet of our data analytics work involves seeing how we can extract more knowledge from the experimental data coming from the DOE Science facilities. Our study is focused on SLAC’s Linac Coherent Light Source (LCLS) facility, but we are committed to helping myriad facilities across the DOE complex in terms of the streaming of data and trying to determine what’s in it and how we can drive experiments or computationally steer them to give us more insight.

Impacting Industry

ECP aims to be a thought leader and provide direction, whether the subject is programming next-generation hardware or designing models and algorithms to target certain physical phenomena, for example. We know we must interface with industry—from small businesses to large corporations—to avoid missing functional requirements that are important to them.

That’s why we stood up ECP’s Industry Council to work with us as an external advisory group. It is really helping to guide us concerning the challenge problems we’re addressing. The council gives us advice on whether the applications we’re tackling can be leveraged in their environments and, if not, how we can move in that direction. We meet with the council every couple of months to discuss the status of progress and where ECP is headed to ensure it will best fit the needs of US industry.

Understanding and Mitigating Risks

ECP must adhere to a very aggressive schedule, and I believe we are doing so with the proper sense of urgency. The schedule is not only extremely dynamic but also abounding with risks. We can, however, modestly say that we are on track because we rigorously monitor the work at a granular level. To help us perform the tracking, we use two metrics called the schedule performance index and the cost performance index.
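For readers unfamiliar with the two indices, here is a minimal sketch using the standard earned value management definitions: the schedule performance index is earned value divided by planned value, and the cost performance index is earned value divided by actual cost. The article does not describe how ECP computes or reports these figures, so the class name, field names, and dollar amounts below are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarnedValueSnapshot:
    """Hypothetical status snapshot for one work package (illustrative only)."""

    planned_value: float  # budgeted cost of the work scheduled to date
    earned_value: float   # budgeted cost of the work actually completed
    actual_cost: float    # money actually spent to date

    def spi(self) -> float:
        # Schedule performance index: a value below 1.0 suggests
        # the work is behind schedule.
        return self.earned_value / self.planned_value

    def cpi(self) -> float:
        # Cost performance index: a value below 1.0 suggests
        # the work is over budget.
        return self.earned_value / self.actual_cost


# Made-up numbers, not ECP data.
snapshot = EarnedValueSnapshot(
    planned_value=100.0, earned_value=90.0, actual_cost=120.0
)
print(f"SPI = {snapshot.spi():.2f}, CPI = {snapshot.cpi():.2f}")
```

In this made-up case both indices fall below 1.0, which under the standard definitions would flag the work package as both behind schedule and over budget.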

Some projects have higher risk and more technical challenges than others, and that’s understandable. We rely heavily on our project office and our leadership team to understand what the risks are—both the known unknowns and the unknown unknowns.

I believe that within the next year or so as we learn more about the first three exascale systems deployed in this country, a lot of our risks will either be retired or moved into the known unknowns category, which we can mitigate with our project’s use of contingency. We execute according to a certain funding profile, and so we hope that our DOE sponsors will be able to deliver on what we believe is the funding profile necessary for success.

Another very important consideration for us is ensuring that the right programming models are available for the hardware, from both the software stack and the applications sides of ECP, so that the heterogeneity of the memory and the CPU hierarchy in the exascale systems can be optimally exploited.

Workforce development is a risk as well. We have been fortunate to be able to staff our project teams with some of the best and brightest in the world. Ensuring they’re working on problems that are fun and challenging so they’ll stay with us is very important. These scientists and engineers are arguably among the most marketable people anywhere, so they’re really in high demand outside of ECP.

One other especially notable risk is ensuring that the US vendors deliver the hardware and low-level software we need for our applications. The PathForward project allows us to inject resources for crucial vendor R&D. Through PathForward partnerships, we can pull in products sooner so we can extract the product quality and efficacy we need more quickly.

Finally, we anticipate dozens more milestones to be completed by the end of the year that will most definitely inform what we believe are exciting responses to the exascale Request for Proposals. ECP has the job of ensuring the vendors’ R&D is in a good place to propose exciting products for the exascale platform, and we’re working very hard to make that happen.
