New TOP500 List topped by DOE Supercomputers

The latest TOP500 list of the world’s fastest supercomputers is out, a remarkable ranking that shows five Department of Energy supercomputers in the top 10, with the first two captured by Summit at Oak Ridge and Sierra at Livermore. With the number one and number two systems on the planet, the “Rebel Alliance” vendors of IBM, Mellanox, and NVIDIA stand far and tall above the others.

DOE to Showcase World-Class Computational Science at SC18

Researchers and staff from 15 National Labs will showcase DOE’s latest computing and networking innovations and accomplishments at SC18 in Dallas next week. “Several of the talks and demos will highlight achievements by DOE’s Exascale Computing Program (ECP), a multi-lab, seven-year collaborative effort focused on accelerating the delivery of a capable exascale computing ecosystem by 2021.”

Video: Unified Memory on Summit (Power9 + V100)

Jeff Larkin from NVIDIA gave this talk at the Summit Application Readiness Workshop. “The event had the primary objective of providing the detailed technical information and hands-on help required for select application teams to meet the scalability and performance metrics required for Early Science proposals. Technical representatives from the IBM/NVIDIA Center of Excellence will be delivering a few plenary presentations, but most of the time will be set aside for the extended application teams to carry out hands-on technical work on Summit.”

Summit Supercomputer Breaks Exaop Barrier on Neural Net Trained to Recognize Extreme Weather Patterns

“Using a climate dataset from Berkeley Lab on the Summit supercomputer at Oak Ridge, they trained a deep neural network to identify extreme weather patterns from high-resolution climate simulations. By tapping into the specialized NVIDIA Tensor Cores built into the GPUs at scale, the researchers achieved a peak performance of 1.13 exaops and a sustained performance of 0.999 [exaops] – the fastest deep learning algorithm reported to date and an achievement that earned them a spot on this year’s list of finalists for the Gordon Bell Prize.”

Video: ExaAM – Transforming Additive Manufacturing through Exascale Simulation

In this video from the HPC User Forum in Detroit, John Turner from Oak Ridge National Laboratory presents: ExaAM – Transforming Additive Manufacturing through Exascale Simulation. “The goal of ExaAM is to develop an AM simulator that will give researchers a tool to determine the best method to print parts with complex geometries and site-specific properties, complemented by real-time, in situ process visualization, analyses and optimization. Coupled with a modern computer-aided design tool, the simulator will allow the routine use of AM to build unique, qualifiable metal alloy parts across many industries relevant to DOE.”

Interview: Dan Jacobson from ORNL on Why PASC19 will be The Show to Attend Next Year

In this video from PASC18, Dan Jacobson from ORNL describes the highlights from the conference and why scientists and engineers should consider attending PASC19 in Zurich. “Next year, PASC19 happens the week before the ISC 2019 conference in Germany. Be sure to catch both shows!”

Podcast: From Here to Ai with Jack Wells from Oak Ridge

In this Conversations with Dez podcast, Dez Blanchfield sits down with Jack Wells from ORNL to talk about his personal and professional life journey, his role at Oak Ridge National Lab, how Artificial Intelligence is being deployed and leveraged in HPC, and the role IBM’s POWER9 solution is playing in supporting Oak Ridge and its mission. Jack Wells is the Director of Science for the Oak Ridge Leadership Computing Facility.

Characterizing Faults, Errors and Failures in Extreme-Scale Computing Systems

Christian Engelmann from ORNL gave this talk at PASC18. “Building a reliable supercomputer that achieves the expected performance within a given cost budget and providing efficiency and correctness during operation in the presence of faults, errors, and failures requires a full understanding of the resilience problem. The Catalog project develops a fault taxonomy, catalog and models that capture the observed and inferred conditions in current supercomputers and extrapolates this knowledge to future-generation systems. To date, the Catalog project has analyzed billions of node hours of system logs from supercomputers at Oak Ridge National Laboratory and Argonne National Laboratory. This talk provides an overview of our findings and lessons learned.”

Video: Intro to OpenMP

In this video, Markus Eisenbach and Dmitry Liakh from ORNL present: Intro to OpenMP, Part 1. “This video was recorded as part of the ‘Introduction to HPC’ workshop that took place at ORNL from June 26-28. This is video 1 of 2, which gives a brief overview of parallel computing with OpenMP.”

InfiniBand Powers World’s Fastest Supercomputer

Today the InfiniBand Trade Association (IBTA) announced that the latest TOP500 List results report that the world’s new fastest supercomputer, Oak Ridge National Laboratory’s Summit system, is accelerated by InfiniBand EDR. InfiniBand now powers the top three and four of the top five systems. The latest rankings underscore InfiniBand’s continued position as the interconnect of choice for the industry’s most demanding high performance computing (HPC) platforms. “As the makeup of the world’s fastest supercomputers evolve to include more non-HPC systems such as cloud and hyperscale, the IBTA remains confident in the InfiniBand Architecture’s flexibility to support the increasing variety of demanding deployments,” said Bill Lee, IBTA Marketing Working Group Co-Chair. “As evident in the latest TOP500 List, the reinforced position of InfiniBand among the most powerful HPC systems and growing prominence of RoCE-capable non-HPC platforms demonstrate the technology’s unparalleled performance capabilities across a diverse set of applications.”