Epic HPC Road Trip Continues to Sandia National Lab


Dan Olds from OrionX hits the road for SC18 in Dallas.

In this special guest feature, Dan Olds from OrionX continues his Epic HPC Road Trip series with a stop at Sandia National Laboratories in New Mexico.

It was a sunny day in Albuquerque when I rolled into town to visit Sandia National Laboratories. I was in for a treat: I wasn’t just interviewing a single Sandia HPC expert, but was given the opportunity to meet with their hardware brain trust. This included James Laros, computer architect and Vanguard Project Lead, Ken Alvin, Senior Manager of Sandia’s Extreme Scale Computing Group, and Rob Hoekstra, Technical Manager of Scalable Architectures.

One of the first topics of conversation was hearing from each of the guys where they think HPC is going, or, maybe more importantly, where it should be going. James talked about how he was looking for more custom processors that could be better adapted to HPC workloads. He suggested that simply adding more cores to existing CPU designs isn’t really getting the job done for HPC, and that customers should perhaps turn to other designs (maybe Arm) in order to get processors more attuned to HPC. Ken discussed how Sandia is hoping to stimulate more of a system-on-a-chip approach, where spare CPU real estate could be put to better use for HPC-specific components and customization.

Rob pointed out that we’ve hit the end of Dennard Scaling, meaning that shrinking transistors no longer brings the automatic power-efficiency and clock-speed gains it once did. This has forced the HPC community to look outside the box for the next performance improvements. GPUs are a good example of this kind of innovation, but we need more advances like GPUs to move application performance forward in the future.
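For readers who want the arithmetic behind Rob’s point, here is a minimal Python sketch of the textbook Dennard scaling relations. It is not something the Sandia team presented: the function name, the normalized starting values, and the shrink factor of 2 are all illustrative assumptions.

```python
def scaled_power_density(k, voltage_scales=True):
    """Relative power density after shrinking transistor dimensions by a factor k.

    Uses the standard dynamic-power model P = C * V^2 * f, with every
    quantity normalized to 1.0 before the shrink (an illustrative assumption).
    """
    capacitance = 1.0 / k      # capacitance shrinks with feature size
    frequency = 1.0 * k        # smaller devices can switch faster
    voltage = 1.0 / k if voltage_scales else 1.0  # the key difference
    area = 1.0 / k**2          # area per transistor shrinks quadratically
    power = capacitance * voltage**2 * frequency
    return power / area


if __name__ == "__main__":
    k = 2.0  # hypothetical shrink factor, chosen only for illustration
    print("voltage scales with size:", scaled_power_density(k))
    print("voltage held fixed:", scaled_power_density(k, voltage_scales=False))
```

When voltage shrinks along with the transistor, power density stays flat. When voltage is stuck, the same shrink at a higher clock quadruples power density in this simplified model, which is roughly why clock speeds stalled and architects started looking at GPUs and other specialized designs.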

FPGAs are a good vehicle for early optimization and discovery, but the programming model for FPGAs is still a big stumbling block for Sandia and most other HPC users. The rise of the Arm processor presents a great opportunity for HPC-centric customization and optimization. Because there are many potential manufacturers of Arm processors, it might be feasible to approach them about adding one or two components to the Arm CPU in order to make it a much better fit for particular HPC workloads. This could be a route to much better application performance at low power and a reasonable cost.

From here, our conversation went on to touch on memory bandwidth and the need for a more balanced architecture, then turned to their new Arm-based Astra supercomputer. We also discussed the most prevalent benchmarks, HPL and HPCG, and how appropriate they are as measures of system performance. Not surprisingly, HPCG results track the efficiency of their real codes much more closely than HPL does. Looking forward to their next system, the team is looking for new vendor partners who have made technology breakthroughs, but who don’t yet have the ability to deliver a true supercomputer to a client like the DOE. Using their prototype program, Sandia is helping these vendors test out their innovations and, at the same time, helping them continue development.

Please check out the video above for more detailed discussion and content. But I do have to apologize for the fuzziness of it. I adjusted the focus when we first started but bumped the camera just before we started talking – thus taking it out of focus. I was so engrossed in the conversation that I didn’t even glance into the viewfinder again. So the video looks like I was using the “Elizabeth Taylor Lens”, the soft focus lens used in the Saturday Night Live “White Diamonds” sketch.

With Sandia in the bag, it’s time to move on to SC18 in Dallas, a short 650-mile jaunt down the road.

Many thanks go out to Cray for sponsoring this journey. After SC18, I have to move quickly to get to California in order to visit Lawrence Berkeley and Lawrence Livermore labs. Stay tuned.

Dan Olds is an Industry Analyst at OrionX.net. An authority on technology trends and customer sentiment, Dan Olds is a frequently quoted expert in industry and business publications such as The Wall Street Journal, Bloomberg News, Computerworld, eWeek, CIO, and PCWorld. In addition to server, storage, and network technologies, Dan closely follows the Big Data, Cloud, and HPC markets. He writes the HPC Blog on The Register, co-hosts the popular Radio Free HPC podcast, and is the go-to person for the coverage and analysis of the supercomputing industry’s Student Cluster Challenge.
