As an Intel Fellow and Chief Architect for Exascale Systems at Intel Corporation, Alan Gara spearheads Intel’s drive to deliver practical computers capable of exascale levels of performance and beyond. Gara is also principal architect of the 180 PFlops Aurora supercomputer that Intel and Cray will install at Argonne National Laboratory for the US Department of Energy (DOE) in 2018.
He spoke with Jan Rowell, an Oregon-based writer who’s been tracking HPC trends since 1985, when Intel introduced the first parallel computer built from commercial, off-the-shelf components. Led by Justin Rattner, one of the first Intel Fellows, the Intel® Scientific Computers group set a new course for the industry with the iPSC™/1. That historic system offered 32 nodes based on the Intel 80286 processor and 80287 math coprocessor (64 or 128 nodes in the souped-up configurations) and provided a whopping 2 Mflops of peak performance.
Today, Intel appears to have a bead on exascale and to be moving hard toward the target. With Gara charting the vision, the company has introduced Intel® Scalable System Framework (Intel® SSF), a flexible, next-generation holistic approach for developing HPC systems at diverse scales. Intel has also announced or previewed plans for a range of breakthrough technologies, including a new end-to-end fabric, Intel® Omni-Path Architecture (Intel® OPA); a revolutionary convergence of storage and memory, Intel 3D XPoint™ non-volatile memory technology; and significant new capabilities for its Intel® Xeon® and Intel® Xeon Phi™ processors. Intel’s approach to system design is the result of collaborating with numerous end-user organizations to align its platforms with real-world requirements to provide capabilities needed to run both compute-intensive and data-intensive workloads.
What impact does Gara expect these innovations to have for end users? What opportunities will they create? How much effort will be required to take advantage? We asked Gara what he sees.
insideHPC: Al, you’ve been forecasting HPC’s future and then making it a reality for a long time. From your vantage point, where are we? Are we reaching a tipping point?
Alan Gara: One thing that’s happening is that there are many new technologies emerging at this time. This is bringing with it an opportunity to really reoptimize where we are from a system perspective and move things forward in a very big way in the high-performance computing area.
Having said that, we want to exploit these new technologies in a way that doesn’t present fundamentally new challenges to users. While it might be a disruptive technology from a microarchitecture perspective, we want to present it in a way that lets users take advantage of all the goodness while minimizing the adoption impact.
insideHPC: How do you do that?
Alan Gara: One big way we’re doing that is through Intel Scalable System Framework. Intel SSF is a framework that allows us to integrate all of the technologies, hardware and software, in a way that we’ve validated and that we know is a well-balanced approach, proven to work efficiently and to minimize the overhead of adoption for OEMs and users alike. There are huge strides and, I think, huge performance opportunities coming that these technologies are going to offer. But we have to do this in the right way so users can really take advantage of it.
When we think about architecture, it’s common to think about building architects. When they went from wood to brick to steel, each of those new building materials gave them new opportunities to fundamentally build things in different ways.
We have technologies emerging that also offer those kinds of disruptive opportunities. It’s not 30 percent or a factor of two, but orders-of-magnitude potential improvement if we do this right. There are good reasons nobody built 100-story buildings out of wood. But now we can build them, because of the new materials that are available to work with. The new technology capabilities allow us to do things we couldn’t do before. That’s really exciting.
insideHPC: What are we seeing on the technology side?
Alan Gara: I look at this as a series of wave fronts rolling forward. The wave fronts are the higher levels of integration we’ve been able to do. As we integrate, we drive costs down, we drive performance up, we drive power down. These are all good things.
So more and more gets pulled into the chip, and you get this constant rolling in of technology for higher and higher levels of integration. But it’s not like the whole thing collapses onto the chip. Instead, something else emerges. So we’re going to higher and higher levels of integration of things like the fabric and memory, and we’re also seeing whole new technologies start to emerge and they’re like a new emerging wave.
Intel 3D XPoint is a perfect example. As flash memory becomes more and more cost effective, it starts to replace I/O and file systems, and not just as a cache but as the file system itself. All these things start to roll in while new technologies continue to emerge on the boundaries. Existing technologies keep getting integrated so that there are higher and higher levels of integration.
Because of Intel’s broad technology investment and architectural expertise, as well as the long history in developing processors, we’re in a great position to integrate these technologies and enable users to take maximum advantage of the resulting benefits.
insideHPC: So you have disruptive technologies and disruptive opportunities. How much recoding will users need to do to take advantage of them?
Alan Gara: We’re working to develop what I think of as transitional architectures, so you can run the same applications, and they’ll continue to run really well. In addition, some of the new technologies open up dimensions that we didn’t know existed before and we didn’t think would exist. Current applications aren’t architected today to exploit those. So in addition to enabling the legacy program models and applications, we want to open up that new space so applications can optimize for the new capabilities and get additional large returns.
There’s also a class of disruption we expect will make systems easier to use. We tend to be an industry where as we move forward, you need higher levels of expertise to bring out all the performance. One of the exciting things I think is going to come our way is that some of these new technologies are going to turn back the clock and bring us back to where you can get exceptional performance without having to be a heroic programmer. Those types of opportunities are going to be something users can exploit without having to reoptimize their applications.
insideHPC: Any examples?
Alan Gara: Some of the memory technologies coming into play now have very dense memory that can get integrated into the socket. I believe this will open up a dimension where we’ll get tremendous performance advantages, similar to what we used to have in terms of the memory-bandwidth-to-compute-performance ratios. I think developers will be able to exploit this quite naturally.
I’ll mention Intel 3D XPoint technology again as a technology that offers game-changing opportunities in terms of the balance between capacity and bandwidth, but falls into that other dimension where we’ll want to retune the application. It fits into a spot where we never had anything before, so that’s an area where rethinking how we think about I/O and retuning the application will offer tremendous benefits. When we do that, we see opportunities for orders-of-magnitude improvement for things like in-memory databases.
insideHPC: Let’s talk about real-world impacts. What are you looking to see, and what are some of the roadblocks?
Alan Gara: HPC continues to be a growing industry, both in the United States and worldwide. Countries have identified HPC as a key driver to economic growth. For that reason, there’s enormous growth and investment in this area.
The challenge is that the pool of people with the skills to program HPC systems is not as large as it needs to be. So we want to continue to help the community find their way into high-performance computing and give them the tools, training, workshops, and so forth to bring up that next generation of HPC experts. We’re doing what we can from a system architecture perspective to help this along, and some of the new technologies are playing into that. There’s just so much opportunity out there if we had more skills in this area.
Growing that talent base is really key for all of us to fully take advantage of the computing resources we have. It’s not really about building the biggest, fastest computers. It’s more about making computing cost effective, straightforward to use, and impactful on real-life problems.
We love the large problems. Everyone gets excited about solving the largest problems, and many of them are really important—climate modeling and energy discovery are extremely important and do require those largest computers. Those are exciting platforms, and the work is very exciting. HPC and Intel® technologies are helping uncover nature’s secrets beyond what we ever imagined. They played a huge role in the direct observation of gravitational waves earlier this year—a discovery by Caltech and MIT and a huge international consortium that opens the door to a whole new field of gravitational astrophysics.
But in a lot of ways, we’re going to really move the needle by having smaller-scale organizations continue to adopt HPC.
insideHPC: What are some areas where you’re seeing interesting things in smaller-scale HPC?
Alan Gara: Manufacturing is a huge one, especially as we bring in the Internet of Things. So much of the way we manufacture things today is automated, and that automation is driven through a computing resource. Now manufacturers are finding that by instrumenting these processes even more highly, they can gain tremendous further efficiencies.
As the cost of the computing comes down and the expertise grows, they’re able to impact the way they design and manufacture things—not just cars and airplanes, but simple, inexpensive products. Simulation becomes practical and cost effective even at very different scales. Something like an envelope moving down a conveyor line—a research lab used a simulation to see why some envelopes were flipping up as they moved down a conveyor line. Cell phone manufacturers simulate a phone dropping just as you would a car crash, and they use the results to design a more break-resistant phone.
I saw a demonstration the other day on energy discovery. They were simulating a wind farm, and the blades of a particular wind turbine. This is a good example of how HPC has surpassed experiments in some respects. The simulations are so effective, they’re not just trying to simulate how much energy they get and what the shape of the blade should be. They’re also simulating the vibrations in the materials to see what kind of sound it generates, so they can design quieter ones.
When we have those abilities to look inside the problem, it opens up a whole other dimension. Energy discovery is certainly something that is touching everybody’s life right now, as we all know.
insideHPC: Any other application areas that are especially interesting right now?
Alan Gara: Healthcare is really ramping up. There are tremendous opportunities in healthcare and life sciences to apply computing to make things easier and more effective—for example, to find good candidates for drug trials by searching medical databases rather than through trial and error. That whole field is coming into its own, and we see that really growing over time.
I mentioned the Internet of Things. We’re starting to instrument the world in new ways. As we do, we create enormous amounts of data that, if analyzed, will allow us to explore and learn and predict and be more efficient. Water conservation, the usage and the design of materials for light collecting, when we talk about solar panels—all these involve a huge amount of computation.
Or the design of a battery—the world would be changed if we could suddenly get batteries three times more energy efficient than the current batteries. We know that batteries can be much more energy efficient than they are now, but it’s an extremely complex problem involving quantum chemistry and material science all coupled together. What are the right materials for a battery? What is the right geometry for the battery? Simulating these things is turning out to be critical, and HPC is playing a role in all of them.
insideHPC: Are there workloads that maybe don’t involve simulation, where HPC hasn’t been widely used before, that you expect to open up now? What about something like machine learning on the Intel Xeon Phi product family?
Alan Gara: We are definitely seeing some new emerging application areas, and machine learning is an example. There is a growing interest in the industry in the artificial intelligence aspect of machine learning. Those things that Google does to recognize faces and behavior, for example, are all captured now under the machine learning umbrella.
One can argue that machine learning is not necessarily HPC. But when you look at it in some detail, you find tremendous amounts of parallel activity in terms of the requirements for the algorithms, how they would behave, and even how you architect those algorithms.
Machine learning is following a similar trajectory to what I’ve seen in HPC in general. Typically, when the algorithms for an application are first written and architected, it’s all about trying to get it to work. It’s not about trying to get it to work with high performance. We’re seeing that with machine learning. The algorithms today are effective at very small scale, but they’re not yet scaling, although there’s enormous activity in the industry to do this.
Over time, we will continue to work with the industry to make that happen, and we expect that the same metrics that are used for general applications, like performance and total cost of ownership, will apply to machine learning. We believe that in this space, the best solutions will be on Intel® architecture, and we’re working to make sure we have a very strong solution in that space.
The interesting thing is that the lessons the HPC community has learned over the last 20 years are paying off in new ways. HPC has given us a tool chest of algorithms that are broadly applicable across many spaces. Some of the same HPC tools we’ve been using for years are turning out to be very effective in addressing these emerging application areas. There’s always some unique character to the new areas, but it helps to have such a broad, already-existing base that can address many different types of problems.
insideHPC: You’ve been working with a number of HPC user sites to push the envelope and make sure you’re designing the right products for them. What can you tell us about that?
Alan Gara: We have outreach programs and collaborations worldwide to develop HPC, and a number of programs that allow us to work directly with academic, industry, and government researchers. These relationships not only enable them, but they’re extremely important to us because we learn what works and what doesn’t work.
We have a number of very close relationships with customers. I’m always a little hesitant to give out names because I don’t want to miss someone. But Argonne National Laboratory is certainly one we’ve been very public about cooperating with on co-design and on making sure the machine we’re trying to deliver to Argonne, Lawrence Livermore National Laboratory, and Oak Ridge National Laboratory under the CORAL program is highly effective and that the architecture is appropriate for their needs. The National Energy Research Scientific Computing Center (NERSC), Sandia and Los Alamos National Labs are another group we’re working closely with.
We’re also looking at long-term direction. The CORAL program and the DOE programs that follow it are all focused on really trying to reach exascale in a continuous way. We’re trying to get there without stepping backwards from a productivity perspective. We don’t want to get to exascale by saying, “If you rewrite all your code, then we have a solution for you.” We’re really trying to get there in more of a transitional perspective—to avoid a take-it-or-leave-it kind of approach towards architecture innovation.
So I think that we have solutions. As we approach exascale and we start to look at some of those technologies alongside our co-design partners, we see enormous opportunities that go beyond even the technologies we’re talking about today. There are new innovations and architectures that we think are really going to allow us to get to exceptional energy efficiency while maintaining systems that have the same level of usability if not better than our current systems.
insideHPC: You’re talking about advances in architecture along with advances in technologies?
Alan Gara: They go hand in hand. Again, just as home builders work with the materials they have available, we’re working with the technologies that we have. We’re trying to come up with the best possible architecture given these new constraints, which are in some cases very, very different from what they were before.
As we do that, it’s all about getting real application performance at the right power level and the right cost level. That’s where our users are the real experts. Having these discussions with real application experts at places like the US National Labs allows us to short circuit the process. We’re not just building something and finding out how well it works. We can get with the right people and explore whether it works before we build it. That’s what these co-design activities are about.
HPC continues to move forward at a phenomenal pace. When you have that rate of improvement, it offers a tremendous opportunity to change the game. But it also means we have to innovate at a very high level to get there. We have to take risks. We are pushing the envelope to get to those levels of innovation that achieve that performance, and to evaluate what works and what doesn’t and what can be improved and how to improve it.
We are constantly bouncing ideas back and forth with our users at both a macro level and a very fine-grained level, so there’s a whole spectrum of things that we get feedback on. These collaborations enable us to get that feedback quickly so we can turn things around and take the right action, either moving more aggressively or perhaps adjusting and moving tangentially on the particular aspect of the architecture we wanted to look at.
insideHPC: Do you have an overall message to the developer community?
Alan Gara: The message I would give them is that we now see our way to exascale. It is not without its challenges, but we’re really encouraged that we can get there with the right levels of investment. We think we see a technically feasible direction to get there.
I think that’s really exciting for those organizations that rely on that level of computing throughout their work. At the same time, the innovations and activities will waterfall down and be utilized for the broader HPC world as well.
As we look at the broader world of HPC, we can see that we’re moving towards systems that continue on this trend of more cost-effective computing over time. I don’t see us throwing a wrench in how they do things. We see continuity in how they’re currently doing things.
We do see opportunities for even larger changes for those who do want to move more aggressively. There are some new opportunities and some brand new directions that are going to be opening up.
insideHPC: If you climb on your soapbox, are there any actions you’d like to urge stakeholders to take?
Alan Gara: We want to encourage the kind of collaboration we’re seeing in programs like the National Strategic Computing Initiative in the United States. NSCI is a really exciting program. Part of its mission focuses on government agencies where the problems can benefit from HPC but the agency might not have deep computing expertise internally to take full advantage of the computing resources. NSCI encourages cross-agency collaboration to help them remedy that.
We want to encourage and support that collaborative behavior in whatever way we can, because there are a multitude of problems in government agencies and commercial entities that seem to have high performance computing solutions. Think of bringing together the tremendous computational expertise you find from the DOE labs with the problems that someone like the National Institutes of Health is trying to solve. You couple those two together and you really can create something amazing that will affect all our lives.
We want to broaden their exposure to the possibilities of HPC and help that along. It’s important, and it will allow all of us in HPC to more broadly impact the world with the large systems as well as the more moderate-scale systems.
Alan Gara will present Computing in 2030 – Intel’s View through the Crystal Ball on June 21 at ISC 2016 in Frankfurt.
Jan Rowell writes about technology trends in HPC, healthcare, life sciences, K-12 education, and other industry segments.