My wife Jennifer is a late riser. She goes to bed late after whatever fun or work she had the night before. She snoozes the morning away, and awakes noon-ish to me either making her breakfast (on the weekends) or calling her to wake her up (the rest of the time). She assumes that gnomes of morning have made ready many good things while she was in dreamland. She wakes ready to take advantage of that bequest in her new day. There is an analogy in there… someplace.
Happy New Year! For all of its hits and misses, 2011 was an amazing year for the HPC industry. In my last post, on SC11 and disruptive innovation, I covered the highlights of the last big event of 2011. Looking at what’s ahead, I expect 2012 to be the year of Applications and Gnomes.
Roadrunner, the first and only IBM system to reach petascale on the TOP500 list, was hard to use and hard to program. That’s fine for a one-of-a-kind box. But I expect that by the end of 2012 there will be 20+ petascale systems, and they will be doing real work, real science.
The “Practical Petascale” era dawned at SC11, and 2012 will see a great proliferation of petaflop machines. Two years ago, a petaflop machine had over 10,000 nodes and was an expensive beast. Now, an Intel Xeon E5-based cluster will achieve a petaflop with roughly 3,000 two-socket nodes. These systems are programmable with standard tools and techniques and can be rapidly applied to a broader range of applications.
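For anyone who wants to check the “roughly 3,000 nodes” figure, here is a back-of-envelope sketch. The per-socket numbers are my assumptions, not something stated above: 8 cores per socket, a 2.7 GHz clock, and 8 double-precision floating-point operations per core per cycle. The function names are made up for illustration. It computes theoretical peak only, so a real machine would need somewhat more nodes to sustain a petaflop on an actual workload.

```python
import math

# Assumed (hypothetical) Xeon E5 socket parameters; adjust for the actual part.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8          # assumption
CLOCK_HZ = 2.7e9              # assumption: 2.7 GHz
DP_FLOPS_PER_CYCLE = 8        # assumption: double-precision ops per core per cycle


def node_peak_flops(sockets=SOCKETS_PER_NODE,
                    cores=CORES_PER_SOCKET,
                    clock_hz=CLOCK_HZ,
                    flops_per_cycle=DP_FLOPS_PER_CYCLE):
    """Theoretical peak double-precision FLOPS of one node."""
    return sockets * cores * clock_hz * flops_per_cycle


def nodes_for_target(target_flops, **node_params):
    """Smallest whole number of nodes whose combined peak meets the target."""
    return math.ceil(target_flops / node_peak_flops(**node_params))


if __name__ == "__main__":
    petaflop = 1e15
    print(f"Peak per node: {node_peak_flops() / 1e9:.1f} GFLOPS")
    print(f"Nodes for 1 PFLOPS peak: {nodes_for_target(petaflop)}")
```

With these assumed numbers, each node peaks at about 346 GFLOPS and just under 2,900 nodes reach a peak petaflop, which lines up with the rough figure above.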
Everybody will want one. Who knows, soon it will be a measure of the Rich and Famous… I can see it now – “… and darling, in this room, we keep the Van Goghs, and over there… is our petaflop cluster. It’s being used to support famine relief and protect endangered species in New Guinea.”
Many nations and institutions will put together something like that to solve their toughest problems. The tools are in place to make scaling applications easier. With this in mind, I am focusing the next few months on understanding practical petascale applications. What are these new systems doing? How are they contributing to science? How are they contributing to national competency?
Over the past 4-5 years a tremendous amount of technology has been developed and put in place to create this era of HPC innovation and application. Many technologies take 4-6 years to go from first inklings to commercial deployment. If 2018-2020 marks the arrival of exascale (Intel has committed to an effort for 2018 for a 20MW / Exaflop system – Kirk Skaugen’s ISC11 talk), then 2012 is 5AM for the Exascale Gnomes. It’s dawn. Time to get to work.
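To put the 20MW / Exaflop target in perspective, it helps to turn it into a performance-per-watt number. This is a minimal sketch of that arithmetic; the function name is mine, and it treats the 20 MW as the whole budget for the exaflop, which is my reading of the target rather than anything more official.

```python
def required_gflops_per_watt(target_flops, power_watts):
    """Efficiency (GFLOPS per watt) needed to hit target_flops within power_watts."""
    return target_flops / power_watts / 1e9


if __name__ == "__main__":
    exaflop = 1e18        # 1 exaflop, in FLOPS
    power_budget = 20e6   # 20 MW, in watts
    needed = required_gflops_per_watt(exaflop, power_budget)
    print(f"Required efficiency: {needed:.0f} GFLOPS per watt")
```

That works out to 50 GFLOPS for every watt the system draws, which is the bar the Efficiency Gnomes have to clear.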
Practical Exascale will need solutions to the canonical “exascale problems” such as “PRESS” – Programmability, Reliability, Efficiency, System Scalability.* Each of those has to have Gnomes at the ready in 2012.
The more I look at the research into exascale applications like CFD, weather modeling, and molecular simulation, the less exascale problems look like bigger versions of their petascale brothers. Data will be less organized and less monolithic. Macro- and micro-level simulations will both be modeled, and interactions between the two will drive complexity, with millions of threads running, coordinating, and communicating with each other. Programming all of these threads and keeping everything working across a wide variety of data will be a significant problem. Gnomes will need to continue work on the optimization of the Seven Dwarves as well (ouch, I didn’t foresee that one). Perhaps programming and system scaling people have another year or two to get their acts together, but not more than that.
Reliability Gnomes also have to begin serious work. Historically, power efficiency and reliability have been competing interests. It’s physics as much as anything: smaller signal swings mean less energy is used to store information, and eventually errors will show up. The bigger the system, the more combinations of errors will affect system performance. Detection and recovery are expensive from a silicon perspective and weigh against the power budget as well. Of the exascale issues, this one scares me the most. If the Programming Gnomes have two years to crack their problem, Reliability Gnomes really have about one… their results need to feed into process research and design.
Work on Efficiency is really work on efficient performance. In exascale I don’t think we can discuss one without holding the other relatively constant. There are lots of wasteful parts of the exascale system: power delivery, cooling, storage, and interconnect. These all have to be streamlined for power. Gnomes can already be heard singing a happy work song on all of these. Dedicated highly parallel architectures like MIC and radically different interconnect approaches are at least asking the right questions, even if they aren’t getting answers yet.
Looking forward to 2012, I expect real movement on these four key questions. So when the rest of the world awakens in, oh, 2016 or so… they will find their metaphorical Exascale Breakfast plates full of what I made Jen last weekend.
* I love memory aids and acronyms… witness “Intel® Many Integrated Core (MIC) architecture.”