In this video, Conference Chair Scott Lathrop presents his perspectives on the Technical Program at SC11. Recorded at the Pre-Show Press Conference Nov. 14 in Seattle.
In this video, Matthijs van Waveren from the OpenMP ARB discusses the organization’s mission to oversee the OpenMP specification and organize conferences, workshops, and other related events. Recorded at SC11 in Seattle.
The International Workshop on OpenMP (IWOMP) is an annual workshop dedicated to the promotion and advancement of all aspects of parallel programming with OpenMP. It is the premier forum to present and discuss issues, trends, recent research ideas, and results related to parallel programming with OpenMP. The workshop affords an opportunity for OpenMP users as well as developers to come together to discuss and share new ideas and information on this topic. The submission deadline is January 31, 2012.
Our Video Sunday feature continues with this animation depicting a 100G network demo between the University of Victoria and Caltech at SC11.
The animation shows the Dell cluster connecting to the Brocade MLXe 100 GbE switch, both located in the UVic Data Centre. The network runs to the Ciena OME 6500 located at the BCNET Transit Exchange in downtown Victoria. From Victoria the data is transported over the CANARIE network to the Washington State Convention Center in Seattle.
Learn more about the demo at Caltech’s SC11 site.
In this special guest feature, Intel’s John Hengeveld reflects on the amazing week that was SC11.
There is so much to cover from SC11. It was a thrilling week of meetings, technical sessions, and new technology. I learned a lot, and I appreciate what a great and exciting ride we are in for in the years to come. The key things I was looking for, from my pre-SC11 column, are covered here.
- New CPUs and the TOP500: Interlagos, the AMD Opteron 6200, launched on Monday with a focus on core count and power efficiency per core. Intel made a press announcement of the performance levels of the future Intel Xeon E5 family as shown on the TOP500, and further announced that Xeon E5 will support PCIe 3.0. The TOP500 list had entries for each of these new CPUs, including Cray systems with AMD processors and HP, Bull, and Appro systems with Intel processors. At the end of the day, with banner products from both vendors, the industry is set up for a fresh push forward.
- New Big Systems and new systems across the globe: While the #1 system surged past 10 PF, the rest of the top 10 remained unchanged. I heard about some systems in development that will be coming soon (GENCI Curie, LRZ, Titan), but the absence of any new top-10 systems surprised me. More interesting, this occurred while the bottom of the TOP500 moved up aggressively, from 40.187 TF to 50.94 TF. Hopefully this pause in the top 10 is just a breather while we wait for new systems.
- GPU vs. Intel MIC, part 4: Kepler – Was I the only one disappointed by Jen-Hsun Huang’s keynote? We heard nothing further about Kepler. He struggled to avoid saying Intel and ARM (at one point smoothly saying NVIDIA when he meant Intel). He made the case that exascale in 2020 at 20 MW is a key goal and that lower-power solutions would be required to get there. But the substance of his talk jumped off of Clayton Christensen’s keynote from last year, discussing the “Innovator’s Dilemma” on the path to exascale.
I taught corporate strategy at Portland State University for many years and have often taught the key insights of “The Innovator’s Dilemma” and “The Innovator’s Solution.” So I was emotionally connected when Huang started out there. The key principles of how a “low-end disruptor” captures the mainstream of a market with a lower-cost, “good enough” solution are valuable insights for the technology world in general and HPC in particular.
The point of referencing Christensen was to suggest that GPUs represent a disruptive innovation for the mainstream of HPC. He very clearly made the case that NVIDIA graphics accelerators were once a new-market disruptor in gaming and a low-end disruptor in workstations. Where he went off target was in trying to stretch that argument into the HPC space.
At 31 minutes into the keynote, Huang says: “If I can just figure out how to program it, if I can describe all my problems as a triangle, I could solve the world’s problems.”
In this comment, Huang admits that adoption of GPU technology is predicated on adopting an isomorphism: the GPU programming model is the transformation of a problem into manipulations of triangles. Hence the debate in the industry now.
The issue in HPC is not “does the industry need this density of performance at lower power” (we do), but rather “must we adopt the isomorphism of thinking of the world as triangles to use the system at highly efficient performance levels.” This is the core of the GPU vs. Intel MIC architecture debate.
The substance of the keynote was demonstrations of the impact of increased compute density on gaming examples (BF3, Assassin’s Creed, etc.) and a plug for his Maximus workstation product, which, while fun for the gamer in all of us, left us feeling a bit… hollow.
- MIC: 1 TF per socket: I got excited when I found out the Knights Corner silicon would be powered on by SC11, and I deeply hoped the gnomes working on it would be able to run Linpack or DGEMM on it by that time. My friend and boss Joseph Curley pressed the team to complete the demo on time. We made it, as you can see in the picture below of Joe with Raj Hazra (the GM of Intel’s Technical Computing Group) proudly holding up one of the first Knights Corner parts. Joe is looking stern in the picture, no doubt from exhaustion. He’s been a busy man of late.
The more interesting element of Raj’s talk was the discussion of how an Intel MIC product appears to applications as a fully functional compute node able to run its own open source operating system. This means that many applications will port to MIC with a simple recompile. Robert Harrison stood up and presented results from porting “tens of millions of lines of code” to the Intel MIC software development vehicle.
So the GPU vs. MIC debate is engaged in full force. NVIDIA and Intel are now mostly publicly aligned on the goal: 20 MW per exaflop in this decade. The debate on performance is over; the debate on programming has begun.
- PCIe 3.0: Mellanox announced some news on their InfiniBand solutions in their quarterly earnings release, and a few of the new systems on the TOP500 use Mellanox IB. Intel announced that its future Xeon E5 processor integrates PCIe 3.0 on die. No word from AMD on this, and no announcements from other graphics card or interconnect suppliers. The beginning of the PCIe 3.0 transition is here. Interconnect bandwidth is going to be a key element in delivering performance in some cluster architectures and in some key workloads. Again, I expected more from other IB manufacturers. This transition will accelerate with greater force in the first half of 2012.
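Why the transition matters is easy to quantify. As a back-of-the-envelope sketch (numbers taken from the published PCIe signaling rates and line encodings, not from any vendor announcement here), the move from PCIe 2.0 to 3.0 roughly doubles usable per-lane bandwidth, since the signaling rate rises from 5 GT/s to 8 GT/s while the encoding overhead drops from 8b/10b to 128b/130b:

```python
# Theoretical usable bandwidth for PCIe 2.0 vs. 3.0 (illustrative sketch).

def pcie_lane_bandwidth_gbs(gt_per_s: float, payload_bits: int, total_bits: int) -> float:
    """Usable bandwidth of one lane in GB/s (1 GB = 1e9 bytes)."""
    return gt_per_s * payload_bits / total_bits / 8  # bits -> bytes

# PCIe 2.0: 5 GT/s with 8b/10b encoding
gen2_lane = pcie_lane_bandwidth_gbs(5.0, 8, 10)     # 0.5 GB/s per lane
# PCIe 3.0: 8 GT/s with 128b/130b encoding
gen3_lane = pcie_lane_bandwidth_gbs(8.0, 128, 130)  # ~0.985 GB/s per lane

print(f"PCIe 2.0 x16: {gen2_lane * 16:.2f} GB/s")   # 8.00 GB/s
print(f"PCIe 3.0 x16: {gen3_lane * 16:.2f} GB/s")   # 15.75 GB/s
```

An x16 slot thus moves from 8 GB/s to nearly 16 GB/s each way, which is why accelerator and interconnect vendors care so much about this transition.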
** Footnote: A sustaining innovation is the opposite of a disruption. It is the normal progress of a maturing technology becoming accessible to a larger portion of an overall market. Unmet needs are now met, so that customers see higher value in the product they are using and either pay more for it or buy more of it. Adding a feature to a product, like the iPhone 4S speech recognition feature, is an example.
A tip of the hat goes to the Samsung Voices blog for pointing us to this video.
In this video from SC11, Michael Norman (SDSC), Joseph Insley (ANL), and Rick Wagner (SDSC) describe a set of ground-breaking collaborative astrophysics simulations. Essentially, the visualizations depict the formation of the first galaxies, and what happened to the surrounding intergalactic gas when they lit up.
The light from early galaxies had a dramatic impact on the gases filling the universe. This video highlights the spatial structure of the light’s effect by comparing two simulations: one with a self-consistent radiation field (RHD), and one without (HD). The comparison shown is the relative difference of the density. The colors show whether the density is greater in the radiative or the non-radiative case.
The simulation was run on 50,000 cores of the Jaguar supercomputer at ORNL and has consumed over 20 million CPU hours to date. These are some stunning visualizations, folks. Be sure to check out the videos below in full HD resolution.
The comparison shown is the relative difference of the ionization fraction. The colors show whether the ionization fraction is greater in the radiative or the non-radiative case.
The comparison shown is the relative difference of the density. The colors show whether the density is greater in the radiative or the non-radiative case.
The comparison shown is the relative difference of the temperature. The colors show whether the temperature is greater in the radiative or the non-radiative case.
- Science: Robert Harkness, SDSC; Daniel R. Reynolds, Southern Methodist University; Michael Norman, SDSC; Rick Wagner, SDSC
- Visualization: Mark Hereld, Argonne National Lab; Joseph A. Insley, Argonne National Lab; Michael E. Papka, Argonne National Lab; Venkatram Vishwanath, Argonne National Lab
November is over, so this is the last day of our SC11 Special Events Edition. So I was thinking: what better way to sum things up than with a wrapup video?
At the close of SC11 exhibits, Brock Palen and Jeff Squyres of RCE Podcast fame met with Rich Brueckner of insideHPC to discuss biggest surprises, biggest disappointments, and the coolest things they saw at the show. Our lists may not agree with yours, but we had great fun putting this show together as a form of stress relief.
I have a couple more SC11 videos in the works, some of the best saved for last in fact. So stay tuned for SC11 Analyst Crossfire and more!
In this video, Robert Read from Whamcloud demonstrates their new Chroma management system for the Lustre file system. Recorded at SC11 in Seattle.
Chroma is a central management system that is deeply integrated with Lustre. It brings together information from multiple sources to provide a unified view of what is going on in a storage system ― while vastly simplifying installation, configuration, maintenance, monitoring, and fault diagnosis. This enables enterprises with storage intensive applications to get much more out of their storage environment.
Read the Full Story.
In this video, Professor David Bader from Georgia Tech discusses his participation in the DARPA ADAMS project. The Anomaly Detection at Multiple Scales (ADAMS) program uses Big Data Analytics to look for cleared personnel that might be on the verge of “Breaking Bad” and becoming internal security threats.
Recorded at SC11 in Seattle. Learn more about David Bader’s work in our Rock Stars of HPC profile.
In this video, IBM’s Chris Espinosa describes the company’s Intelligent Clusters for HPC. He then goes on to show a demo where GPU-powered HS22 blades and iDataPlex platforms provide a 10x speedup on a heart MRI application.
In this video, IBM’s Keith Olsen describes the company’s BladeCenter H systems for HPC. Over 10 percent of the TOP500 systems are based on the BladeCenter H platform.
With the November 2011 TOP500 list, IBM is once again #1 in leadership, with:
- The most installed aggregate throughput, with over 20,234 of the list’s 74,064 total Teraflops, taking the lead for 25 lists in a row
- The most systems in the TOP500, with 223 (HP had 142; Oracle had 10, a decrease from 12 in June)
- The most energy-efficient system, the IBM Blue Gene/Q
- The five most energy-efficient systems
Recorded at SC11 in Seattle. Read the Full Story.
In this video, Kirby Collins from Convey Computer discusses why the company’s hybrid-core architecture is so effective on Graph500 computing applications. Recorded at SC11 in Seattle.
On the most recent list, announced at SC11, multiple Convey single-node, hybrid-core systems clocked in at between 1.60 and 1.76 GTEPS (billion traversed edges per second) on problem sizes 27 and 28. Convey has a total of six entries on the Graph500 list, including submissions from Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center (LBL/NERSC), Sandia National Laboratories (SNL), and Bielefeld University. Compared pound-for-pound and watt-for-watt, Convey’s family of reconfigurable (FPGA) systems provides superior processing power on the Graph500 (www.graph500.org) list. The Graph500 organization establishes and maintains a set of large-scale benchmarks that measure performance of “big data” applications.
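For readers new to the metric, a TEPS figure is essentially edges traversed divided by BFS time: a Graph500 problem of scale S has 2**S vertices and (with the default edgefactor of 16) about 16 * 2**S edges. Here is a minimal sketch of the arithmetic (the numbers are illustrative, not Convey’s measured results, and the official benchmark counts only the edges in the component the BFS actually reaches):

```python
# Illustrative sketch of how a Graph500 GTEPS figure is derived.

def graph500_gteps(scale: int, bfs_seconds: float, edgefactor: int = 16) -> float:
    """Giga-TEPS: (edges in the graph) / (BFS time) / 1e9."""
    edges = edgefactor * 2 ** scale  # scale 27 -> ~2.15 billion edges
    return edges / bfs_seconds / 1e9

# Finishing a BFS over a scale-27 problem in ~1.26 s would land in the
# ~1.7 GTEPS range quoted above.
print(f"{graph500_gteps(27, 1.26):.2f} GTEPS")  # 1.70 GTEPS
```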
Read the Full Story.
The SC11 keynote by NVIDIA CEO Jen-Hsun Huang is now posted in the ACM Digital Library. It is not easy to find, and non-ACM members will need to create an account to view it. Bless their little hearts.
For the less patient, this Vimeo copy posted by Lewey Anton streams directly if you want to get around their hoops.
In this video, Gilad Shainer from the HPC Advisory Council describes the organization’s efforts to share best practices and do outreach and education. The Council is an active supporter of the Student Cluster Challenge at SC11 and ISC’12 and recently became the recipient of Intel’s Explorer Award:
The HPC Advisory Council has been honored with the ‘Explorer Award’ from the Intel Cluster Ready team at Intel, which recognizes organizations that have continued to explore and implement Intel Cluster Ready certified systems. The award was given as a result of the numerous joint activities between the HPC Advisory Council and Intel Cluster Ready, such as testing and benchmarking applications on Intel Cluster Ready certified clusters, developing documents and best practices for implementing open source applications on Intel Cluster Ready certified clusters, and extending Intel Cluster Ready’s reach into HPC end-user audiences.