Team submissions are now being accepted for the SC13 edition of the Student Cluster Competition, a spirited event featuring young supercomputing talent from around the world competing to build and operate powerful cluster computers. As the world’s largest gathering of HPC professionals, the SC13 conference will be held Nov. 17-22, 2013, in Denver.
“The energy and dedication that the student teams bring to the Student Cluster Competition are inspiring, especially as they work around the clock to overcome obstacles and get their systems up and running,” said Student Cluster Competition Chair Dustin Leverman of Oak Ridge National Laboratory. “Though they are competing against one another, the teams also share a camaraderie as they race to the end.”
In this video, Dan Olds from Gabriel Consulting brings us the results from the SC12 Student Cluster Competition.
The deadline for team submissions is Friday, April 12, 2013. Read the Full Story.
In related news, you can now follow all three of the major worldwide Student Cluster Competitions at their new home site.
In this video from the Mellanox booth at SC12, Paul Kinyon from SGI presents on the SGI ICE X, a fifth-generation blade-based x86 cluster. According to Kinyon, the SGI ICE X was the first platform to support FDR InfiniBand at the node level, the switch level, and the fabric level.
So what is SR-IOV? The short answer is that SR-IOV is a specification that allows a PCIe device to appear to be multiple separate physical PCIe devices. The SR-IOV specification was created and is maintained by the PCI SIG, with the idea that a standard specification will help promote interoperability.
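On Linux, those extra devices are visible through sysfs. Below is a minimal sketch, assuming a Linux host with an SR-IOV-capable adapter; the PCI address is a placeholder, while sriov_totalvfs, sriov_numvfs, and the virtfn* links are the standard kernel attributes exposed for a physical function.

```python
#!/usr/bin/env python3
"""Minimal sketch: inspect SR-IOV state of a PCIe device via Linux sysfs."""
from pathlib import Path

PCI_ADDR = "0000:03:00.0"  # placeholder physical function address; adjust for your adapter
dev = Path("/sys/bus/pci/devices") / PCI_ADDR

if not dev.is_dir():
    raise SystemExit(f"No PCI device found at {PCI_ADDR}")

def read_attr(name):
    """Return a sysfs attribute as a stripped string, or 'n/a' if absent."""
    attr = dev / name
    return attr.read_text().strip() if attr.exists() else "n/a"

print("Total VFs supported:", read_attr("sriov_totalvfs"))
print("VFs currently enabled:", read_attr("sriov_numvfs"))

# Each enabled virtual function appears as its own PCI device,
# linked from the physical function as virtfn0, virtfn1, ...
for vf_link in sorted(dev.glob("virtfn*")):
    print(f"{vf_link.name} -> {vf_link.resolve().name}")
```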
Today Cycle Computing announced that the company capped off its record-breaking fiscal year by winning the IDC HPC Innovation Excellence Award. IDC recognized Cycle’s 50,000-core utility supercomputer run in the Amazon Web Services (AWS) cloud for pharmaceutical companies Schrödinger and Nimbus Discovery. In this unprecedented run, the cluster completed 12.5 processor-years of computation in less than three hours. Running at a cost of less than $4,900 per hour, the computational drug discovery job was recognized by IDC for its impressive return on investment.
“In an industry that is evolving as rapidly as HPC, it’s fascinating to be a part of the creativity and innovation we’ve seen in the past year,” said Chirag Dekate, an analyst with IDC. “Cycle Computing’s impressive 50,000-core run for Schrödinger and Nimbus Discovery demonstrated a strong ROI from the use of HPC, and we were pleased to recognize their accomplishment.”
Cycle Computing also reported 85 percent client growth in 2012 and has expanded its sales and support staff. Read the Full Story or check out this interview with Cycle Computing CEO Jason Stowe from SC12.
In this video from the Mellanox event at SC12, Anke Kamrath from the National Center for Atmospheric Research (NCAR) presents: The Art of Networking a Petascale System. The video includes remarkable images of networking topology from the Yellowstone supercomputer in Cheyenne, Wyoming.
In this video from the Intel Xeon Phi announcement at SC12, Dr. Dan Duffy of NASA Goddard describes the installation of his IBM iDataPlex M4 servers. Using the IBM Intelligent Cluster process, his team was able to complete the installation in 48 hours, as well as a LINPACK run that landed them at number 52 on the TOP500 supercomputer list.
“These latest editions of Moab demonstrate our continued commitment to improving the user experience of Moab as well as the back-end functionality,” noted Michael Jackson, president of Adaptive Computing. “By integrating the latest technology from other industry leaders into our solutions, we are making HPC systems run more effectively, which means manufacturers and researchers can more quickly bring their discoveries to the world.”
In this podcast, the Radio Free HPC team is still talking about the recently concluded SC12 conference in Salt Lake City. The conversation starts with a short review of Thanksgiving dinner (including disgusting eating noises added in at no additional charge) before moving on to more weighty topics such as Intel’s formal introduction of their Xeon Phi coprocessor, including some performance and price information.
Rich and Henry think that Intel has a strong hand with Phi, but Dan isn’t so sure…
In this wrap-up review of SC12, the Radio Free HPC team discusses the Student Cluster Competition, covering the teams, the results, how the competition has evolved over the years, and where it should go in the future.
In this podcast, the Radio Free HPC team quits griping about the horrible WiFi at SC12 and moves on to a truly big issue: Are LINPACK and HPCC benchmarks useful? Should they be constantly re-evaluated? And shouldn’t you really test machines on the kinds of workloads they’re designed to run?
The catalyst for this discussion is the Blue Waters system, for which no LINPACK numbers have been submitted. Yes, it’s behind schedule, and sure, they’re busy doing the science… but is it also a shot across the bow? Are they rebelling against industry philosophy? If they are, that’s a good thing, according to Henry – because a system is about what you plan to do with it, not how many flops you can get out of it. Rich agrees: if you get a giant LINPACK number on a system that has reliability issues, and you can’t output real science because all your time and money is invested in brute computation, what good is it? And the industry sectors doing meaningful work – where are their systems on the Top500? They’re not playing this game.
Spoiler alert: Henry agrees with Dan on something. Really. It’s at the 10:00 mark, if you’ve got to see it to believe it. We hardly believed it ourselves.
TORQUE, SLURM, and other schedulers/resource managers provide for a periodic “node health check” script to be executed on each compute node to verify that the node is working properly. Nodes that are determined to be “unhealthy” can be marked as down or offline so as to prevent jobs from being scheduled or run on them. This helps increase the reliability and throughput of a cluster by reducing preventable job failures due to misconfiguration, hardware failures, etc. Though many sites have created their own scripts to serve this function, the vast majority are one-off efforts with little attention paid to extensibility, flexibility, reliability, speed, or reuse. The Warewulf developers hope to change that with their Node Health Check project.
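For illustration only, here is a minimal sketch of the kind of checks such a script performs, written in Python rather than shell; it is not the Warewulf NHC project itself, and the threshold and mount point are placeholder values. A resource manager hook (for example, SLURM’s HealthCheckProgram) can act on its output or exit code to take an unhealthy node offline.

```python
#!/usr/bin/env python3
"""Minimal sketch of a compute-node health check (illustrative, not the NHC project)."""
import os
import shutil
import sys

MIN_FREE_TMP_GB = 5          # placeholder threshold
REQUIRED_MOUNT = "/scratch"  # placeholder shared filesystem mount point

def check_tmp_space():
    """Flag the node if local /tmp is running out of space."""
    free_gb = shutil.disk_usage("/tmp").free / 2**30
    if free_gb < MIN_FREE_TMP_GB:
        return f"/tmp has only {free_gb:.1f} GB free"
    return None

def check_mount():
    """Flag the node if the shared filesystem is not mounted."""
    if not os.path.ismount(REQUIRED_MOUNT):
        return f"{REQUIRED_MOUNT} is not mounted"
    return None

errors = [msg for msg in (check_tmp_space(), check_mount()) if msg]
if errors:
    # The resource manager's health check hook can use this output
    # and nonzero exit status to mark the node down or offline.
    print("ERROR: " + "; ".join(errors))
    sys.exit(1)
print("node healthy")
```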