In this video from SC12, Caps-Enterprise CTO François Bodin describes the company’s programming tools for moving applications to the new Intel Xeon Phi.
Last week at SC12, Colfax International was showcasing their new CXP9000, giving the public its first look at a 4U server with up to 8 Intel Xeon Phi coprocessors and 16 Xeon cores. The system boasts nearly 8 Teraflops of peak double precision performance for HPC applications.
“Colfax has consistently been first to market with innovative solutions that maximize performance for our customers in high performance computing (HPC) and enterprise segments,” said Gautam Shah, CEO, Colfax International. “Thanks to our close relationship with Intel and engagement in the early testing program, Colfax is uniquely qualified and positioned to provide a complete portfolio of products to support the Intel Xeon Phi coprocessor.”
Read the Full Story or check out the new Colfax training course on code optimization for Xeon Phi.
Our Video Sunday feature continues with this animated timeline showing the history of the computing industry since the first IEEE/ACM Supercomputing Conference. The SC13 conference in Denver will mark the 25th anniversary of the conference.
As a critique, I think that while this is a visually stunning presentation, the inclusion of trivial non-HPC milestones like 2008’s “The Hulu website is released to the public” is incredibly lame considering the context. Less would have been more this time, guys. If you have to reach that far, why not just add “Stimpy hits the History Eraser Button” from 1991?
Editor’s Note: While the original video was posted on YouTube as a silent movie (probably due to music licensing problems) this version features a public domain performance of Hebrides Overture Fingal’s Cave by Jakob Ludwig Felix Mendelssohn Bartholdy. You can find it and other classical downloads at the MUSOPEN project, a Kickstarter-funded non-profit focused on improving access and exposure to music by creating free resources and educational materials.
We try to keep this page up to date, but you can always find our latest works on our RichReport YouTube Channel.
This year at SC12, we shot over 50 video interviews. It was a lot of work, but we want to bring you the very best of what this amazing conference had to offer.
SC12 Videos (Alphabetical by Vendor Name):
Adaptive Computing
Adaptive Computing SC12 Booth Theater
Aeon
Allinea
Altair
AMD
Asetek
Bull
CAPS-Enterprise
Colfax International
Cycle Computing
DDN
Dell
Gnodal
HPC Advisory Council
IBM
IDC
Inktank
Intel
Intersect360 Research
NAG
Nirvana
Numascale
Nvidia
Mellanox
OpenSFS & EOFS
Panasas
Penguin Computing
Rogue Wave Software
Samplify
SC12 Committee
Scalable Informatics
Seneca
SGI
Solarflare
Spectra Logic
Supermicro
Sugon
Texas Instruments
The Portland Group
VMware
Xyratex
This week Allinea Software announced a new performance analysis tool, Allinea MAP, at SC12 in Salt Lake City. Better known for its high-performance, scalable parallel debugger, Allinea DDT, the company has taken an unusual tack in developing the new product. Working with a range of leading HPC labs, Allinea tried to figure out what could get more people to profile their codes. This collaborative process resulted in some rather unique features in the software. Allinea MAP eschews the classic instrumentation-based MPI timeline in favor of a dynamic sampling engine that is claimed to scale to tens of thousands of processes while adding just 5% to the total runtime.
“We were able to cheat a bit by building on top of Allinea DDT’s infrastructure,” admits David Lecomber, CTO at Allinea Software. “We already had an infrastructure to launch and merge data proven at 275,000 processes. This gave us a common visual interface over the entire Allinea environment as well as saving development time, which let us focus much more on the user experience.”
Customers who sign up during the rest of SC12 will receive exclusive invitations into an extended development phase to fit the product to their users’ needs.
Visit the Allinea booth #2531 at SC12. Read the Full Story.
This week the RSC Group from Russia announced that the new RSC Tornado SUSU supercomputer was deployed at the South Ural State University (SUSU). As Europe’s largest university supercomputing system equipped with Intel Xeon Phi coprocessors, the new system includes 192 computing nodes with direct liquid cooling and 236.8 TFLOPS of performance.
“In recent years Russia has actively developed computing centers at the national level, primarily in the leading universities. The creation of the Europe’s largest university supercomputer with the latest Intel Xeon Phi coprocessors SE10X in South Ural State University further confirms Russian government’s and Leadership’s commitment to improve Russia’s competitiveness by actively developing the intellectual potential of the country and the use of Intel’s state-of-the-art high-performance and energy efficient technology,” said Christian Morales, Intel’s Vice-President and General Manager in EMEA.
Look for the RSC Group booth #4721 at SC12. Read the Full Story.
This week T-Platforms announced that the first Russian-built HPC system in the USA has been delivered to the State University of New York at Stony Brook (SBU). According to SBU, T-Platforms’ scale-out ‘V-Class’ system was chosen for its compute density, power efficiency, sustained performance, and integrated chassis-level management.
“We have developed a unique method to determine the optimal and stable structures of materials that have never existed before,” said Professor Artem Oganov of SBU. “The algorithm that we have tried to recreate is carefully designed, mimicking the one of that exists in the nature. Two approaches might be used when creating new materials. The first is to search for all possible combinations of atoms within a crystal structure. The problem is that the number of variations within the structure is astronomical. We have developed an evolutionary method requiring much less computational effort and [it] shows remarkable reliability. The T-Platforms computing system has demonstrated the highest levels of performance in carrying out tasks using this method, and we are planning to expand the system in the near future.”
Based on T-Platforms’ V5000 enclosure, the 2.5 Petaflop system has been fine-tuned to run VASP quantum mechanics and molecular dynamics software for modeling of atomic-molecular and electron-nuclear systems. Read the Full Story.
Asetek is showcasing a remarkable new liquid-cooled HPC cluster at SC12 with extreme power efficiency, performance and density. In a standard rack, the system packs in 23 2U 4-node Intel H2216WPJR servers, each with dual Xeon E5-2690 CPUs.
Under load, the fully populated rack consumes 37kW of power, all of which is converted into heat. Running at these densities with traditional server air cooling and room air conditioning would be impractical: it would require another 22kW of power just to keep the rack cool (industry average PUE 1.8, pPUE 1.6). In the showcase cluster, all CPU and memory heat (85% of the total server heat generated) is removed by Asetek’s warm water liquid cooling RackCDU system. Because the servers are cooled by warm water, no power goes into actively chilling the liquid. This means that 85% of all server-generated heat is cooled by free ambient air.
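The 22kW figure follows directly from the quoted pPUE. Here is a minimal Python sketch of that arithmetic. The function and variable names are illustrative, and the last step (applying the same 1.6 pPUE to the 15% of heat that is still air-cooled) is an assumption made for comparison, not a number from Asetek.

```python
def cooling_overhead_kw(it_load_kw, ppue):
    """Cooling power implied by a partial PUE: overhead = IT load * (pPUE - 1)."""
    return it_load_kw * (ppue - 1.0)

IT_LOAD_KW = 37.0       # fully populated rack under load (from the text)
PPUE_AIR = 1.6          # industry-average partial PUE quoted in the text
LIQUID_FRACTION = 0.85  # share of server heat removed by warm water (from the text)

# Air cooling for the whole rack: 37 kW * 0.6, the "another 22kW" above.
air_cooled_overhead = cooling_overhead_kw(IT_LOAD_KW, PPUE_AIR)

# Heat left for conventional air cooling once the liquid loop is in place.
residual_heat = IT_LOAD_KW * (1.0 - LIQUID_FRACTION)

# ASSUMPTION: the residual air-cooled heat is handled at the same pPUE of 1.6.
residual_overhead = cooling_overhead_kw(residual_heat, PPUE_AIR)

print(f"Air cooling for the full rack:  {air_cooled_overhead:.1f} kW")
print(f"Heat still cooled by air:       {residual_heat:.2f} kW")
print(f"Air cooling for that remainder: {residual_overhead:.2f} kW")
```

Under that assumption, the chilled-air burden drops from roughly 22kW to a few kilowatts, which is the density argument the showcase is making.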
Formal presentations and live demonstrations will be available at Asetek’s booth #4045. Read the Full Story.
This week DataDirect Networks (DDN) announced the SFA7700, a hybrid flash storage appliance with a unique ability to anticipate and optimize the workloads of big data-intensive applications.
“DDN has extended its SFA technology to feature even greater levels of efficiency and modularity,” said Henry Baltazar, Senior Analyst, 451 Research. “With appliance-level integration into DDN’s file storage technology and new automation with DDN’s cloud collaboration tools, the SFA7700 is an ideal foundation to DDN’s Big Data portfolio and will enable organizations to ingest, process, store and distribute data with simplicity and scalability.”
According to the company, the SFA7700 hybrid storage appliance is an ideal entry-level system for organizations with big data storage needs. It can start small, supporting 60 SSDs and/or HDDs in a 4U rack for a maximum capacity of 240TB, and can expand to support up to 396 disks in a 20U rack for a maximum capacity of 960TB. The SFA7700 also allows organizations to migrate data to a public or private cloud when integrated with DDN’s Web Object Scaler (WOS) cloud storage appliance, facilitating file sharing and collaboration. Read the Full Story.
Today Colfax International announced a new set of developer training programs for the new Intel Xeon Phi coprocessor. As part of the course material, Colfax Developer Training (CDT) will discuss the applicability of the Intel many-core technology, demonstrate the programming models for Intel Xeon Phi coprocessor including native execution and offload-based approaches, and provide extensive optimization techniques.
“As developers look to quickly and efficiently harness the power of the Intel Xeon Phi coprocessor, Colfax is excited to announce a complete development solution including wide range of training programs for both novices and experts, complemented by professional workstations,” said Gautam Shah, CEO and President of Colfax International. “We have worked with Intel’s Xeon Phi team for close to two years, acquiring significant expertise and are therefore uniquely qualified and positioned to provide Intel Xeon Phi training and systems for highly parallel computing workloads.”
Colfax will demonstrate the Intel Xeon Phi coprocessor-based developer workstation and register attendees for its training class at SC12 in the Colfax booth #2409, and will give a theater presentation, “Maximizing Performance with Intel Xeon Phi Coprocessor,” at the Intel booth #2601. Read the Full Story.
Samplify is a relatively new company in the HPC market, but its APAX compression technology is already making waves with both software and hardware approaches. To learn more, I caught up with Samplify’s CTO, Al Wegener, author of a new white paper that details how users can apply the APAX Profiler to increase application performance.
insideHPC: What is the APAX white paper about?
Al Wegener: The APAX white paper describes the Memory Wall problem associated with high-performance computing (HPC), where additional CPU and GPU cores don’t generate faster results because you can’t “feed the beasts” (HPC processors) with operands from memory quickly enough. The paper describes a novel solution (encoding of numerical operands) that results in a measured Memory Wall reduction between 3:1 and 8:1 on HPC applications as diverse as multi-physics, climate modeling, and k-means clustering. The APAX encoder works with the APAX Profiler tool to give HPC users new insight into the uncertainty of their input datasets. By encoding operands in software (today) and in memory controller hardware (soon), APAX numerical encoding gives HPC users an adaptive, controllable, and flexible way to reduce DDR, PCIe, Ethernet, Infiniband, and SAS/SATA bottlenecks by 3x to 10x.
insideHPC: What is the APAX Profiler?
Al Wegener: HPC datasets contain both uncertainty and redundancy. While HPC scientists may think their sensor-derived 32-bit or 64-bit data is perfect, typical HPC datasets pick up a lot of noise between the analog sensor and the multi-core CPU or GPU. The APAX Profiler software tool (also available on the Samplify web site) allows HPC users to upload their datasets in order to quantify uncertainties, and to determine the Profiler-recommended APAX encoding operating point that results in “five nines” (0.99999) of correlation between the original dataset and the decoded dataset. For many HPC datasets, “five nines” of decoded quality comes with encoding rates above 3:1, thus reducing the HPC Memory Wall while delivering identical HPC simulation results.
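To make the “five nines” criterion concrete, here is a minimal Python sketch that searches for the coarsest encoding whose decoded output still correlates with the original at 0.99999 or better. It does not reproduce the APAX encoder or Profiler: the `quantize` function is a hypothetical stand-in (plain uniform quantization), and the synthetic `original` dataset and all function names are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def quantize(values, bits):
    """Hypothetical lossy encode/decode round trip: uniform quantization
    of the data range to 2**bits levels. NOT the APAX algorithm."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + round((v - lo) / step) * step for v in values]

def fewest_bits_for(values, target=0.99999, max_bits=32):
    """Smallest bit width whose decoded data meets the correlation target."""
    for bits in range(2, max_bits + 1):
        if pearson(values, quantize(values, bits)) >= target:
            return bits
    return None

# Synthetic stand-in for a sensor-derived dataset (assumption).
original = [math.sin(0.01 * i) + 0.5 * math.sin(0.037 * i) for i in range(5000)]

bits = fewest_bits_for(original)
print("bits per sample needed for five nines:", bits)
print("implied ratio versus 32-bit floats:", 32 / bits)
```

The search loop is the point here: the operating point is chosen by measuring decoded quality against a fixed target rather than by picking a compression ratio up front.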
insideHPC: What is the “overcasting” problem and how does APAX help?
Al Wegener: Many HPC simulations, including climate, multi-physics, earthquake, genetic sequencing, and finite element analysis, begin and/or end with real-world sensor measurements. HPC simulations use sensor input to make predictions about the future, but HPC predictions must be compared to the “real world” via subsequent sensor measurements. Sensors generate integer values, but HPC simulations usually use 32-bit and 64-bit floats for computation. “Overcasting” is the tendency in HPC to cast integer values (often with 12 integer bits or less of quality) into floating-point values, without recognizing that the resulting float has been “overcast,” i.e. contains uncertainty that is not reflected in the 32-bit float. The APAX Profiler quantifies the degree of overcasting in HPC datasets by using spectral techniques (FFTs). After recommending an appropriate level of accuracy (uncertainty) for each dataset, the Profiler allows APAX users to fine-tune the accuracy of each dataset while significantly reducing bandwidth and storage requirements.
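Overcasting is easy to see at the bit level. The Python sketch below casts every nonzero 12-bit integer to a 32-bit float and counts how many low-order mantissa bits are always zero. This is a simple bit-counting illustration, not the FFT-based analysis the Profiler is said to use, and the helper names are made up for the example.

```python
import struct

def float32_mantissa(value):
    """Return the 23-bit mantissa field of `value` stored as an IEEE 754 float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return bits & 0x7FFFFF

def trailing_zero_bits(mantissa, width=23):
    """Number of low-order zero bits in the mantissa field."""
    if mantissa == 0:
        return width
    return (mantissa & -mantissa).bit_length() - 1

# Every nonzero 12-bit sensor value, cast to float32.
samples = range(1, 4096)
min_unused = min(trailing_zero_bits(float32_mantissa(float(s))) for s in samples)

print("mantissa bits that are always zero:", min_unused, "of 23")
```

For 12-bit samples, at least 12 of the 23 mantissa bits carry no information at all, yet they are still moved through memory, the network, and disk along with the bits that matter.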
insideHPC: How is APAX technology a potential enabler for Exascale computing?
Al Wegener: According to the US DARPA Exascale study (2008), Exascale has memory, network, and disk problems, not compute problems. According to DARPA, in order to deliver 10^18 flops per second (Exascale), DDR3 memory would have to get 16x faster, while disk drives would have to get 100x faster. By encoding HPC operands (numbers) as they are transferred between multi-core CPU and GPU sockets and DDR, network, and disk drives, APAX reduces the DDR, network, and disk drive bottlenecks of Exascale by user-controllable factors between 3x and 8x.
insideHPC: How does APAX encoding save energy and reduce cloud computing costs?
Al Wegener: Cloud computing depends on cloud-based hardware, but cloud users have to send their data to the cloud and then they have to download the results. By reducing both upload and download costs for users of HPC-on-demand services like Amazon EC2 and Microsoft Azure, APAX saves cloud users both time and money. In addition, experienced cloud users know that CPUs only draw about 40% of server power, while the other 60% is dissipated by DDR memory and disk drives. When APAX reduces DDR and disk bottlenecks, HPC users get their result faster, which reduces cloud-based energy usage. In one memory-bound HPC application, APAX 4:1 encoding resulted in a 3.8x speed-up in “time to results,” and thus a 3.8x reduction in server power consumption.
insideHPC: How is APAX effectively lossless?
Al Wegener: Since sensor samples often comprise the source material for HPC simulations, it’s important to recognize that floating-point numbers are using more bits than required to represent the dynamic range of integer samples. The APAX profiler quantifies the degree to which HPC datasets were overcast and encodes those datasets into “simply the bits that matter.” As APAX beta-testers in HPC climate, multi-physics, and earthquake simulations have verified, their HPC simulation results are identical, but the results come out faster. That’s what Samplify calls “effectively lossless” encoding – the size of HPC input and intermediate datasets is reduced by 3x to 8x, but the results remain the same.
Samplify will demonstrate APAX next week at SC12 booth #4151.
by Natalie Bates, Co-chair Energy Efficient HPC Working Group (EE HPC WG)
Energy efficiency will again be a hot topic at SC12, with at least 38 Technical Program sessions focused on energy efficiency. A complete list of these sessions organized both chronologically and by topic can be found on the Energy Efficient HPC Working Group website. SC12, the annual International Conference for High Performance Computing, Networking, Storage and Analysis, will be held Nov. 10-16 in Salt Lake City, Utah. For more information, see the SC12 website.
BROAD SCOPE SESSIONS
“The Third Annual Workshop on Energy Efficient High Performance Computing – Redefining System Architecture and Data Centers” promises to be interesting to a broad audience. Featured speakers include Peter Kogge, University of Notre Dame, who will look at the historical trends of power, energy and supercomputing; John Shalf, Lawrence Berkeley National Laboratory, whose talk will focus on the energy requirements for applications; and Herbert Huber, Leibniz Supercomputing Center, and Steve Hammond, National Renewable Energy Laboratory, who will speak about energy efficient data centers.
There are four other technical programs that will cover the topic of energy efficiency at a high level. Kirk Cameron, Virginia Tech, is on the slate to give two talks, both of which have clever and enticing titles with phrases about a “Growing Power Struggle” and “Energy Oddities.” Prohibitive energy costs motivated Thomas Ludwig, German Climate Computing Center, to consider the costs and benefits of “HPC-Based Science in the Exascale Era.” Finally, there is a “Cool Supercomputing” Birds of a Feather (BoF) session organized by Pacific Northwest National Laboratory that covers tools and techniques for optimizing energy consumption at all levels.
“Setting Trends for Energy Efficiency” is a BoF representing a collaborative effort by the Top500, Green500, the Energy Efficient HPC Working Group and The Green Grid to standardize the power measurement methodology used when running system workloads for architectural comparison, such as High Performance Linpack. This is one of seven sessions that cover energy efficiency measures and metrics. The Green500, Top500 and now the Graph500 have their own BoFs and will report power consumption and energy efficiency as well as performance for their Lists. The High Performance Group at the Standard Performance Evaluation Corporation (SPEC) has also organized a BoF that will discuss a new OpenMP benchmark suite with an optional energy metric that scales to 512 threads. From the home of the Green500 at Virginia Tech, Balaji Subramaniam will present his doctoral showcase on metrics for energy efficiency. Finally, an Intel team will present a paper on tuning for the Graph500 Traversal which includes both performance and energy efficiency results.
SESSIONS FOCUSSED ON SYSTEM HARDWARE
Thirteen of the sessions explore system hardware energy efficiency. Seven of these focus on alternative processors like GPU and ARM that continue the trend towards aggregating low-power processors and using accelerators. There are three BoFs that explore alternative processors, and all three are organized by Europeans. The Partnership for Advanced Computing in Europe (PRACE) explores a set of prototypes to test and evaluate promising new technologies for future multi-Petaflop/s systems that include GPUs, ARM processors, DSPs and FPGAs. The Barcelona Supercomputing Center is heading up an ARM-based exascale demonstration and will review its research results and plans at two BoFs: “Energy Efficient HPC” and “Exascale Research- The European Approach.” Besides these BoFs, there is a session as part of Broader Engagement where Calxeda, an ARM-based server provider, will present its products and roadmaps. NEC is presenting an exhibitor forum on “Hybrid Solutions with a Vector-Architecture for Efficiency.” There is also a paper on “Multi-Core DSP” and a poster on modeling “Power-Performance Efficiency” for GPUs.
A new topic for SC this year is a focus on memory technologies, which was presaged by a keynote at the International Supercomputing Conference held in Hamburg, Germany last June, when Dr. Byungse So, Samsung Senior Vice President, gave a talk on “Advanced Memory Technology – #1 Factor for Energy Efficient HPC”. Two papers, RAMZzz and Mage, both explore novel memory system designs. Samsung and Micron are presenting exhibitor forums on “How Memory and SSDs can Optimize Data Center Operations” and “Hybrid Memory Cube (HMC)”, respectively.
While interest in memory is on the rise, the focus on liquid cooling has waned, with only two sessions this year compared to six last year at SC’11. Eurotech will present an exhibitor forum on “Differences Between Cold and Hot Water Cooling on CPU and Hybrid Supercomputers” and Green Revolution Cooling will present on “100% Server Heat Recapture in Data Centers is Now a Reality.”
DATA CENTER SESSIONS
Kimberly Cupps, Lawrence Livermore National Laboratory, will present on “The Sequoia System and Facilities Integration Story”. It appears that she will be giving the same talk at two different sessions: on Monday during Broader Engagement and on Tuesday as an Invited Speaker. Also, the M+W Group will present an exhibitor forum on “Reducing First Costs and Improving Future Flexibility in the Construction of High Performance Computing Facilities.”
APPLICATION TUNING AND JOB SCHEDULING
There are nine sessions that describe research on tuning applications for energy efficiency and various aspects of energy efficient job scheduling. Seven of the nine sessions are doctoral showcases, papers or posters. There is a BoF on “Power and Energy Measurement Modeling”. In this BoF, members of the research community and industry will present the current state of the art and limitations in measuring and modeling power and energy consumption and their effect on HPC application performance. An open discussion about future directions for such work will follow, with the intention of creating a “wish list” of feature requests to HPC vendors. Another BoF of interest is the SLURM User Group Meeting, which covers the open source SLURM job scheduler. Also, Charles Lively, ORNL, will give a talk during Broader Engagement on “Heading Towards Exascale – Techniques to Improve Application Performance and Energy Consumption Using Application-Level Tools”.
OTHER SESSIONS
Two other sessions that will cover energy efficiency are an all-day workshop on “High Performance Computing, Networking and Analytics for the Power Grid” and a poster on “Pay as You Go in the Cloud: One Watt at a Time.”
Although this is a list of sessions with a specific focus on energy efficiency, many more sessions will include energy efficiency as part of a broader focus.
Today Appro introduced a new Xtreme-Cool supercomputer that provides up to three times more energy efficiency per rack in a data center versus air-cooled designs. With its blade architecture, the Appro Xtreme-Cool Supercomputer can reportedly scale to 25 Petaflops of performance and be configured as Fat Tree or 3D Torus architecture with interconnect options for single or dual rail, InfiniBand or Ethernet making it optimized for superior application performance.
“Customers who are pressing the state of the art in scientific discovery are looking for not only outstanding performance and energy-efficiency, but also programmability and manageability,” said Dr. Rajeeb Hazra, VP Intel Architecture Group and GM Technical Computing, Datacenter and Connected Systems Group. “The Appro Xtreme-Cool meets those needs by combining the power of Intel Xeon processor E5 family with the programmability and energy efficiency of the Intel Many Integrated Core (Intel MIC) architecture based Intel Xeon Phi coprocessors. This combination of technologies establishes a new standard for both programmer productivity and performance per watt.”
Appro will showcase the Xtreme-Cool technology at SC12 booth #2443 in Salt Lake City, Utah – November 12-15.
Big Data is getting its own quarterly journal with a little help from Chief Editor Edd Dumbill.
Big Data, a highly innovative, open access peer-reviewed journal, provides a unique forum for world-class research exploring the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data, including data science, big data infrastructure and analytics, and pervasive computing. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and enforce effective efforts to organize, store, disseminate, protect, manipulate, and, most importantly, find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government.
Read the Full Story.
This week SGI announced it has been chosen by Air Liquide to provide a 15 Teraflop HPC solution. As a long-standing SGI client, Air Liquide is installing the third upgrade of the SGI ICE blade server initially installed in 2008.
“High Performance Computing requirements are steadily increasing in our sector and are allowing for accelerated innovation in the areas of energy, environment and health,” said Frédéric Camy-Peyret, Modelling & Numerical Simulation R&D program director for Air Liquide. “With simulations that are increasingly reliable and detailed, our molecular modelling and fluid dynamics applications require not only innovative computing architectures, but also high-performance storage. This is a comprehensive solution that SGI can provide.”
The new upgrade is based on Intel Xeon E5-2670 processors and Mellanox FDR InfiniBand Non-blocking Interconnect technology. The system is configured in a dual-plane ‘hypercube’ topology, one of four topologies supported by SGI ICE X, and contains several hundred Xeon cores and 2.3 terabytes of memory. Read the Full Story.
insideHPC.com is a production of insideHPC, LLC. © 2006-2013