Coming in the second half of 2016: The HPE Apollo 6500 System provides the tools and the confidence to deliver high performance computing (HPC) innovation. The system consists of three key elements: the HPE ProLiant XL270 Gen9 Server tray, the HPE Apollo 6500 Chassis, and the HPE Apollo 6000 Power Shelf. Although final configurations and performance figures are not yet available, the system appears capable of delivering over 40 teraflop/s in double precision, and significantly more in single or half precision modes.
George Slota presented this talk at the Blue Waters Symposium. “In recent years, many graph processing frameworks have been introduced with the goal to simplify analysis of real-world graphs on commodity hardware. However, these popular frameworks lack scalability to modern massive-scale datasets. This work introduces a methodology for graph processing on distributed HPC systems that is simple to implement, generalizable to broad classes of graph algorithms, and scales to systems with hundreds of thousands of cores and graphs of billions of vertices and trillions of edges.”
Researchers from the RAND Corporation and LLNL have joined forces to combine HPC with innovative public policy analysis to improve planning for particularly complex issues such as water resource management. By using supercomputer simulations, the participants were able to customize and speed up the analysis guiding the deliberations of decision makers. “In the latest workshop we performed and evaluated about 60,000 simulations over lunch. What would have taken about 14 days of continuous computations in 2012 was completed in 45 mins — about 500 times faster,” said Ed Balkovich, senior information scientist at the RAND Corporation, a nonprofit research organization.
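The "about 500 times faster" figure in the quote can be checked with a little arithmetic. The sketch below uses only the two numbers given in the quote (14 days and 45 minutes); the variable names are ours, and it assumes both figures describe the same workload.

```python
# Rough check of the quoted speedup: 14 days of continuous computation
# reduced to 45 minutes. Both figures come from the quote above.
MINUTES_PER_DAY = 24 * 60

baseline_minutes = 14 * MINUTES_PER_DAY   # 2012 estimate, converted to minutes
accelerated_minutes = 45                  # workshop run time

speedup = baseline_minutes / accelerated_minutes
print(f"speedup: {speedup:.0f}x")
```

The ratio comes out a little under 450, which is consistent with the rounded "about 500 times" in the quote.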
“With up to 72 processing cores, the Intel Xeon Phi processor x200 can accelerate applications tremendously. Each core contains two Advanced Vector Extensions, which speeds up the floating point performance. This is important for machine learning applications which in many cases use the Fused Multiply-Add (FMA) instruction.”
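To see why multiply-add matters for machine learning, consider the weighted sum at the heart of a neural network layer: it is a chain of `a * b + c` steps, which is the operation an FMA instruction performs in one go. The sketch below is illustrative only. The function names are hypothetical, and plain Python does not itself guarantee that a hardware FMA instruction is issued; the point is the shape of the computation that vectorized FMA hardware accelerates.

```python
def multiply_add(a, b, c):
    """Return a * b + c, the operation a fused multiply-add computes in one step."""
    return a * b + c


def neuron_output(weights, inputs, bias):
    """Weighted sum of inputs plus bias, written as a chain of multiply-adds."""
    acc = bias
    for w, x in zip(weights, inputs):
        # One multiply-add per weight/input pair.
        acc = multiply_add(w, x, acc)
    return acc


result = neuron_output([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 0.5)
print(result)  # 32.5
```

A layer with thousands of inputs repeats this step thousands of times per neuron, which is why hardware that performs many multiply-adds per cycle speeds up such workloads.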
Wen-mei Hwu from the University of Illinois at Urbana-Champaign presented this talk at the Blue Waters Symposium. “In the 21st Century, we are able to understand, design, and create what we can compute. Computational models are allowing us to see even farther, going back and forth in time, learn better, test hypothesis that cannot be verified any other way, and create safe artificial processes.”
This week Nvidia CEO Jen-Hsun Huang hand-delivered one of the company’s new DGX-1 Machine Learning supercomputers to the OpenAI non-profit in San Francisco. “The DGX-1 is a huge advance,” OpenAI Research Scientist Ilya Sutskever said. “It will allow us to explore problems that were completely unexplored before, and it will allow us to achieve levels of performance that weren’t achievable.”
In this video from the 2016 Intel Developer Forum, Diane Bryant describes the company’s efforts to advance Machine Learning and Artificial Intelligence. Along the way, she offers a sneak peek at the Knights Mill processor, the next generation of Intel Xeon Phi slated for release sometime in 2017. “Now you can scale your machine learning and deep learning applications quickly – and gain insights more efficiently – with your existing hardware infrastructure. Popular open frameworks newly optimized for Intel, together with our advanced math libraries, make Intel Architecture-based platforms a smart choice for these projects.”
Deep learning solutions are typically part of a broader high performance analytics function in for-profit enterprises, with a requirement to deliver a fusion of business and data requirements. In addition to supporting large-scale deployments, industrial solutions typically require portability, support for a range of development environments, and ease of use.
“Few fields are moving faster right now than deep learning,” writes Buck. “Today’s neural networks are 6x deeper and more powerful than just a few years ago. There are new techniques in multi-GPU scaling that offer even faster training performance. In addition, our architecture and software have improved neural network training time by over 10x in a year by moving from Kepler to Maxwell to today’s latest Pascal-based systems, like the DGX-1 with eight Tesla P100 GPUs. So it’s understandable that newcomers to the field may not be aware of all the developments that have been taking place in both hardware and software.”
Today Cycle Computing announced its continued involvement in optimizing research spearheaded by NASA’s Center for Climate Simulation (NCCS) and the University of Minnesota. Currently, a biomass measurement effort is underway in a coast-to-coast band of Sub-Saharan Africa. An over 10 million square kilometer region of Africa’s trees, a swath of acreage bigger than the entirety […]