
Video: Optimizing Applications for the CORI Supercomputer at NERSC

In this video from SC15, NERSC shares its experience on optimizing applications to run on the new Intel Xeon Phi processors (code name Knights Landing) that will empower the Cori supercomputer by the summer of 2016. “A key goal of the Cori Phase 1 system is to support the increasingly data-intensive computing needs of NERSC users. Toward this end, Phase 1 of Cori will feature more than 1,400 Intel Haswell compute nodes, each with 128 gigabytes of memory per node. The system will provide about the same sustained application performance as NERSC’s Hopper system, which will be retired later this year. The Cori interconnect will have a dragonfly topology based on the Aries interconnect, identical to NERSC’s Edison system.”
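The node and memory figures quoted above imply a sizable aggregate memory pool for Cori Phase 1. A quick back-of-envelope calculation (using only the numbers in the excerpt; the exact node count is stated as "more than 1,400") illustrates the scale:

```python
# Back-of-envelope aggregate memory for Cori Phase 1, from the figures
# quoted above: 1,400+ Intel Haswell nodes at 128 GB each.
# 1,400 is the lower bound given in the excerpt.

nodes = 1400
gb_per_node = 128

total_gb = nodes * gb_per_node
print(f"Aggregate memory: {total_gb} GB (~{total_gb / 1000:.0f} TB)")
```

So Phase 1 alone carries roughly 179 TB of DRAM across the Haswell partition.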

Creating an Exascale Ecosystem Under the NSCI Banner

“We expect NSCI to run for the next two decades. It’s a bit audacious to start a 20-year project in the last 18 months of an administration, but one of the things that gives us momentum is that we are not starting from a clean sheet of paper. There are many government agencies already involved, and what we’re really doing is increasing their coordination and collaboration. We will also be working very hard over the next 18 months to build momentum and establish new working relationships with academia and industry.”

Second Intel Parallel Computing Center Opens at SDSC

Intel has opened a second parallel computing center at the San Diego Supercomputer Center (SDSC), at the University of California, San Diego. The focus of this new engagement is on earthquake research, including detailed computer simulations of major seismic activity that can be used to better inform and assist disaster recovery and relief efforts.

Video: Theta & Aurora – Big Systems for Big Science

“Aurora’s revolutionary architecture features Intel’s HPC scalable system framework and 2nd generation Intel Omni-Path Fabric. The system will have a combined total of over 8 Petabytes of on package high bandwidth memory and persistent memory, connected and communicating via a high-performance system fabric to achieve landmark throughput. The nodes will be linked to a dedicated burst buffer and a high-performance parallel storage solution. A second system, named Theta, will be delivered in 2016. Theta will be based on Intel’s second-generation Xeon Phi processor and will serve as an early production system for the ALCF.”

Video: Supercomputing at the University at Buffalo

In this WGRZ video, researchers describe supercomputing at the Center for Computational Research at the University at Buffalo. “The Center’s extensive computing facilities, which are housed in a state-of-the-art 4000 sq ft machine room, include a generally accessible (to all UB researchers) Linux cluster with more than 8000 processor cores and QDR InfiniBand, a subset (32) of which contain (64) NVIDIA Tesla M2050 “Fermi” graphics processing units (GPUs).”

Video: AMD’s next Generation GPU and High Bandwidth Memory Architecture

“HBM is a new type of CPU/GPU memory (“RAM”) that vertically stacks memory chips, like floors in a skyscraper. In doing so, it shortens your information commute. Those towers connect to the CPU or GPU through an ultra-fast interconnect called the “interposer.” Several stacks of HBM are plugged into the interposer alongside a CPU or GPU, and that assembled module connects to a circuit board. Though these HBM stacks are not physically integrated with the CPU or GPU, they are so closely and quickly connected via the interposer that HBM’s characteristics are nearly indistinguishable from on-chip integrated RAM.”

Video: Meet IME – The World’s First Burst Buffer

“DDN’s IME14K revolutionizes how information is saved and accessed by compute. IME software allows data to reside next to compute in a very fast, shared pool of non-volatile memory (NVM). This new data adjacency significantly reduces latency by allowing IME software’s revolutionary, fast data communication layer to pass data without the file locking contention inherent in today’s parallel file systems.”

Video: Altera’s Stratix 10 – 14nm FPGA Targeting 1GHz Performance

In this video from the 2015 Hot Chips conference, Mike Hutton from Altera presents “Stratix 10: Altera’s 14nm FPGA Targeting 1GHz Performance.” “Stratix 10 FPGAs and SoCs deliver breakthrough advantages in performance, power efficiency, density, and system integration: advantages that are unmatched in the industry. Featuring the revolutionary HyperFlex core fabric architecture and built on the Intel 14 nm Tri-Gate process, Stratix 10 devices deliver 2X core performance gains over previous-generation, high-performance FPGAs with up to 70% lower power.”

Chalk Talk: What is a Data Lake?

“If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.” These “data lake” systems will hold massive amounts of data and be accessible through file and web interfaces. Data protection for data lakes will consist of replicas and will not require backup, since the data is not updated. Erasure coding will be used to protect large data sets and enable fast recovery. Open source software will be used to reduce licensing costs, and compute systems will be optimized for MapReduce analytics. Automated tiering will be employed to meet performance and long-term retention requirements. Cold storage – storage that does not require power for long-term retention – will be introduced in the form of tape or optical media.
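The trade-off between replicas and erasure coding mentioned above comes down to storage overhead versus fault tolerance. A minimal sketch makes the comparison concrete; the (10, 4) data/parity split is an illustrative assumption, not a figure from any specific product:

```python
# Sketch: storage overhead of 3x replication vs. an erasure-coded layout,
# as contrasted in the data-lake discussion above. The (k=10, m=4) split
# is a hypothetical example layout, not from any specific system.

def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data with full replicas."""
    return float(copies)

def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per user byte with k data + m parity fragments.
    The object survives the loss of any m fragments."""
    return (k + m) / k

rep = replication_overhead(3)   # 3-way replicas: 3.0x raw storage
ec = erasure_overhead(10, 4)    # (10+4)/10: 1.4x raw storage

print(f"3x replication:      {rep:.1f}x raw storage, tolerates 2 lost copies")
print(f"(10,4) erasure code: {ec:.1f}x raw storage, tolerates 4 lost fragments")
```

For large, mostly immutable data sets like those in a data lake, erasure coding protects against more simultaneous failures while consuming less than half the raw capacity of triple replication.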

Video: Bill Dally on Scaling Performance in the Post-Dennard Era

“As I indicated in my keynote this morning, there are two really fundamental challenges we’re facing over the next two years in all sorts of computing, from supercomputers to cell phones. The first is energy efficiency. With the end of Dennard scaling, we’re no longer getting a big improvement in performance per watt from each technology generation. The per-generation performance improvement has dropped from a factor of 2.8×, back when we used to scale supply voltage with each new generation, to about 1.3× in the post-Dennard era. With this comes a real challenge for us to come up with architectural techniques and circuit techniques for better performance per watt.”
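Because these per-generation factors compound, the gap Dally describes grows dramatically over successive process nodes. A short sketch using only the 2.8× and 1.3× figures from the quote shows the effect; the five-generation horizon is an arbitrary illustration:

```python
# Compound performance-per-watt gain over several process generations,
# using the per-generation factors quoted above: 2.8x with Dennard
# (supply-voltage) scaling vs. ~1.3x post-Dennard.
# The choice of 5 generations is illustrative.

def cumulative_gain(per_gen: float, generations: int) -> float:
    """Total perf/watt multiplier after compounding over N generations."""
    return per_gen ** generations

gens = 5
dennard = cumulative_gain(2.8, gens)   # ~172x over five generations
post = cumulative_gain(1.3, gens)      # ~3.7x over five generations

print(f"Dennard-era scaling ({gens} gens):  {dennard:.0f}x perf/watt")
print(f"Post-Dennard scaling ({gens} gens): {post:.1f}x perf/watt")
print(f"Shortfall architects must close:    {dennard / post:.0f}x")
```

Over five generations, process scaling alone now delivers under 4× where it once delivered over 170×; the difference is what Dally argues must come from architecture and circuit techniques.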