Paving the Way for Theta and Aurora

In this special guest feature, John Kirkley writes that Argonne is already building code for their future Theta and Aurora supercomputers based on Intel Knights Landing. “One of the ALCF’s primary tasks is to help prepare key applications for two advanced supercomputers. One is the 8.5-petaflops Theta system based on the upcoming Intel® Xeon Phi™ processor, code-named Knights Landing (KNL) and due for deployment this year. The other is a larger 180-petaflops Aurora supercomputer scheduled for 2018 using Intel Xeon Phi processors, code-named Knights Hill. A key goal is to solidify libraries and other essential elements, such as compilers and debuggers that support the systems’ current and future production applications.”

PreFetch for Intel Xeon Phi – Part 2

“An interesting aspect to prefetching is the distance ahead of the data that is being used to prefetch more data. This is a critical parameter for success and can be defined as how many iterations ahead to issue a prefetch instruction, and can be referred to as the distance. A compiler will automatically determine the distance to prefetch, and can be determined by looking at the compiler optimization reports.”
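The idea of a prefetch distance can be illustrated with a minimal sketch. It is not taken from the article: it uses the `__builtin_prefetch` builtin available in GCC and Clang rather than anything specific to the Intel compiler, and both the `PREFETCH_DISTANCE` value of 16 and the `sum_with_prefetch` function name are hypothetical choices for illustration. In practice the best distance depends on the loop body and the memory latency of the target, and a compiler may already insert equivalent prefetches on its own.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many loop iterations ahead to prefetch.
   The value 16 is an arbitrary assumption, not a recommendation. */
#ifndef PREFETCH_DISTANCE
#define PREFETCH_DISTANCE 16
#endif

/* Sum an array while requesting the element PREFETCH_DISTANCE iterations
   ahead of the one currently being read. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = prefetch for read,
               3 = high temporal locality (keep in cache). */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += a[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it. Too small a distance means the data may not have arrived by the time it is needed, while too large a distance risks the line being evicted before use, which is why the distance is treated as a tuning parameter.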

A Look at HPC and Hyperscale Trends for 2016

“In this talk, Intersect360 Research returns with an annual deep dive into the trends, technologies and usage models that will be propelling the HPC community through 2016 and beyond. Emerging areas of focus and opportunities to expand will be explored along with insightful observations needed to support measurably positive decision making within your operations.”

Prefetching Data for Intel Xeon Phi

“Prefetching on a coprocessor such as the Intel Xeon Phi coprocessor can be more important than on a main CPU such as the Intel Xeon CPUs. Since the cores on the Intel Xeon Phi coprocessor are in-order, they cannot hide memory latency as compared to an out-of-order CPU. In addition, since a coprocessor does not have an L3 cache, L2 misses must then access the slower memory subsystem.”

Advanced Hands-On OpenCL Tutorial To Kick-Off IWOCL 2016

Registration is now open for the Advanced Hands-On OpenCL Tutorial at the IWOCL 2016 conference. The tutorial focuses on advanced OpenCL concepts and is an extension of the highly successful “Hands on OpenCL” course, which has received over 6,500 downloads from GitHub. Simon McIntosh-Smith, Associate Professor in High Performance Computing at the University of Bristol and one of the tutorial authors, will lead the sessions.

Changes Afoot from the HPC Crystal Ball

In this special guest feature from Scientific Computing World, Andrew Jones from NAG looks ahead at what 2016 has in store for HPC and finds people, not technology, to be the most important issue. “A disconcertingly large proportion of the software used in computational science and engineering today was written for friendlier and less complex technology. An explosion of attention is needed to drag software into a state where it can effectively deliver science using future HPC platforms.”

OpenMP and OpenCL on Intel Xeon Phi

“In a heterogeneous system that combines both the Intel Xeon CPU and the Intel Xeon Phi coprocessor, there are various options available to optimize applications. Whether one has an advantage over another is somewhat dependent on the application that is being run. Comparisons can be made comparing the two methods, as long as the algorithm lends itself to run and take advantage of either OpenMP or OpenCL.”

Video: Optimizing Applications for the CORI Supercomputer at NERSC

In this video from SC15, NERSC shares its experience optimizing applications to run on the new Intel Xeon Phi processors (code name Knights Landing) that will power the Cori supercomputer by the summer of 2016. “A key goal of the Cori Phase 1 system is to support the increasingly data-intensive computing needs of NERSC users. Toward this end, Phase 1 of Cori will feature more than 1,400 Intel Haswell compute nodes, each with 128 gigabytes of memory per node. The system will provide about the same sustained application performance as NERSC’s Hopper system, which will be retired later this year. The Cori interconnect will have a dragonfly topology based on the Aries interconnect, identical to NERSC’s Edison system.”

Video: Theta & Aurora – Big Systems for Big Science

“Aurora’s revolutionary architecture features Intel’s HPC scalable system framework and 2nd generation Intel Omni-Path Fabric. The system will have a combined total of over 8 Petabytes of on package high bandwidth memory and persistent memory, connected and communicating via a high-performance system fabric to achieve landmark throughput. The nodes will be linked to a dedicated burst buffer and a high-performance parallel storage solution. A second system, named Theta, will be delivered in 2016. Theta will be based on Intel’s second-generation Xeon Phi processor and will serve as an early production system for the ALCF.”

Call for Contributions: Hot Chips 2016

The Hot Chips 2016 conference has issued its Call for Proposals. The event takes place August 21-23 in Cupertino, California. “Presentations at HOT CHIPS are in the form of 30 minute talks using PowerPoint or PDF. Presentation slides will be published in the HOT CHIPS Proceedings. Participants are not required to submit written papers, but a select group will be invited to submit a paper for inclusion in a special issue of IEEE Micro.”