“Bridges’ new nodes add large-memory and GPU resources that enable researchers who have never used high-performance computing to easily scale their applications to tackle much larger analyses,” says Nick Nystrom, principal investigator in the Bridges project and Senior Director of Research at PSC. “Our goal with Bridges is to transform researchers’ thinking from ‘What can I do within my local computing environment?’ to ‘What problems do I really want to solve?’”
“The multidisciplinary research team and computational facilities –including MareNostrum– make BSC an international centre of excellence in e-Science. Since its establishment in 2005, BSC has developed an active role in fostering HPC in Spain and Europe as an essential tool for international competitiveness in science and engineering. The center manages the Red Española de Supercomputación (RES), and is a hosting member of the Partnership for Advanced Computing in Europe (PRACE) initiative.”
Modern processors contain a large number of cores, so getting maximum performance requires structuring an application to use as many of them as possible. Explicitly developing a program to do this can take a significant amount of effort. It is important to understand the science and algorithms behind the application, and then to use whatever programming techniques are available. “Intel Threading Building Blocks (TBB) can help tremendously in the effort to achieve very high performance for the application.”
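To make the idea of structuring work for many cores concrete, here is a minimal sketch. It is not TBB (a C++ library) and is not taken from the article; it uses only the Python standard library to show the same general pattern: split a problem into independent chunks, hand each chunk to a worker, and combine the partial results. The function names (`split_range`, `sum_of_squares`, `parallel_sum_of_squares`) and the toy workload are hypothetical, chosen purely for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks get one additional element each.
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(bounds):
    """Independent unit of work: it touches no shared state, so any core can run it."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=None, executor_cls=ProcessPoolExecutor):
    """Compute sum(i*i for i in range(n)) by farming chunks out to a worker pool.

    `executor_cls` is an illustrative parameter: a process pool (the default)
    can spread CPU-bound work over several cores, whereas a thread pool in
    CPython generally cannot because of the global interpreter lock.
    """
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        partial_sums = pool.map(sum_of_squares, split_range(n, workers))
        return sum(partial_sums)
```

Calling `parallel_sum_of_squares(1_000_000)` from a script should give the same value as the serial loop. On platforms that start worker processes by spawning a fresh interpreter, the call needs to sit under an `if __name__ == "__main__":` guard. The point of the sketch is the decomposition step: whether a real application speeds up depends on how much of its work can be split into independent pieces like this.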
Libraries that are tuned to the underlying hardware architecture can increase performance tremendously. Higher-level libraries such as the Intel Data Analytics Acceleration Library (Intel DAAL) can assist the developer with highly tuned algorithms for data analysis as well as machine learning. Intel DAAL functions can be called within other, more comprehensive frameworks that deal with the various types of data and storage, increasing the performance and lowering the development time of a wide range of applications.
Today, SGI announced that the United States Department of Defense (DoD) has selected SGI ICE XA for two of its Army Research Laboratory Defense Supercomputing Resource Center systems. The upgrades are part of a technology insertion, known as TI-16, for their High Performance Computing Modernization Program (HPCMP). “We’re excited to partner with SGI for our TI-16 DoD program, and have full confidence in the system’s ability to provide excellent performance,” said Dr. Raju Namburu, director of ARL DSRC. “Choosing the right HPC partners is crucial, as we rely on supercomputing and large-scale analytics and predictive sciences to provide the competitive edge we need to maintain our position as the nation’s premier laboratory for land forces.”
“With up to 72 out-of-order cores, the new Intel Xeon Phi processor delivers over 3 teraFLOPS (floating-point operations per second) of double-precision peak while providing 3.5 times higher performance per watt than the previous generation. As a bootable CPU with integrated architecture, the Intel Xeon Phi processor eliminates PCIe bottlenecks, includes on-package high-bandwidth memory, and available integrated Intel Omni-Path fabric architecture to deliver fast, low-latency performance.”
Given the tremendous compute density of new processors, it is important to understand whether an application can take advantage of multiple cores. “Developers should understand if an application might be ready to run in a highly vectorized or many-core environment before attempting to do the work necessary to obtain the high performance that might be expected.”
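One common back-of-the-envelope way to judge whether an application is ready for a many-core environment is Amdahl's law, which bounds the speedup by the fraction of run time that can actually run in parallel. The article does not name this method; the sketch below is an illustrative assumption, and the function name `amdahl_speedup` and the 95% figure in the note that follows are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of single-core run time that parallelizes (0.0 to 1.0).
    cores: number of cores the parallel part is spread across (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of core count;
    # the parallel part shrinks in proportion to the number of cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

For example, `amdahl_speedup(0.95, 72)` comes out at roughly 15.8, so a code that is 95% parallel would use only a fraction of a 72-core processor's potential, and it could never exceed 20x no matter how many cores are added. An estimate like this, based on profiling data, is one way to decide whether the tuning effort is likely to pay off before starting it.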
In this video, Dave Hart, CISL User Services Manager, presents: Cheyenne – NCAR’s Next-Generation Data-Centric Supercomputing Environment. “Cheyenne is a new 5.34-petaflops, high-performance computer built for NCAR by SGI. The hardware was delivered on Monday, September 12, at the NCAR-Wyoming Supercomputing Center (NWSC) and the system is on schedule to become operational at the beginning of 2017. All of the compute racks were powered up and nodes booted up within a few days of delivery.”
“Our high-performance computing solutions enable deep learning, engineering, and scientific fields to scale out their compute clusters to accelerate their most demanding workloads and achieve fastest time-to-results with maximum performance per watt, per square foot, and per dollar,” said Charles Liang, President and CEO of Supermicro. “With our latest innovations incorporating the new NVIDIA P100 processors in performance- and density-optimized 1U and 4U architectures with NVLink, our customers can accelerate their applications and innovations to address the most complex real-world problems.”
Today Amazon Web Services announced the availability of P2 instances, a new GPU instance type for Amazon Elastic Compute Cloud designed for compute-intensive applications that require massive parallel floating point performance, including artificial intelligence, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, genomics, and rendering. With up to 16 NVIDIA Tesla K80 GPUs, P2 instances are the most powerful GPU instances available in the cloud.