With the introduction of the Intel Scalable System Framework, the Intel Xeon Phi processor can speed up Finite Element Analysis (FEA) significantly. Using highly tuned math libraries such as the Intel Math Kernel Library (Intel MKL), FEA applications can execute math routines in parallel on the Intel Xeon Phi processor.
Nikos Trikoupis from the City University of New York gave this talk at the HPC User Forum in Austin. “We focus on measuring the aggregate throughput delivered by 12 Intel SSD DC P3700 for NVMe cards installed on the SGI UV 300 scale-up system in the CUNY High Performance Computing Center. We establish a performance baseline for a single SSD. The 12 SSDs are assembled into a single RAID-0 volume using Linux Software RAID and the XVM Volume Manager. The aggregate read and write throughput is measured against different configurations that include the XFS and the GPFS file systems.”
Today Nvidia announced the general availability of the CUDA 8 toolkit for GPU developers. “A crucial goal for CUDA 8 is to provide support for the powerful new Pascal architecture, the first incarnation of which was launched at GTC 2016: Tesla P100,” said Nvidia’s Mark Harris in a blog post. “One of NVIDIA’s goals is to support CUDA across the entire NVIDIA platform, so CUDA 8 supports all new Pascal GPUs, including Tesla P100, P40, and P4, as well as NVIDIA Titan X, and Pascal-based GeForce, Quadro, and DrivePX GPUs.”
“Our customers are looking for a highly integrated server adapter that solves their pressing need for network performance, efficiency and security,” said Gilad Shainer, vice president of marketing, Mellanox Technologies. “The Innova adapter provides IPsec offload to deliver complete end-to-end security for traffic moving within the data center. Combined with the intelligent network offload and acceleration engines, Innova IPsec is the ideal solution for cloud, telecommunication, Web 2.0, high-performance compute and storage infrastructures.”
“We are at an inflection point in the big data era,” said Bob Picciano, senior vice president, IBM Analytics. “We know that users spend up to 80 percent of their time on data preparation, no matter the task, even when they are applying the most sophisticated AI. Project DataWorks helps transform this challenge by bringing together all data sources on one common platform, enabling users to get the data ready for insight and action, faster than ever before.”
Today D-Wave Systems announced details of its most advanced quantum computing system, featuring a new 2000-qubit processor. The announcement is being made at the company’s inaugural users group conference in Santa Fe, New Mexico. The new processor doubles the number of qubits over the previous-generation D-Wave 2X system, enabling larger problems to be solved and extending D-Wave’s significant lead over all quantum computing competitors. The new system also introduces control features that allow users to tune the quantum computational process to solve problems faster and find more diverse solutions when they exist. In early tests, these new features have yielded performance improvements of up to 1000 times over the D-Wave 2X system.
“Starting in 2015, Oak Ridge National Laboratory partnered with the University of Tennessee to offer a minor-degree program in data center technology and management, one of the first offerings of its kind in the country. ORNL staff members developed the senior-level course in collaboration with UT College of Engineering professor Mark Dean after an ORNL strategic partner identified a need for employees who could bridge both the facilities and operational aspects of running a data center. In addition to developing the course curriculum, ORNL staff members are also serving as guest lecturers.”
“Deep learning developers and researchers want to train neural networks as fast as possible. Right now we are limited by computing performance,” said Dr. Diamos. “The first step in improving performance is to measure it, so we created DeepBench and are opening it up to the deep learning community. We believe that tracking performance on different hardware platforms will help processor designers better optimize their hardware for deep learning applications.”