Today the Google Cloud Platform announced that it is the first cloud provider to offer the next generation Intel Xeon processor, codenamed Skylake. “Skylake includes Intel Advanced Vector Extensions (AVX-512), which make it ideal for scientific modeling, genomic research, 3D rendering, data analytics and engineering simulations. When compared to previous generations, Skylake’s AVX-512 doubles the floating-point performance for the heaviest calculations. In our own internal tests, it improved application performance by up to 30%.”
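The "doubles" figure in the quote can be traced to register width. The sketch below works through that arithmetic, assuming the comparison is against the 256-bit AVX2 registers of the previous generation; the helper name `simd_lanes` is made up for illustration. It describes peak lanes per register only. Real application gains also depend on clock speed, memory bandwidth, and how much of the code vectorizes, which is consistent with the more modest "up to 30%" measured in Google's tests.

```python
# Hypothetical helper: how many floating-point values fit in one SIMD register.
def simd_lanes(register_bits: int, element_bits: int) -> int:
    return register_bits // element_bits

AVX2_BITS = 256    # register width assumed for the previous generation (AVX2)
AVX512_BITS = 512  # register width of AVX-512

for name, element_bits in (("float32", 32), ("float64", 64)):
    old = simd_lanes(AVX2_BITS, element_bits)
    new = simd_lanes(AVX512_BITS, element_bits)
    print(f"{name}: {old} -> {new} lanes per register ({new // old}x)")
```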
Intel DAAL is a high-performance library specifically optimized for big data analysis on the latest Intel platforms, including Intel Xeon® and Intel Xeon Phi™. It provides the algorithmic building blocks for all stages of data analysis in offline, batch, streaming, and distributed processing environments. It was designed for efficient use across the popular data platforms and APIs in use today, including MPI, Hadoop, Spark, R, MATLAB, Python, C++, and Java.
Today Allinea Software launched the first update to its well-established toolset for debugging, profiling and optimizing high performance code since being acquired by ARM in December 2016. “The V7.0 release provides new integrations for the Allinea Forge debugger and profiler and Allinea Performance Reports and will mean more efficient code development and optimization for users, especially those wishing to take software performance to new levels across Xeon Phi, CUDA and IBM Power platforms,” said Mark O’Connor, ARM Director, Product Management HPC tools.
“By implementing popular Python packages such as NumPy, SciPy, scikit-learn, to call the Intel Math Kernel Library (Intel MKL) and the Intel Data Analytics Acceleration Library (Intel DAAL), Python applications are automatically optimized to take advantage of the latest architectures. These libraries have also been optimized for multithreading through calls to the Intel Threading Building Blocks (Intel TBB) library. This means that existing Python applications will perform significantly better merely by switching to the Intel distribution.”
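A minimal sketch of why no source changes are needed: the Python code below is the same under any distribution, and NumPy passes the matrix multiplication to whichever BLAS library it was built against (Intel MKL in the Intel distribution, if the quote's description holds). The sketch assumes NumPy is installed; the array sizes and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
a = rng.standard_normal((200, 300))
b = rng.standard_normal((300, 100))

# The @ operator delegates to the BLAS library NumPy was built against,
# if it was built with one; the Python source is identical either way.
product = a @ b

# Cross-check one entry against a plain Python dot product.
i, j = 3, 7
expected = sum(a[i, k] * b[k, j] for k in range(a.shape[1]))
print(product.shape, abs(product[i, j] - expected) < 1e-9)
```

Calling `np.show_config()` reports which BLAS and LAPACK libraries a given NumPy build is linked to.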
Applications that can take advantage of the new vectorization capabilities of the Intel Xeon Phi processor will show tremendous performance gains. “When considering vectorization, there are different tools that can assist the developer in determining where to look further. The first is to look at the optimization reports that are generated by the Intel compiler and then to also use the Vector Analyzer that can give specific advice on what to do to get more vectorization from the code.”
In-Memory Computing can accelerate traditional applications by using a memory-first design. Applicable to a wide range of domains, In-Memory Computing and In-Memory Data Grids take advantage of the latest trends in computer systems technology. “In-memory computing is designed to address some of the most critical and real-time task requirements today. These include real-time fraud detection, biometrics and border security, and financial risk analytics. All of these use cases require very low latency access to data from very large amounts of data, which results in faster and more accurate decisions.”
“CUDA C++ is just one of the ways you can create massively parallel applications with CUDA. It lets you use the powerful C++ programming language to develop high performance algorithms accelerated by thousands of parallel threads running on GPUs. Many developers have accelerated their computation- and bandwidth-hungry applications this way, including the libraries and frameworks that underpin the ongoing revolution in artificial intelligence known as Deep Learning.”
“Many of the libraries developed in the 70s and 80s for core linear algebra and scientific math computation, such as BLAS, LAPACK, FFT, are still in use today with C, C++, Fortran, and even Python programs. With MKL, Intel has engineered a ready-to-use, royalty-free library that implements these numerical algorithms optimized specifically to take advantage of the latest features of Intel chip architectures. Even the best compiler can’t compete with the level of performance possible from a hand-optimized library. Any application that already relies on the BLAS or LAPACK functionality will achieve better performance on Intel and compatible architectures just by downloading and re-linking with Intel MKL.”
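Re-linking works because BLAS fixes the interface, not the implementation. As a rough illustration, the sketch below spells out the operation that the BLAS GEMM routine defines, C ← αAB + βC, in plain Python. It is a readable reference for the semantics only and says nothing about how Intel MKL implements it; the function name `gemm_reference` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]

def gemm_reference(alpha: float, a: Matrix, b: Matrix, beta: float, c: Matrix) -> Matrix:
    """Return alpha * (a @ b) + beta * c using plain nested loops."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a) or len(c) != m or any(len(row) != n for row in c):
        raise ValueError("incompatible matrix shapes")
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm_reference(2.0, a, b, 0.5, c)
print(result)
```

Any library that honors this contract can be swapped in at link time, which is why an optimized implementation can speed up an application without changes to its source.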
Today Appentra announced it has joined the OpenPOWER Foundation, an open development community based on the POWER microprocessor architecture. Founded in 2012, Appentra is a technology company providing software tools for guided parallelization in high-performance computing and HPC-like technologies. “The development model of the OpenPOWER Foundation is one that elicits collaboration and represents a new way in exploiting and innovating around processor technology,” says Calista Redmond, Director of OpenPOWER Global Alliances at IBM. “With the Power architecture designed for Big Data and Cloud, new OpenPOWER Foundation members like Appentra will be able to add their own innovations on top of the technology to create new applications that capitalize on emerging workloads.”
In this video, Rich Brueckner from insideHPC moderates a panel discussion on Code Modernization. “SC15 luminary panelists reflect on collaboration with Intel and how building on hardware and software standards facilitates performance on parallel platforms with greater ease and productivity. By sharing their experiences modernizing code we hope to shed light on what you might see from modernizing your own code.”