Articles and news on parallel programming and code modernization

Six Steps Towards Better Performance on Intel Xeon Phi

“As with all new technology, developers will have to create processes in order to modernize applications to take advantage of any new feature. Rather than randomly trying to improve the performance of an application, it is wise to be very familiar with the application and use available tools to understand bottlenecks and look for areas of improvement.”

Intel Releases Optimized Python for HPC

“By implementing popular Python packages such as NumPy, SciPy, and scikit-learn to call the Intel Math Kernel Library (Intel MKL) and the Intel Data Analytics Acceleration Library (Intel DAAL), Python applications are automatically optimized to take advantage of the latest architectures. These libraries have also been optimized for multithreading through calls to the Intel Threading Building Blocks (Intel TBB) library. This means that existing Python applications will perform significantly better merely by switching to the Intel distribution.”
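The idea that application code stays the same while the library underneath does the heavy lifting can be sketched in a few lines. This is only an illustration, not code from the article: the `matmul` and `backend_name` helpers are hypothetical, and whether NumPy is actually linked against Intel MKL depends on how it was installed. The call site is identical either way; only the backend differs.

```python
# Minimal sketch: the same call site runs on an optimized backend when one is
# available. `matmul` and `backend_name` are hypothetical helper names.
try:
    import numpy as np  # may or may not be linked against Intel MKL
except ImportError:
    np = None


def backend_name():
    """Report which implementation matmul() will use."""
    return "numpy" if np is not None else "pure-python"


def matmul(a, b):
    """Multiply two matrices given as lists of rows; returns lists of floats."""
    if np is not None:
        # NumPy hands the product to whatever BLAS library it was built with.
        return (np.asarray(a, dtype=float) @ np.asarray(b, dtype=float)).tolist()
    # Fallback: plain Python loops, same result but much slower on large inputs.
    columns = list(zip(*b))
    return [
        [float(sum(x * y for x, y in zip(row, col))) for col in columns]
        for row in a
    ]


if __name__ == "__main__":
    print(backend_name())
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Code that calls `matmul` does not need to change when the underlying distribution changes, which is the property the excerpt describes.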

PRACE Posts Best Practice Guide for Intel Xeon Phi

The European PRACE initiative has published a new Best Practice Guide for Intel Xeon Phi, Knights Landing Edition. “This best practice guide provides information about Intel’s MIC architecture and programming models for the Intel Xeon Phi co-processor in order to enable programmers to achieve good performance of their applications. The guide covers a wide range of topics from the description of the hardware of the Intel Xeon Phi co-processor through information about the basic programming models as well as information about porting programs up to tools and strategies how to analyze and improve the performance of applications.”

Vectorization Leads to Performance Gains

Applications that can take advantage of the new vectorization capabilities of the Intel Xeon Phi processor will show tremendous performance gains. “When considering vectorization, there are different tools that can assist the developer in determining where to look further. The first is to look at the optimization reports that are generated by the Intel compiler and then to also use the Vector Analyzer that can give specific advice on what to do to get more vectorization from the code.”

Intel Xeon Phi Processor Programming in a Nutshell

In this special guest feature, James Reinders looks at Intel Xeon Phi processors from a programmer’s perspective. “How does a programmer think of Intel Xeon Phi processors? In this brief article, I will convey how I, as a programmer, think of them. In subsequent articles, I will dive a bit more into details of various programming modes, and techniques employed for some key applications. In this article, I will endeavor to not stray into deep details – but rather offer an approachable perspective on how to think about programming for Intel Xeon Phi processors.”

Achieving High-Performance Math Processing with Intel MKL 2017

“Many of the libraries developed in the 70s and 80s for core linear algebra and scientific math computation, such as BLAS, LAPACK, FFT, are still in use today with C, C++, Fortran, and even Python programs. With MKL, Intel has engineered a ready-to-use, royalty-free library that implements these numerical algorithms optimized specifically to take advantage of the latest features of Intel chip architectures. Even the best compiler can’t compete with the level of performance possible from a hand-optimized library. Any application that already relies on the BLAS or LAPACK functionality will achieve better performance on Intel and compatible architectures just by downloading and re-linking with Intel MKL.”

Appentra Joins OpenPOWER Foundation for Auto-Parallelization

Today Appentra announced it has joined the OpenPOWER Foundation, an open development community based on the POWER microprocessor architecture. Founded in 2012, Appentra is a technology company providing software tools for guided parallelization in high-performance computing and HPC-like technologies. “The development model of the OpenPOWER Foundation is one that elicits collaboration and represents a new way in exploiting and innovating around processor technology,” says Calista Redmond, Director of OpenPOWER Global Alliances at IBM. “With the Power architecture designed for Big Data and Cloud, new OpenPOWER Foundation members like Appentra will be able to add their own innovations on top of the technology to create new applications that capitalize on emerging workloads.”

Video: Modern Code – Making the Impossible Possible

In this video, Rich Brueckner from insideHPC moderates a panel discussion on Code Modernization. “SC15 luminary panelists reflect on collaboration with Intel and how building on hardware and software standards facilitates performance on parallel platforms with greater ease and productivity. By sharing their experiences modernizing code we hope to shed light on what you might see from modernizing your own code.”

Managing Lots of Tasks for Intel Xeon Phi

“OpenMP, Fortran 2008 and TBB are standards that can help to create parallel areas of an application. MKL could also be considered to be part of this family, because it uses OpenMP within the library. OpenMP is well known, has been used for quite some time, and continues to be enhanced. Some estimates are as high as 75% of cycles used are for Fortran applications. Thus, in order to modernize some of the most significant number crunchers today, Fortran 2008 should be investigated. TBB is for C++ applications only, and does not require compiler modifications. An additional benefit to using OpenMP and Fortran 2008 is that these are standards, which allows code to be more portable.”
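The pattern behind these tools, splitting work into independent tasks and handing them to a pool of workers, can be sketched with Python's standard library. This is a loose analogy and not OpenMP, Fortran 2008, or TBB code; the function names are hypothetical, and because of CPython's global interpreter lock a thread pool illustrates the task structure rather than a real speedup for CPU-bound work.

```python
# Minimal sketch of task-based decomposition using only the standard library.
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, n_tasks):
    """Split range(n) into n_tasks contiguous (start, stop) pieces."""
    step, extra = divmod(n, n_tasks)
    bounds = []
    start = 0
    for i in range(n_tasks):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def sum_squares(bounds):
    """One task: sum of i*i over a half-open interval."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_squares(n, n_tasks=4):
    """Run the chunks as independent tasks and combine the partial sums."""
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        return sum(pool.map(sum_squares, chunk_bounds(n, n_tasks)))


if __name__ == "__main__":
    print(parallel_sum_squares(1000))
```

Each chunk is independent of the others, so the partial results can be computed in any order and combined at the end, which is the same shape as a parallel reduction in the frameworks named above.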

Scaling Software for In-Memory Computing

“The move away from the traditional single processor/memory design has fostered new programming paradigms that address multiple processors (cores). Existing single core applications need to be modified to use extra processors (and accelerators). Unfortunately there is no single portable and efficient programming solution that addresses both scale-up and scale-out systems.”