

Articles and news on parallel programming and code modernization

Achieving Parallelism in Intel Distribution for Python with Numba

The rapid growth in popularity of Python as a programming language for mathematics, science, and engineering applications has been remarkable. Not only is it easy to learn, but there is a vast trove of packaged open source libraries targeting just about every computational domain imaginable. This sponsored post from Intel highlights how today’s enterprises can achieve high levels of parallelism in large-scale Python applications using the Intel Distribution for Python with Numba.
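
As a rough sketch of the approach the post describes (the function and data below are illustrative placeholders, not code from the article), Numba can parallelize a plain Python loop with @njit(parallel=True) and prange:

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def scaled_sum(x, y, alpha):
    # prange tells Numba to split the iterations across threads
    out = np.empty_like(x)
    for i in prange(x.shape[0]):
        out[i] = alpha * x[i] + y[i]
    return out

x = np.random.rand(10_000_000)
y = np.random.rand(10_000_000)
result = scaled_sum(x, y, 2.0)  # first call JIT-compiles; later calls run in parallel
```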

The Challenges of Updating Scientific Codes for New HPC Architectures

In this video from PASC19 in Zurich, Benedikt Riedel from the University of Wisconsin describes the challenges researchers face when updating their scientific codes for new HPC architectures. After that, he describes his work on the IceCube Neutrino Observatory.

Video: Data-Centric Parallel Programming

In this slidecast, Torsten Hoefler from ETH Zurich presents: Data-Centric Parallel Programming. “To maintain performance portability in the future, it is imperative to decouple architecture-specific programming paradigms from the underlying scientific computations. We present the Stateful DataFlow multiGraph (SDFG), a data-centric intermediate representation that enables separating code definition from its optimization.”
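
The SDFG approach is available as the open source DaCe framework. As a hedged sketch (assuming the dace Python package and its @dace.program decorator; this is not code from the talk), a kernel is written once in plain Python and the optimization happens on its dataflow graph:

```python
import numpy as np
import dace

N = dace.symbol('N')  # symbolic size, bound when the program is called

@dace.program
def axpy(a: dace.float64, x: dace.float64[N], y: dace.float64[N]):
    # The Python definition is lowered to a Stateful DataFlow multiGraph (SDFG);
    # transformations such as tiling or GPU mapping are applied to that graph,
    # not to this source code.
    y[:] = a * x + y

x = np.random.rand(1024)
y = np.random.rand(1024)
axpy(2.0, x, y)  # builds the SDFG, generates code, and runs it
```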

Intel Optimized Libraries Accelerate Deep Learning Applications on Intel Platforms

Whatever the platform, getting the best possible performance out of an application always presents big challenges. This is especially true when developing AI and machine learning applications on CPUs. This sponsored post from Intel explores how to effectively train and execute machine learning and deep learning projects on CPUs.
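
For CPU training with the Intel-optimized frameworks, much of the tuning comes down to threading and affinity. A minimal sketch of the knobs Intel's guides typically discuss (the values below are workload-dependent placeholders, not recommendations from the article):

```python
import os

# Environment variables read by the OpenMP runtime behind MKL / MKL-DNN
os.environ["OMP_NUM_THREADS"] = "16"   # threads available to the math kernels
os.environ["KMP_BLOCKTIME"] = "1"      # how long worker threads spin-wait after finishing
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads to cores

import tensorflow as tf

# Map the core count onto TensorFlow's two thread pools
tf.config.threading.set_intra_op_parallelism_threads(16)  # parallelism inside one op
tf.config.threading.set_inter_op_parallelism_threads(2)   # independent ops in flight
```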

Video: Portable Programming Models Highlighted at PASC19

In this video from PASC19 in Zurich, Technical Papers co-chair Sunita Chandrasekaran provides some highlights from the conference. After that, Sunita previews the upcoming Workshop on Performance Portable Programming Models for Accelerators (P3MA) at ISC 2019. “This workshop will provide a forum to bring together researchers and developers to discuss the community’s proposals and solutions for performance portability.”

Appentra Releases Parallelware Trainer 1.2

Appentra is pleased to announce the release of Parallelware Trainer 1.2, further improving its accessible HPC and parallel programming training using OpenMP and OpenACC. “Appentra has a clear goal: to make parallel programming easier, enabling everyone to make the best use of parallel computing hardware, from the multi-cores in a laptop to the fastest supercomputers. Parallelware Trainer 1.2 provides an enhanced interactive learning environment, including provision for a knowledge base designed around the code being developed and several parallelization paradigms, including multithreading, tasking and offloading to GPUs.”

Are Memory Bottlenecks Limiting Your Application’s Performance?

Often, it’s not enough to parallelize and vectorize an application to get the best performance. You also need to take a deep dive into how the application is accessing memory to find and eliminate bottlenecks in the code that could ultimately be limiting performance. Intel Advisor, a component of both Intel Parallel Studio XE and Intel System Studio, can help you identify and diagnose memory performance issues, and suggest strategies to improve the efficiency of your code.
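
Intel Advisor's Memory Access Patterns analysis flags exactly the kind of strided traversal sketched below. This NumPy illustration (not from the article) computes the same reduction two ways; only the memory access order differs:

```python
import time
import numpy as np

a = np.random.rand(4096, 4096)  # C-ordered: each row is contiguous in memory

def rowwise(m):
    # unit-stride access: whole cache lines are consumed before moving on
    return sum(row.sum() for row in m)

def colwise(m):
    # strided access: consecutive elements of a "column" are 4096 * 8 bytes apart
    return sum(col.sum() for col in m.T)

for fn in (rowwise, colwise):
    t0 = time.perf_counter()
    fn(a)
    print(fn.__name__, round(time.perf_counter() - t0, 3), "s")
```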

Software-Defined Visualization with Intel Rendering Framework – No Special Hardware Needed

This sponsored post from Intel explores how the Intel Rendering Framework, which brings together a number of optimized, open source rendering libraries, can deliver better performance at a higher degree of fidelity — without having to invest in extra hardware. By letting the CPU do the work, visualization applications can run anywhere without specialized hardware, and users are seeing better performance than they could get from dedicated graphics hardware with its limited memory.

Making Python Fly: Accelerate Performance Without Recoding

Developers are increasingly besieged by the big data deluge. Intel Distribution for Python uses tried-and-true libraries like the Intel Math Kernel Library (Intel MKL) and the Intel Data Analytics Acceleration Library to make Python code scream right out of the box – no recoding required. Intel highlights some of the benefits dev teams can expect in this sponsored post.
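
The speedups come from standard NumPy and SciPy calls being dispatched to MKL under the hood, so existing scripts benefit unchanged. A minimal illustration (the matrix size is an arbitrary placeholder, and the timing is not a figure from the post):

```python
import time
import numpy as np

# Under the Intel Distribution for Python, this matrix multiply is routed to
# multithreaded Intel MKL BLAS; the script itself needs no changes.
a = np.random.rand(4000, 4000)
b = np.random.rand(4000, 4000)

t0 = time.perf_counter()
c = a @ b
print("matmul took", round(time.perf_counter() - t0, 3), "s")

# Reports which BLAS/LAPACK implementation this NumPy build is linked against
np.show_config()
```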

CPU, GPU, FPGA, or DSP: Heterogeneous Computing Multiplies the Processing Power

Whether your code will run on industry-standard PCs or is embedded in devices for specific uses, chances are there’s more than one processor you can utilize. Graphics processors, DSPs and other hardware accelerators often sit idle while CPUs crank away at code that would be better served elsewhere. This sponsored post from Intel highlights the potential of Intel SDK for OpenCL Applications, which can ramp up processing power.
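
The article centers on Intel SDK for OpenCL Applications; as a rough sketch of the underlying model (shown here with the pyopencl bindings rather than the SDK's C API, purely for brevity), the same kernel source can be dispatched to whichever device the platform exposes, whether CPU, GPU, or accelerator:

```python
import numpy as np
import pyopencl as cl

a = np.random.rand(1_000_000).astype(np.float32)
b = np.random.rand(1_000_000).astype(np.float32)

ctx = cl.create_some_context()   # picks an available device: CPU, GPU, ...
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# The kernel is plain OpenCL C; the runtime compiles it for the chosen device.
prg = cl.Program(ctx, """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
""").build()

prg.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
```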