Articles and news on parallel programming and code modernization

Making Computer Vision Real Today – For Any Application

With the demand for intelligent vision solutions increasing everywhere from edge to cloud, enterprises of every type are demanding visually-enabled – and intelligent – applications. Until now, most intelligent computer vision applications have required a wealth of machine learning, deep learning, and data science knowledge to enable simple object recognition, much less facial recognition or collision avoidance. That’s changed with the introduction of the Intel Distribution of OpenVINO toolkit.

Improving HPC Performance with the Roofline Model

“When we are optimizing our objective is to determine which hardware resource the code is exhausting (there must be one, otherwise it would run faster!), and then see how to modify the code to reduce its need for that resource. It is therefore essential to understand the maximum theoretical performance of that aspect of the machine, since if we are already achieving the peak performance we should give up, or choose a different algorithm.”
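
The "maximum theoretical performance" the quote refers to is usually summarized in the Roofline model as the smaller of two limits: the machine's peak compute rate, and its peak memory bandwidth multiplied by the kernel's arithmetic intensity (FLOPs per byte moved). The sketch below is only an illustration of that bound. The function names and the machine figures (1000 GFLOP/s peak, 100 GB/s bandwidth) are hypothetical and do not come from the article.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """Roofline bound: min(compute roof, bandwidth roof * FLOPs per byte)."""
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / peak_bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine, chosen only to make the arithmetic easy to follow.
    PEAK_GFLOPS = 1000.0
    PEAK_BANDWIDTH_GBS = 100.0

    ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    for intensity in (0.25, 20.0):
        bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, intensity)
        limit = "memory-bound" if intensity < ridge else "compute-bound"
        print(f"intensity {intensity} FLOP/byte: at most {bound} GFLOP/s ({limit})")
```

A kernel that sits below the ridge point is exhausting memory bandwidth, so reducing data movement is the lever to pull. A kernel above it is limited by the compute roof, which is the point where, as the quote says, only a different algorithm helps.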

Python Power: Intel SDK Accelerates Python Development and Execution

One goal, accelerating Python execution performance, led to the creation of the Intel Distribution for Python, a set of tools designed to deliver Python application performance right out of the box, usually with no code changes required. This sponsored post from Intel highlights how the Intel SDK can enhance Python development and execution as Python continues to grow in popularity.

Putting Computer Vision to Work with OpenVINO

OpenVINO is a single toolkit, optimized for Intel hardware, that the data scientist and AI software developer can use for quickly developing high-performance applications that employ neural network inference and deep learning to emulate human vision over various platforms. “This toolkit supports heterogeneous execution across CPUs and computer vision accelerators including GPUs, Intel® Movidius™ hardware, and FPGAs.”

Video: Speeding up Programs with OpenACC in GCC

Thomas Schwinge from Mentor gave this talk at FOSDEM’19. “Requiring only few changes to your existing source code, OpenACC allows for easy parallelization and code offloading to accelerators such as GPUs. We will present a short introduction of GCC and OpenACC, implementation status, examples, and performance results.”

Argonne Looks to Singularity for HPC Code Portability

Over at Argonne, Nils Heinonen writes that researchers are using the open source Singularity framework as a kind of Rosetta Stone for running supercomputing code almost anywhere. “Once a containerized workflow is defined, its image can be snapshotted, archived, and preserved for future use. The snapshot itself represents a boon for scientific provenance by detailing the exact conditions under which given data were generated: in theory, by providing the machine, the software stack, and the parameters, one’s work can be completely reproduced.”

Are Platform Configuration Problems Degrading Your Application’s Performance?

The Intel VTune™ Amplifier Platform Profiler on Windows* and Linux* systems shows you critical data about the running platform that help identify common system configuration errors that may be causing performance issues and bottlenecks. Fixing these issues, or modifying the application to work around them, can greatly improve overall performance.

Podcast: Doug Kothe Looks back at the Exascale Computing Project Annual Meeting

In this podcast, Doug Kothe from the Exascale Computing Project describes the 2019 ECP Annual Meeting. “Key topics to be covered at the meeting are discussions of future systems, software stack plans, and interactions with facilities. Several parallel sessions are also planned throughout the meeting.”

Accelerated Python for Data Science

The Intel Distribution for Python takes advantage of the Intel® Advanced Vector Extensions (Intel® AVX) and multiple cores in the latest Intel architectures. By utilizing the highly optimized Intel MKL BLAS and LAPACK routines, key functions run up to 200 times faster on servers and 10 times faster on desktop systems. This means that existing Python applications will perform significantly better merely by switching to the Intel distribution.

Apply Now for 2019 Argonne Training Program on Extreme-Scale Computing

Computational scientists are invited to apply for the upcoming Argonne Training Program on Extreme-Scale Computing (ATPESC) this summer. “This program provides intensive hands-on training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current supercomputers and the HPC systems of the future. As a bridge to that future, this two-week program fills many gaps that exist in the training computational scientists typically receive through formal education or other shorter courses.”