Articles and news on parallel programming and code modernization

Intel Advisor’s TBB Flow Graph Analyzer: Making Complex Layers of Parallelism More Manageable

Deep learning applications can have very complex graphs with thousands of nodes and edges. To make it easier to visualize, analyze, design, and tune such complex parallel applications employing Intel TBB flow graphs, Intel provides Intel Advisor Flow Graph Analyzer (Intel FGA). It gives developers a comprehensive set of tools to examine, debug, and analyze Intel TBB flow graphs.

Unlocking the Power of Parallel Coding to Access Better Performance in Multi-Core Environments

A number of different frameworks and standards can be employed for parallel coding. The most suitable choice depends on the purpose of the application, its overall requirements, and the target execution environment. Selecting the right framework is imperative to obtaining the best possible performance increase. The choice of framework is also based on the available memory, overheads, controls, and support.

Intel Parallel Studio XE AVX-512: Tuning for Success with the Latest SIMD Extensions and Intel® Advanced Vector Extensions 512

With the introduction of Intel Parallel Studio XE, instructions for utilizing the vector extensions have been enhanced and new instructions have been added. Applications in diverse domains such as data compression and decompression, scientific simulations and cryptography can take advantage of these new and enhanced instructions. “Although microkernels can demonstrate the effectiveness of the new SIMD instructions, understanding why the new instructions benefit the code can then lead to even greater performance.”

OpenMP ARB Releases New Technical Report and Asks for Feedback

In this video from SC17 in Denver, Michael Klemm from the OpenMP ARB describes how the OpenMP programming community is moving forward to new levels of scalable performance. The OpenMP Architecture Review Board (ARB) is seeking feedback on the newly released Technical Report 6.

A New Way to Visualize Performance Optimization Tradeoffs

A valuable feature of Intel Advisor is its Roofline Analysis Chart, which provides an intuitive and powerful visualization of actual performance measured against hardware-imposed performance ceilings. Intel Advisor’s vector parallelism optimization analysis and memory-versus-compute roofline analysis, working together, offer a powerful tool for visualizing an application’s complete current and potential performance profile on a given platform.
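The idea of measuring performance against hardware-imposed ceilings can be illustrated with the basic roofline formula: attainable performance is the smaller of the compute peak and the memory bandwidth multiplied by arithmetic intensity. The following is only a minimal sketch of that general model, not of how Intel Advisor computes its chart; the function name and the example figures are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Basic roofline model (illustrative only; not Intel Advisor's implementation).
// peak_gflops:     hypothetical compute ceiling of the platform, in GFLOP/s
// bandwidth_gbs:   hypothetical memory bandwidth ceiling, in GB/s
// flops_per_byte:  arithmetic intensity of the kernel, in FLOP per byte moved
// Returns the attainable performance in GFLOP/s.
double roofline_gflops(double peak_gflops, double bandwidth_gbs, double flops_per_byte) {
    assert(peak_gflops >= 0.0 && bandwidth_gbs >= 0.0 && flops_per_byte >= 0.0);
    // Memory-bound kernels are limited by bandwidth * intensity,
    // compute-bound kernels by the compute peak.
    return std::min(peak_gflops, bandwidth_gbs * flops_per_byte);
}
```

With made-up ceilings of 1000 GFLOP/s and 100 GB/s, a kernel at 0.5 FLOP/byte is capped by memory at 50 GFLOP/s, while a kernel at 50 FLOP/byte is capped by the compute peak.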

Ray Tracing on Intel Xeon Phi with Embree

In computer graphics, ray tracing is a rendering technique for generating an image by tracing the path of light as pixels in an image plane and simulating the effects of its encounters with virtual objects. “Experienced computer graphics developers that understand how ray tracing works, in conjunction with a deep knowledge of the Intel Xeon Phi processor hardware have created a set of ray tracing kernels that take advantage of the underlying instruction sets and the available number of computing cores.”

New Report Offers Insight on OpenMP

“Technical Report 6 demonstrates the importance of user feedback to the OpenMP specification,” says Bronis R. de Supinski, the Chair of the OpenMP Language Committee. “Users have indicated that several features are vitally important to them, such as multilevel memory support, deep copy, easy access to unified shared memory and a descriptive loop construct. As a result of that feedback, OpenMP 5.0 will include all of these major additions.”

Building Fast Data Compression Code with Intel Integrated Performance Primitives (Intel IPP) 2018

Intel® Integrated Performance Primitives (Intel IPP) is a highly optimized, production-ready library for lossless data compression/decompression targeting image, signal, and data processing, and cryptography applications. Intel IPP includes more than 2,500 image processing, 1,300 signal processing, 500 computer vision, and 300 cryptography optimized functions for creating digital media, enterprise data, embedded, communications, and scientific, technical, and security applications.

Visualizing with Software Rendering with Intel Xeon Phi

There are two main categories or uses where rendering on the Intel Xeon Phi processors should be investigated. The first is what could be called “Professional rendering” and the second, “Scientific visualization.” “Software based visualization, whether for photo-realistic rendering or scientific visualization can be accelerated with a software only approach. This allows for new algorithms to be implemented faster than waiting for the next generation of hardware systems to appear. As the number of computing elements increases, performance can increase as well.”

Intel Compilers 18.0 Tune for AVX-512 ISA Extensions

Intel Compilers 18.0 and Intel Parallel Studio XE 2018 tuning software fully support the AVX-512 instructions. By widening and deepening the vector registers, the new instructions and added enhancements let the compiler squeeze more vector parallelism out of applications than before. Applications compiled with the -xCORE-AVX512 option produce executables that use these new high-performance instructions.
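As an illustration of the kind of code a vectorizing compiler can map onto wide SIMD registers, here is a minimal sketch of a simple loop with independent iterations. The function name is hypothetical, and whether AVX-512 instructions are actually emitted depends on the compiler, the options used (such as the xCORE-AVX512 option named above), and the target hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes y[i] = a * x[i] + y[i] for every element.
// Each iteration is independent of the others, which is the property
// that lets a compiler process several elements per SIMD instruction.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The loop is written with plain pointers and a single trip count so that the compiler has little to prove before vectorizing it; a compiler's optimization report is the usual way to confirm what it actually did.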