

Flow Graph Analyzer – Speed Up Your Applications

The Intel® Advisor Flow Graph Analyzer (FGA) helps developers build and implement applications, such as those needed for autonomous driving, on top of very high-performing software and hardware. Underneath Intel FGA is Intel Threading Building Blocks (Intel TBB), which takes advantage of the multiple cores available on all types of systems today.

Vectorization Now More Important Than Ever

Vectorization, the hardware optimization technique synonymous with early vector supercomputers like the Cray-1 (1975), has reappeared with even greater importance than before. Today, 40+ years later, the AVX-512 vector instructions in the most recent many-core Intel Xeon and Intel® Xeon Phi™ processors can increase application performance by up to 16x for single-precision codes.
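The 16x figure for single precision follows from register width: a 512-bit register holds sixteen 32-bit floats. The sketch below works through that arithmetic and shows the kind of loop a vectorizing compiler can map onto those lanes. The names `lanes_per_register`, `kAvx512Bits`, and `saxpy` are illustrative and not taken from the article, and the code assumes a platform where `float` is 4 bytes and `double` is 8 bytes.

```cpp
#include <cstddef>

// Number of elements of type T that fit in one vector register
// of the given width (in bits). Illustrative helper, not an Intel API.
template <typename T>
constexpr std::size_t lanes_per_register(std::size_t register_bits) {
    return register_bits / (8 * sizeof(T));
}

// AVX-512 registers are 512 bits wide.
constexpr std::size_t kAvx512Bits = 512;

// Assumes 4-byte float and 8-byte double.
static_assert(lanes_per_register<float>(kAvx512Bits) == 16,
              "16 single-precision lanes per 512-bit register");
static_assert(lanes_per_register<double>(kAvx512Bits) == 8,
              "8 double-precision lanes per 512-bit register");

// y[i] = a * x[i] + y[i]. Each iteration is independent of the others,
// which makes this loop a candidate for auto-vectorization.
// Assumes x and y do not overlap.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The lane count is an upper bound on the speedup from vectorization alone; memory bandwidth and loop structure usually keep real codes below it.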

Artificial Intelligence: The Next Industrial Revolution

It has been said that artificial intelligence will create the next industrial revolution: the fourth that modern society has experienced since the dawn of mechanical production and steam power, dated to 1784. In this guest article, Mellanox Technologies’ Scot Schultz explores how artificial intelligence is shaping up to launch the next big wave of innovation.

Intel Select Solutions: Enabling HPC for a Broader Set of Users

Building upon the success of Intel Cluster Ready and Intel Scalable System Framework (Intel SSF), Intel is facilitating the deployment of computing resources to non-traditional HPC users with the recently announced Intel Select Solutions for HPC. The following guest article from Intel explores how Intel Select Solutions is working to enable and simplify HPC for a broader set of users. 

Intel MKL Speeds Up Small Matrix-Matrix Multiplication for Automatic Driving

Certain applications, such as automated driving, require low latency small matrix-matrix multiplication in real time. They use specialized libraries that can be customized for small matrix operations. Recompiling and linking those libraries with the highly optimized DGEMM routine in the Intel® Math Kernel Library 2018 can give speedups many times over native libraries.
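For readers unfamiliar with DGEMM, the operation it performs is C = alpha * A * B + beta * C on double-precision matrices. The sketch below is a plain reference version for small, fixed-size, row-major matrices. It is not the Intel MKL implementation and does not call MKL. The name `small_gemm` and its signature are hypothetical, chosen only to show what the optimized routine computes.

```cpp
#include <array>
#include <cstddef>

// Reference small matrix-matrix multiply: C = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all stored row-major.
// Hypothetical helper for illustration; sizes are compile-time constants
// so the compiler can fully unroll the loops for small matrices.
template <std::size_t M, std::size_t N, std::size_t K>
void small_gemm(double alpha,
                const std::array<double, M * K>& a,
                const std::array<double, K * N>& b,
                double beta,
                std::array<double, M * N>& c) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            // Dot product of row i of A with column j of B.
            for (std::size_t p = 0; p < K; ++p) {
                sum += a[i * K + p] * b[p * N + j];
            }
            c[i * N + j] = alpha * sum + beta * c[i * N + j];
        }
    }
}
```

Because the dimensions appear only inside array sizes, they cannot be deduced and must be given explicitly, for example `small_gemm<2, 2, 2>(1.0, a, b, 0.0, c)`. A tuned library replaces this triple loop with vectorized kernels specialized for the matrix shape.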

Enabling FPGAs

Field Programmable Gate Arrays (FPGAs) are an exciting technology that allows hardware designers to create new digital circuits through a programming environment. Compared to hardware that is designed once or software which must adhere to the hardware architecture, an FPGA allows developers to draw a circuit to solve a specific problem.

Using the Intel C++ Compiler’s Optimization Features to Improve MySQL Performance

IT operations and maintenance developers have found that simply compiling the MySQL source code with the Intel C++ Compiler and turning on its Interprocedural Optimization feature can improve database performance by 5 to 35% compared with other compilers. “While there may be many factors affecting MySQL performance, such as hardware and software configuration, having a thoroughly optimized MySQL package is a good place to start.”

Use Intel® Inspector to Diagnose Hidden Memory and Threading Errors in Parallel Code

Intel Inspector is an integrated debugger that can easily diagnose latent and intermittent errors and guide users to locate the root cause. It does this by instrumenting the binaries, including dynamically generated or linked libraries, even when the source code is not available. This includes C, C++, and legacy Fortran codes.

Intel Parallel Studio 2018: Modernize Your Code

“Intel Parallel Studio 2018 has been designed to recognize the latest CPU architectures including the Intel Xeon Scalable processor family and the Intel Xeon Phi processors in order to get maximum performance from their differing architectures, yet remain binary compatible. With the recent introduction of the Intel AVX-512 vectorization instructions, application developers can more easily take advantage of these new instructions when developing and compiling with the Intel Parallel Studio 2018.”

Intel Advisor’s TBB Flow Graph Analyzer: Making Complex Layers of Parallelism More Manageable

Some deep learning applications tend to have very complex graphs with thousands of nodes and edges. To make it easier to visualize, analyze, design, and tune such complex parallel applications employing Intel TBB flow graphs, Intel provides Intel Advisor Flow Graph Analyzer (Intel FGA). It gives developers a comprehensive set of tools to examine, debug, and analyze Intel TBB flow graphs.