Flow Graph Analyzer – Speed Up Your Applications

Using the Intel® Advisor Flow Graph Analyzer (FGA), developers can build and deploy applications, such as those needed for autonomous driving, on top of very high-performing software and hardware. Underneath Intel FGA are the Intel Threading Building Blocks (Intel TBB), which take advantage of the multiple cores available on all types of systems today.

Vectorization Now More Important Than Ever

Vectorization, the hardware optimization technique synonymous with early vector supercomputers like the Cray-1 (1975), has reappeared with even greater importance than before. Today, 40+ years later, the AVX-512 vector instructions in the most recent many-core Intel Xeon and Intel® Xeon Phi™ processors can increase application performance by up to 16x for single-precision codes.
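The 16x figure corresponds to the number of 32-bit single-precision values that fit in one 512-bit vector register. The following is a minimal sketch of that arithmetic, together with the kind of simple loop a compiler may be able to vectorize when AVX-512 code generation is enabled. The names `simd_lanes` and `saxpy` are illustrative assumptions, not taken from the article, and whether the loop is actually vectorized depends on the compiler and its options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements of type T that fit in one vector register
// of the given width (in bits). Assumes 8-bit bytes.
template <typename T>
constexpr std::size_t simd_lanes(std::size_t register_bits) {
    return register_bits / (sizeof(T) * 8);
}

// Illustrative kernel: y[i] = a * x[i] + y[i].
// Each iteration is independent of the others, which is the property
// that lets a vectorizing compiler process several elements per instruction.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The lane count is an upper bound on the speedup from vectorization alone; memory bandwidth, loop remainders, and clock behavior usually keep real gains below it.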

Intel MKL Speeds Up Small Matrix-Matrix Multiplication for Automatic Driving

Certain applications, such as automated driving, require low-latency small matrix-matrix multiplication in real time. They use specialized libraries that can be customized for small matrix operations. Recompiling and linking those libraries with the highly optimized DGEMM routine in the Intel® Math Kernel Library 2018 can give speedups many times over native libraries.
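For readers unfamiliar with DGEMM, it is the double-precision general matrix-matrix multiply, which computes C = alpha * A * B + beta * C. The following is a plain C++ reference sketch of that operation for small, fixed-size matrices. It is not the Intel MKL implementation or its API; the `Matrix` type and `small_dgemm` function are hypothetical names introduced here only to show what the routine computes.

```cpp
#include <array>
#include <cstddef>

// Hypothetical row-major M x N matrix of doubles with compile-time dimensions.
template <std::size_t M, std::size_t N>
struct Matrix {
    std::array<double, M * N> data{};  // zero-initialized

    double& operator()(std::size_t i, std::size_t j) { return data[i * N + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * N + j]; }
};

// Reference DGEMM-style update: C = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N.
template <std::size_t M, std::size_t K, std::size_t N>
void small_dgemm(double alpha, const Matrix<M, K>& a, const Matrix<K, N>& b,
                 double beta, Matrix<M, N>& c) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < K; ++p) {
                acc += a(i, p) * b(p, j);
            }
            c(i, j) = alpha * acc + beta * c(i, j);
        }
    }
}
```

Fixing the dimensions at compile time gives the compiler a chance to unroll the loops fully, which is one reason small-matrix code paths are often handled separately from general-purpose ones.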

Enabling FPGAs

Field Programmable Gate Arrays (FPGAs) are an exciting technology that allows hardware designers to create new digital circuits through a programming environment. Compared to hardware that is designed once or software which must adhere to the hardware architecture, an FPGA allows developers to draw a circuit to solve a specific problem.

Using the Intel C++ Compiler’s Optimization Features to Improve MySQL Performance

IT operations and maintenance developers have found that just by compiling the MySQL source code with the Intel C++ Compiler and turning on its Interprocedural Optimization feature, you can improve database performance by 5 to 35% compared with other compilers. “While there may be many factors affecting MySQL performance, such as hardware and software configuration, having a thoroughly optimized MySQL package is a good place to start.”

Use Intel® Inspector to Diagnose Hidden Memory and Threading Errors in Parallel Code

Intel Inspector is an integrated debugger that can easily diagnose latent and intermittent errors and guide users to locate the root cause. It does this by instrumenting the binaries, including dynamically generated or linked libraries, even when the source code is not available. This includes C, C++, and legacy Fortran codes.

Intel Parallel Studio 2018: Modernize Your Code

“Intel Parallel Studio 2018 has been designed to recognize the latest CPU architectures including the Intel Xeon Scalable processor family and the Intel Xeon Phi processors in order to get maximum performance from their differing architectures, yet remain binary compatible. With the recent introduction of the Intel AVX-512 vectorization instructions, application developers can more easily take advantage of these new instructions when developing and compiling with the Intel Parallel Studio 2018.”

Intel Advisor’s TBB Flow Graph Analyzer: Making Complex Layers of Parallelism More Manageable

Some deep learning applications tend to have very complex graphs with thousands of nodes and edges. To make it easier to visualize, analyze, design, and tune such complex parallel applications employing Intel TBB flow graphs, Intel provides Intel Advisor Flow Graph Analyzer (Intel FGA). It gives developers a comprehensive set of tools to examine, debug, and analyze Intel TBB flow graphs.

Intel Parallel Studio XE AVX-512: Tuning for Success with the Latest SIMD Extensions and Intel® Advanced Vector Extensions 512

With the introduction of Intel Parallel Studio XE, instructions for utilizing the vector extensions have been enhanced and new instructions have been added. Applications in diverse domains such as data compression and decompression, scientific simulations and cryptography can take advantage of these new and enhanced instructions. “Although microkernels can demonstrate the effectiveness of the new SIMD instructions, understanding why the new instructions benefit the code can then lead to even greater performance.”

Interview: Cray to Deploy Largest Supercomputer in South Korea at KISTI

In this video from SC17, Dr. Kwang Jin Oh, Director of the Supercomputing Service Center at KISTI, describes the new Intel-powered Cray supercomputer coming to South Korea. “Our cluster supercomputers are specifically designed to give customers like KISTI the computing resources they need for achieving scientific breakthroughs throughout a wide array of increasingly-complex, data-intensive challenges across modeling, simulation, analytics, and artificial intelligence. We look forward to working closely with KISTI now and into the future.”