Sign up for our newsletter and get the latest HPC news and analysis.
Send me information from insideHPC:


Accelerated Python for Data Science

The Intel Distribution for Python takes advantage of the Intel® Advanced Vector Extensions (Intel® AVX) and multiple cores in the latest Intel architectures. By utilizing the highly optimized Intel MKL BLAS and LAPACK routines, key functions run up to 200 times faster on servers and 10 times faster on desktop systems. This means that existing Python applications will perform significantly better merely by switching to the Intel distribution.

Latest Intel Tools Make Code Modernization Possible

Code modernization means ensuring that an application makes full use of the performance potential of the underlying processors. And that means implementing vectorization, threading, memory caching, and fast algorithms wherever possible. But where do you begin? How do you take your complex, industrial-strength application code to the next performance level?

Development Tools are More Important Now Than Ever

In this video, Sanjiv Shah, Vice President of the Core Visual Computing Group, and General Manager of Technical, Enterprise, and Cloud Computing Software Tools at Intel offers his perspective on the evolving nature of the developer’s role, and the latest resources to address persistent issues in application coding.

Learn What to Do Next with Intel VTune Amplifier Application Performance Snapshot

Tuning code has, for a long time, been an art. Knowing what to look for and how to correct inefficiencies in serious numerical computations has not been easy for most programmers. It’s often hard to even know which tool to start with. Which is why the Intel® VTune™ Amplifier Application Performance Snapshot could prove to be a great way to get an instant summary of an application’s performance characteristics and issues.

Use Intel Media SDK to Build Cross-Platform High-Quality Video Workflows

The latest release of Intel® Media SDK offers a single, cross-platform, GPU-enabled API for building optimized media and video applications from PC’s to workstations and into the cloud.

Video: Speed Your Code with Intel Parallel Studio XE

“Modern processors perform their best with parallel code that’s both vectorized and threaded, which can run more than 100 times faster more than serial code. So how can you accomplish this more easily through parallel programming? Enter Parallel Studio XE, a suite of tools that simplifies and speeds the design, building, tuning, and scaling of applications with the latest code modernization methods.”

Intel AVX Gives Numerical Computations in Java a Big Boost

Recent Intel® enhancements to Java enable faster and better numerical computing. In particular, the Java Virtual Machine (JVM) now uses the Fused Multiply Add (FMA) instructions on Intel Intel Xeon® PhiTM processors with Advanced Vector Instructions (Intel AVX) to implement the Open JDK9 Math.fma()API. This gives significant performance improvements for matrix multiplications, the most basic computation found in most HPC, Machine Learning, and AI applications.

Performance Insights Using the Intel Advisor Python API

Tuning a complex application for today’s heterogeneous platforms requires an understanding of the application itself as well as familiarity with tools that are available for assisting with analyzing where in the code itself to look for bottlenecks.  The process for optimizing the performance of an application, in general, requires the following steps that are most likely applicable for a wide range of applications.

Vectorization Now More Important Than Ever

Vectorization, the hardware optimization technique synonymous with early vector supercomputers like the Cray-1 (1975), has reappeared with even greater importance than before. Today, 40+ years later, the AVX-512 vector instructions in the most recent many-core Intel Xeon and Intel® Xeon PhiTM processors can increase application performance by 16x for single-precision codes.

A New Way to Visualize Performance Optimization Tradeoffs

A valuable feature of Intel Advisor is its Roofline Analysis Chart, which provides an intuitive and powerful visualization of actual performance measured against hardware-imposed performance ceilings. Intel Advisor’s vector parallelism optimization analysis and memory-versus-compute roofline analysis, working together, offer a powerful tool for visualizing an application’s complete current and potential performance profile on a given platform.