Video: Profiling Python Workloads with Intel VTune Amplifier

Paulius Velesko from Intel gave this talk at the ALCF Many-Core Developer Sessions. “This talk covers efficient profiling techniques that can help to dramatically improve the performance of code by identifying CPU and memory bottlenecks. We will demonstrate how to profile a Python application using Intel VTune Amplifier, a full-featured profiling tool.”

A Survey on Evaluating and Optimizing Performance of Intel Xeon Phi

Intel Xeon Phi combines the parallel processing power of a many-core accelerator with the programming ease of CPUs. Phi has powered many supercomputers; for example, in the June 2018 Top500 list, 19 supercomputers used Phi as the main processing unit. This paper surveys works that study the architecture of Phi and use it as an accelerator for various applications. It critically examines the performance bottlenecks and optimization strategies for Phi.

Podcast: Simulating Galaxy Clusters with XSEDE Supercomputers

In this TACC podcast, researchers describe how they are using XSEDE supercomputers to run some of the highest resolution simulations ever of galaxy clusters. “One really cool thing about simulations is that we know what’s going on everywhere inside the simulated box,” Butsky said. “We can make some synthetic observations and compare them to what we actually see in absorption spectra and then connect the dots and match the spectra that’s observed and try to understand what’s really going on in this simulated box.”

Penguin Computing to Deploy Supercomputer at ICHEC in Ireland

Today Penguin Computing (a subsidiary of SMART Global Holdings) announced that it will deliver the new national supercomputer to the Irish Centre for High-End Computing (ICHEC) at the National University of Ireland (NUI) Galway. “With 11 supercomputers in the Top500 list and a bare-metal HPC Cloud service since 2009, we knew we could rely on Penguin Computing’s HPC expertise to address our needs in an innovative way.”

Call For Presentations: MVAPICH User Group Meeting (MUG 2018)

The MVAPICH User Group Meeting (MUG 2018) has issued its Call For Presentations. The event will take place from August 6-8 in Columbus, Ohio. “MUG aims to bring together MVAPICH2 users, researchers, developers, and system administrators to share their experience and knowledge and learn from each other. The event includes Keynote Talks, Invited Tutorials, Invited Talks, Contributed Presentations, Open MIC session, hands-on sessions with MVAPICH developers, etc.”

High Performance Big Data Computing Using Harp-DAAL

Harp-DAAL is a framework developed at Indiana University that brings together the capabilities of big data (Hadoop) and techniques that have previously been adopted for high performance computing. Together, employees can become more productive and gain deeper insights into massive amounts of data.

Researchers Tune HPC Codes for Intel Xeon Phi at Brookhaven Hackathon

“The goal of this hands-on workshop was to help participants optimize their application codes to exploit the different levels of parallelism and memory hierarchies in the Xeon Phi architecture,” said CSI computational scientist Meifeng Lin. “By the end of the hackathon, the participants had not only made their codes run more efficiently on Xeon Phi–based systems, but also learned about strategies that could be applied to other CPU-based systems to improve code performance.”

Intel AVX Gives Numerical Computations in Java a Big Boost

Recent Intel® enhancements to Java enable faster and better numerical computing. In particular, the Java Virtual Machine (JVM) now uses the Fused Multiply Add (FMA) instructions on Intel Xeon Phi™ processors with Intel Advanced Vector Extensions (Intel AVX) to implement the OpenJDK 9 Math.fma() API. This gives significant performance improvements for matrix multiplications, the most basic computation found in most HPC, Machine Learning, and AI applications.
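As a rough illustration of the API mentioned above, the sketch below uses the standard `Math.fma(a, b, c)` method (available since Java 9) as the inner step of a naive matrix multiplication. The class name `FmaMatMul` and its `multiply` helper are hypothetical names chosen for this example and do not come from the article. Whether the JVM actually emits a hardware FMA instruction depends on the processor and the JIT compiler, so this shows only the call pattern, not a performance claim.

```java
// Hypothetical example class; the name and structure are assumptions for illustration.
class FmaMatMul {

    // Naive matrix product C = A * B, accumulating each entry with Math.fma.
    // Math.fma(x, y, z) computes x * y + z with a single rounding step.
    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;
        double[][] c = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < inner; k++) {
                    // One fused multiply-add per inner-loop iteration.
                    sum = Math.fma(a[i][k], b[k][j], sum);
                }
                c[i][j] = sum;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        double[][] c = multiply(a, b);
        System.out.println(java.util.Arrays.deepToString(c));
    }
}
```

On hardware without FMA support, `Math.fma` still returns the correctly rounded result but may run much slower than a plain `a * b + c`, so code like this is usually worth benchmarking on the target machine.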

XSEDE Offers Free HPC Training from Cornell Virtual Workshop

Today Cornell University announced that four new Cornell Virtual Workshop training topics are available at the Extreme Science and Engineering Discovery Environment (XSEDE) user portal. “The Cornell University Center for Advanced Computing (CAC) is a leader in the development and deployment of Web-based training programs designed to enhance the computational skills of researchers, broaden the participation of underrepresented groups in the sciences and engineering, and accelerate the adoption of new and emerging technologies.”

Performance Insights Using the Intel Advisor Python API

Tuning a complex application for today’s heterogeneous platforms requires an understanding of the application itself as well as familiarity with the tools available for finding where in the code the bottlenecks lie. In general, optimizing the performance of an application requires the following steps, which most likely apply to a wide range of applications.