Video: High Performance Computing with Python


“This talk will discuss various strategies to make a serial Python code faster, for example using libraries like NumPy, or tools like Cython which compile Python code. The talk will also discuss the available tools for running Python in parallel, focusing on the mpi4py module which implements MPI (Message Passing Interface) in Python.”

Call for Papers: HiPEAC 2016 in Prague


The HiPEAC 2016 Conference has issued its Call for Papers. HiPEAC is the European Network of Excellence on High Performance and Embedded Architecture and Compilation.

Direct N-Body Simulation


In some domains, an N-Body simulation is key to solving for the movement and forces of a dynamic system of particles. At each time step, the force that each body exerts on every other body is computed, and from those forces the velocity of each body can be updated. The simulation can continue up to a desired number of time steps.
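As a rough illustration of the direct approach described above, the following is a minimal sketch in plain Python, not the implementation from the article. It assumes gravitational attraction, a gravitational constant of 1 in arbitrary units, a small softening term to avoid division by zero, and a simple semi-implicit Euler update; the function names `accelerations`, `step`, and `simulate` are hypothetical.

```python
import math

G = 1.0           # gravitational constant in arbitrary units (assumption)
SOFTENING = 1e-3  # small length added to distances to avoid singularities (assumption)


def accelerations(pos, mass):
    """Direct O(N^2) summation: acceleration on each body from every other body."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector pointing from body i to body j
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + SOFTENING ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
    return acc


def step(pos, vel, mass, dt):
    """Advance one time step in place: update velocities from forces, then positions."""
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += acc[i][k] * dt
            pos[i][k] += vel[i][k] * dt


def simulate(pos, vel, mass, dt, n_steps):
    """Run the simulation for the desired number of time steps."""
    for _ in range(n_steps):
        step(pos, vel, mass, dt)
    return pos, vel


if __name__ == "__main__":
    # Two equal masses starting at rest; they should drift toward each other.
    positions = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    velocities = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    masses = [1.0, 1.0]
    simulate(positions, velocities, masses, dt=0.01, n_steps=100)
    print(positions)
```

The double loop over all pairs is what makes the method "direct": the cost grows as the square of the number of bodies, which is why this kernel is a common target for threading, vectorization, and accelerators.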

Comparing OpenACC and OpenMP Performance and Programmability


“OpenACC and OpenMP provide programmers with two good options for portable, high-level parallel programming for GPUs. This talk will discuss similarities and differences between the two specifications in terms of programmability, portability, and performance.”

UPC and OpenSHMEM PGAS Models on GPU Clusters

DK Panda, Ohio State University

“Learn about extensions that enable efficient use of Partitioned Global Address Space (PGAS) Models like OpenSHMEM and UPC on supercomputing clusters with NVIDIA GPUs. PGAS models are gaining attention for providing shared memory abstractions that make it easy to develop applications with dynamic and irregular communication patterns. However, the existing UPC and OpenSHMEM standards do not allow communication calls to be made directly on GPU device memory. This talk discusses simple extensions to the OpenSHMEM and UPC models to address this issue.”

Podcast: Cray’s Steve Scott on Programming for the Next Decade

Steve Scott

“Our computing systems continue to evolve, providing significant challenges to the programming teams managing large, long-lived projects. Issues include rapidly increasing on-node parallelism, varying forms of heterogeneity, deepening memory hierarchies, growing concerns around resiliency and silent data corruption, and worsening storage bottlenecks.”

Video: Enabling OpenACC Performance Analysis


Learn how OpenACC runtimes expose performance-related information that reveals where your OpenACC applications are wasting clock cycles. The talk will show that profilers can connect with OpenACC applications to record how much time is spent in OpenACC regions and what device activity that time turns into.

Call for Papers: Workshop on Accelerator Programming using Directives

The 2nd Workshop on Accelerator Programming using Directives has issued its Call for Papers. The WACCPD Workshop takes place Nov. 16 in Austin in conjunction with SC15.

Slidecast: Vectorize or Die – Unlocking Performance


“The free ride of faster performance with increased clock speeds is long gone. Software must be both threaded and vectorized to fully utilize today’s and tomorrow’s hardware. But modernization is not without cost. Not all threading or vectorization designs are worthwhile. How do you choose which designs to implement without disrupting ongoing development? Learn how data driven threading and vectorization design can yield long term performance growth with less risk and more impact.”

NAG Library adds New Algorithms for Application Developers


Today the Numerical Algorithms Group (NAG) released the latest version of its NAG Library, which includes over 80 new mathematical and statistical algorithms.