In this video from TechCrunch Disrupt 2015, Subbu Rama from Bitfusion describes how the company delivers powerful hardware acceleration technologies to boost application performance.
A new paper from ORNL’s Sparsh Mittal and Jeffrey Vetter seeks to change the mindset of researchers using GPUs. Entitled “A Survey of CPU-GPU Heterogeneous Computing Techniques,” the paper contends that merely offloading computational tasks to GPUs is not optimal; instead, using both the CPU and the GPU can potentially yield a higher speedup.
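To make the intuition behind that claim concrete, here is a minimal sketch of an idealized static work split between a CPU and a GPU. It is not taken from the paper: the function names, the throughput figures, and the assumption of perfectly divisible work with no data-transfer or synchronization cost are all hypothetical simplifications chosen for illustration.

```python
# Idealized model (an assumption, not the paper's method): work is perfectly
# divisible, both devices run concurrently, and transfer overhead is ignored.

def gpu_only_time(work, gpu_rate):
    """Time to finish all work on the GPU alone (work units / units per second)."""
    return work / gpu_rate


def split_time(work, cpu_rate, gpu_rate, gpu_fraction):
    """Time when gpu_fraction of the work goes to the GPU and the rest to the CPU.

    Both devices run in parallel, so the slower side determines the finish time.
    """
    gpu_time = gpu_fraction * work / gpu_rate
    cpu_time = (1.0 - gpu_fraction) * work / cpu_rate
    return max(gpu_time, cpu_time)


def balanced_gpu_fraction(cpu_rate, gpu_rate):
    """Share of work for the GPU so that both devices finish at the same time."""
    return gpu_rate / (cpu_rate + gpu_rate)


if __name__ == "__main__":
    work = 1000.0      # hypothetical number of work units
    cpu_rate = 100.0   # hypothetical CPU throughput, units per second
    gpu_rate = 400.0   # hypothetical GPU throughput, units per second

    fraction = balanced_gpu_fraction(cpu_rate, gpu_rate)
    offload = gpu_only_time(work, gpu_rate)
    combined = split_time(work, cpu_rate, gpu_rate, fraction)

    print(f"GPU share of work: {fraction:.2f}")
    print(f"GPU-only time:     {offload:.2f} s")
    print(f"CPU+GPU time:      {combined:.2f} s")
    print(f"Extra speedup:     {offload / combined:.2f}x")
```

Under these assumptions the gain over pure offloading is 1 + cpu_rate / gpu_rate, so the benefit shrinks as the GPU pulls further ahead of the CPU. Real workloads also pay for data movement and load imbalance, which this sketch deliberately leaves out.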
“Learn about extensions that enable efficient use of Partitioned Global Address Space (PGAS) Models like OpenSHMEM and UPC on supercomputing clusters with NVIDIA GPUs. PGAS models are gaining attention for providing shared memory abstractions that make it easy to develop applications with dynamic and irregular communication patterns. However, the existing UPC and OpenSHMEM standards do not allow communication calls to be made directly on GPU device memory. This talk discusses simple extensions to the OpenSHMEM and UPC models to address this issue.”
“We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. The key components are a custom-built supercomputer dedicated to deep learning, a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation approaches, and usage of multi-scale high-resolution images.”
Learn how OpenACC runtimes also expose performance-related information that reveals where your OpenACC applications are wasting clock cycles. The talk will show how profilers can connect to OpenACC applications to record how much time is spent in OpenACC regions and what device activity those regions generate.
The 2nd Workshop on Accelerator Programming Using Directives (WACCPD) has issued its Call for Papers. WACCPD takes place Nov. 16 in Austin in conjunction with SC15.