Search Results for: cuda

CUDA-X HPC: Libraries and Tools for your Next Scientific Breakthrough

Today NVIDIA announced CUDA-X HPC, a collection of libraries, tools, compilers and APIs that helps developers solve the world’s most challenging problems. “CUDA-X HPC includes highly tuned kernels essential for high-performance computing. GPU-accelerated libraries for linear algebra, parallel algorithms, signal and image processing lay the foundation for compute-intensive applications in areas such as computational physics, chemistry, molecular dynamics, and seismic exploration.”
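As a rough illustration of what calling one of these GPU-accelerated linear-algebra libraries looks like, here is a minimal sketch (not taken from the announcement; sizes and values are hypothetical) that runs a single-precision AXPY on the GPU with cuBLAS:

```cpp
// Hypothetical sketch: y = alpha*x + y on the GPU using cuBLAS,
// one of the CUDA-X linear-algebra libraries. Build with: nvcc saxpy.cu -lcublas
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>
#include <cstdio>

int main() {
    const int n = 1 << 20;
    const float alpha = 2.0f;
    std::vector<float> x(n, 1.0f), y(n, 3.0f);

    float *d_x = nullptr, *d_y = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    cublasSaxpy(handle, n, &alpha, d_x, 1, d_y, 1);   // y = alpha*x + y on the device
    cublasDestroy(handle);

    cudaMemcpy(y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", y[0]);                      // expect 5.0

    cudaFree(d_x);
    cudaFree(d_y);
    return 0;
}
```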

NVIDIA Brings CUDA to Arm for HPC

Today NVIDIA announced its support for Arm CPUs, providing the high performance computing industry a new path to build extremely energy-efficient, AI-enabled exascale supercomputers. “NVIDIA is making available to the Arm ecosystem its full stack of AI and HPC software — which accelerates more than 600 HPC applications and all AI frameworks — by year’s end. The stack includes all NVIDIA CUDA-X AI and HPC libraries, GPU-accelerated AI frameworks and software development tools such as PGI compilers with OpenACC support and profilers.”

Video: NVIDIA Announces Turing GPUs and CUDA Toolkit 10

Today at SIGGRAPH, NVIDIA announced its newest GPUs based on the new Turing architecture. “This fundamentally changes how computer graphics will be done, it’s a step change in realism,” Huang told an audience of more than 1,200 graphics pros gathered at the sleek glass and steel Vancouver Convention Center, which sits across a waterway criss-crossed by cruise ships and seaplanes from the stunning North Shore mountains.

The Simulation of the Behavior of the Human Brain using CUDA

Pedro Valero-Lara from BSC gave this talk at the GPU Technology Conference. “Attendees can learn how the behavior of the human brain is simulated using current computers, and the different challenges the implementation has to deal with. We cover the main steps of the simulation and the methodologies behind it. In particular, we highlight and focus on the transformations and optimizations carried out to achieve good performance on NVIDIA GPUs.”

Inside the Volta GPU Architecture and CUDA 9

“This presentation will give an overview of the new NVIDIA Volta GPU architecture and the latest CUDA 9 release. The NVIDIA Volta architecture powers the world’s most advanced data center GPU for AI, HPC, and graphics. Volta features a new Streaming Multiprocessor (SM) architecture and includes enhanced features like NVLink 2 and the Multi-Process Service (MPS) that deliver major improvements in performance, energy efficiency, and ease of programmability. You’ll learn about new programming model enhancements and performance improvements in the latest CUDA 9 release.”

NVIDIA Releases CUDA 9.2 for GPU Code Acceleration

Today NVIDIA released CUDA 9.2, which includes updates to libraries, a new library for accelerating custom linear-algebra algorithms, and lower kernel launch latency. “CUDA 9 is the most powerful software platform for GPU-accelerated applications. It has been built for Volta GPUs and includes faster GPU-accelerated libraries, a new programming model for flexible thread management, and improvements to the compiler and developer tools. With CUDA 9 you can speed up your applications while making them more scalable and robust.”
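The “new programming model for flexible thread management” refers to CUDA 9’s cooperative groups API. Below is a minimal sketch (array contents and launch configuration are hypothetical) of a block-wide sum expressed with an explicit group object rather than bare __syncthreads():

```cpp
// Hypothetical sketch of CUDA 9 cooperative groups: a block-wide reduction
// where synchronization is scoped to an explicit thread_block group.
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>
namespace cg = cooperative_groups;

__global__ void blockSum(const int *in, int *out) {
    extern __shared__ int scratch[];
    cg::thread_block block = cg::this_thread_block();

    unsigned tid = block.thread_rank();
    scratch[tid] = in[blockIdx.x * blockDim.x + tid];
    block.sync();                                    // group-scoped barrier

    for (unsigned stride = block.size() / 2; stride > 0; stride /= 2) {
        if (tid < stride) scratch[tid] += scratch[tid + stride];
        block.sync();
    }
    if (tid == 0) out[blockIdx.x] = scratch[0];
}

int main() {
    const int n = 256;
    int *in, *out;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&out, sizeof(int));
    for (int i = 0; i < n; ++i) in[i] = 1;

    blockSum<<<1, n, n * sizeof(int)>>>(in, out);
    cudaDeviceSynchronize();
    printf("sum = %d\n", out[0]);                    // expect 256

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```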

Porting Scientific Research Codes to GPUs with CUDA Fortran

Josh Romero from NVIDIA gave this talk at the Stanford HPC Conference. “In this session, we intend to provide guidance and techniques for porting scientific research codes to NVIDIA GPUs using CUDA Fortran. The GPU porting effort of an incompressible fluid dynamics solver using the immersed boundary method will be described. Several examples from this program will be used to illustrate available features in CUDA Fortran, from simple directive-based programming using CUF kernels to lower level programming using CUDA kernels.”

Accelerating HPC Programmer Productivity with OpenACC and CUDA Unified Memory

Doug Miles from NVIDIA gave this talk at SC17. “CUDA Unified Memory for NVIDIA Tesla GPUs offers programmers a unified view of memory on GPU-accelerated compute nodes. The CPUs can access GPU high-bandwidth memory directly, the GPUs can access CPU main memory directly, and memory pages migrate automatically between the two when the CUDA Unified Memory manager determines it is performance-profitable. PGI OpenACC compilers now leverage this capability on allocatable data to dramatically simplify parallelization and incremental optimization of HPC applications for GPUs.”
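A minimal sketch of the behavior described above (hardware and driver permitting, e.g. a Pascal-or-later GPU): a single cudaMallocManaged allocation is written by the CPU, updated by a GPU kernel, and read back by the CPU, with page migration handled by the CUDA Unified Memory manager.

```cpp
// Hypothetical sketch: one managed allocation touched by both CPU and GPU.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));     // visible to both CPU and GPU

    for (int i = 0; i < n; ++i) data[i] = 1.0f;      // CPU writes; pages reside in host memory

    scale<<<(n + 255) / 256, 256>>>(data, n, 3.0f);  // GPU access migrates pages on demand
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);               // CPU reads the result; expect 3.0
    cudaFree(data);
    return 0;
}
```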

dCUDA: Distributed GPU Computing with Hardware Overlap

“Over the last decade, CUDA and the underlying GPU hardware architecture have continuously gained popularity in various high-performance computing application domains such as climate modeling, computational chemistry, or machine learning. Despite this popularity, we lack a single coherent programming model for GPU clusters. We therefore introduce the dCUDA programming model, which implements device-side remote memory access.”

CUDA Made Easy: An Introduction

“CUDA C++ is just one of the ways you can create massively parallel applications with CUDA. It lets you use the powerful C++ programming language to develop high performance algorithms accelerated by thousands of parallel threads running on GPUs. Many developers have accelerated their computation- and bandwidth-hungry applications this way, including the libraries and frameworks that underpin the ongoing revolution in artificial intelligence known as Deep Learning.”
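In that spirit, a minimal CUDA C++ example (names and sizes are illustrative, not from the article): a vector add in which thousands of GPU threads each handle a slice of the work through a grid-stride loop.

```cpp
// Hypothetical sketch: element-wise vector add across many parallel GPU threads.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void add(int n, const float *x, float *y) {
    // Grid-stride loop: any launch configuration covers all n elements.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += blockDim.x * gridDim.x)
        y[i] = x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    add<<<256, 256>>>(n, x, y);        // 65,536 threads share the loop iterations
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);       // expect 3.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```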