Today ArrayFire released the latest version of its open-source library of parallel computing functions supporting CUDA, OpenCL, and CPU devices. ArrayFire v3.4 improves features and performance for applications in machine learning, computer vision, signal processing, statistics, finance, and more.
In this video from the 2016 Blue Waters Symposium, Carl Pearson and Simon Garcia De Gonzalo from the University of Illinois present: GPU Performance Nuggets. “In this talk, we introduce a pair of Nvidia performance tools available on Blue Waters. We discuss what the GPU memory hierarchy provides for your application. We then present a case study that explores if memory hierarchy optimization can go too far.”
AMD’s motivation for developing these open-source GPU tools is the opportunity to remove the added complexity of proprietary programming frameworks from GPU application development. “If successful, these tools – or similar versions – could help to democratize GPU application development, removing the need for proprietary frameworks, which then makes the HPC accelerator market much more competitive for smaller players. For example, HPC users could potentially use these tools to convert CUDA code into C++ and then run it on an Intel Xeon co-processor.”
In this video from PyCon 2016 in Portland, Lorena Barba from George Washington University presents: Beyond Learning to Program, Education, Open Source Culture, Structured Collaboration, and Language. “PyCon is the largest annual gathering for the community using and developing the open-source Python programming language.”
In this Programming Throwdown podcast, Mark Harris from Nvidia describes CUDA programming for GPUs. “CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA.”
In this video from the 2016 OpenFabrics Workshop, Yili Zheng from LBNL presents: UPC++. “UPC++ is a parallel programming extension for developing C++ applications with the partitioned global address space (PGAS) model. UPC++ has demonstrated excellent performance and scalability with applications and benchmarks such as global seismic tomography, Hartree-Fock, BoxLib AMR framework and more. In this talk, we will give an overview of UPC++ and discuss the opportunities and challenges of leveraging modern network features.”
Today Nvidia announced that Brookhaven National Laboratory has been named a 2016 GPU Research Center. “The center will enable Brookhaven Lab to collaborate with Nvidia on the development of widely deployed codes that will benefit from more effective GPU use, and in the delivery of on-site GPU training to increase staff and guest researchers’ proficiency,” said Kerstin Kleese van Dam, director of CSI and chair of the Lab’s Center for Data-Driven Discovery.
In this special guest feature from Scientific Computing World, Robert Roe writes that software scalability and portability may be even more important than energy efficiency to the future of HPC. “As the HPC market searches for the optimal strategy to reach exascale, it is clear that the major roadblock to improving the performance of applications will be the scalability of software, rather than the hardware configuration – or even the energy costs associated with running the system.”
Today ArrayFire announced the release of Version 3.0 of their high-speed software library for GPU computing. The new version features major changes to ArrayFire’s visualization library, a new CPU backend, and dense linear algebra for OpenCL devices. It also includes improvements across the board for ArrayFire’s OpenCL backend.