With modern processors that contain a large number of cores, to get maximum performance it is necessary to structure an application to use as many cores as possible. Explicitly developing a program to do this can take a significant amount of effort. It is important to understand the science and algorithms behind the application, and then use whatever programming techniques that are available. “Intel Threaded Building Blocks (TBB) can help tremendously in the effort to achieve very high performance for the application.”
Today ORNL announced the full schedule of 2017 GPU Hackathons at multiple locations around the world. “The goal of each hackathon is for current or prospective user groups of large hybrid CPU-GPU systems to send teams of at least 3 developers along with either (1) a (potentially) scalable application that could benefit from GPU accelerators, or (2) an application running on accelerators that need optimization. There will be intensive mentoring during this 5-day hands-on workshop, with the goal that the teams leave with applications running on GPUs, or at least with a clear roadmap of how to get there.”
Manuel Arenaz from Appentra presented this talk at the OpenMP booth at SC16. “Parallware is a new technology for static analysis of programs based on the production-grade LLVM compiler infrastructure. Using a fast, extensible hierarchical classification scheme to address dependence analysis, it discovers parallelism and annotates the source code with the most appropriate OpenMP & OpenACC directives.”
The Seventh International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) has issued its Call for Papers. The event takes place May 29 in Orlando, Florida in conjunction with the IEEE International Parallel and Distributed Processing Symposium.
“Writing and deploying software that exploits the ever increasing computing power of clusters and supercomputers is a demanding challenge – it needs to run fast, and run right, and that’s exactly what our suite of tools is designed to enable,” said David Lecomber, CEO, Allinea. “As part of ARM, we’ll continue to work with the HPC community, our customers and our partners to advance the development of our cross-platform technology, and take advantage of product synergies between ARM’s compilers, libraries and advisory tools and our existing and future debugging and analysis tools. Our combined expertise and understanding of the challenges this market faces will deliver new solutions to this growing ecosystem.”
“New Radeon Instinct accelerators will offer organizations powerful GPU-based solutions for deep learning inference and training. Along with the new hardware offerings, AMD announced MIOpen, a free, open-source library for GPU accelerators intended to enable high-performance machine intelligence implementations, and new, optimized deep learning frameworks on AMD’s ROCm software to build the foundation of the next evolution of machine intelligence workloads.”
Libraries that are tuned to the underlying hardware architecture can increase performance tremendously. Higher level libraries such at the Intel Data Analytics Acceleration Library (Intel DAAL) can assist the developer with highly tuned algorithms for data analysis as well as machine learning. Intel DAAL functions can be called within other, more comprehensive frameworks that deal with the various types of data and storage, increasing the performance and lowering the development time of a wide range of applications.
“Software Defined Visualization (SDVis) is an open source initiative from Intel and industry collaborators to improve the visual fidelity, performance and efficiency of prominent visualization solutions – with a particular emphasis on supporting the rapidly growing “Big Data” usage on workstations through HPC supercomputing clusters without the memory limitations and cost of GPU based solutions. Existing applications can be enhanced using the high performing parallel software rendering libraries OpenSWR, Embree, and OSPRay. At the Intel HPC Developer Conference, Amstutz provided an introduction to this initiative, its benefits, a brief descriptions of accomplishments in the past year and talk about the changes made to Intel provided libraries in the past year.”
In this video from SC16, Ben Sander from AMD presents: HIP and CAFFE Porting and Profiling with AMD’s ROCm. “We are excited to present ROCm, the first open-source HPC/Hyperscale-class platform for GPU computing that’s also programming-language independent. We are bringing the UNIX philosophy of choice, minimalism and modular software development to GPU computing. The new ROCm foundation lets you choose or even develop tools and a language run time for your application. ROCm is built for scale; it supports multi-GPU computing in and out of server-node communication through RDMA.”
“The HPC Community demands performance, transparency, and value—exactly what Red Hat and open source offer. Red Hat is the standard choice for Linux in HPC clusterers worldwide. But it doesn’t stop there–our cloud, virtualization, storage, platform and service-oriented solutions bring real freedom and collaboration to federal, state, local, and academic programs. And Red Hat’s worldwide support, training and consulting services bring the power of open source to your agency. We are a part of a larger community working together to drive innovation.”