In this video from the University of Houston CACDS HPC Workshop, Duncan Poole from the OpenACC Standards Group moderates a panel discussion on OpenACC.
“An increasing number of GPU-enabled applications are available to the HPC community. The key issues are understanding the enhanced application performance and the corresponding increase in power consumption due to GPUs. In most cases these depend on the CPU-to-GPU ratio and the way GPUs are connected to CPUs. The latest compute node designs allow flexibility in selecting the number of GPUs and how they are connected to CPUs. This offers users a unique opportunity to select a suitable operating point according to their application characteristics. This talk is about studying the performance vs. power tradeoff on a few common HPC applications.”
The University of Houston (UH) is adding a new, state-of-the-art supercomputer to its arsenal of research tools. With 1860 compute cores, the new Opuntia cluster will be used primarily for scientific and engineering work. The acquisition of this new system marks the start of a new era of supercomputing not only for the University of […]
“This presentation will highlight the use of GPU ray tracing for visualizing the process of photosynthesis, and GPU-accelerated analysis of the results of hybrid structure determination methods that combine data from cryo-electron microscopy and X-ray crystallography with all-atom molecular dynamics simulations.”
“Deep neural networks have recently emerged as an important tool for difficult AI problems, and have found success in many fields ranging from computer vision to speech recognition. Training deep neural networks is computationally intensive, and so practical application of these networks requires careful attention to parallelism. GPUs have been instrumental in the success of deep neural networks, because they significantly reduce the cost of network training, which has allowed many researchers to train better networks. In this talk, I will discuss how we were able to duplicate results from a 1000-node cluster using only 3 nodes, each with 4 GPUs.”
“The end of Dennard scaling has made all computing power-limited, so that performance is determined by energy efficiency. With improvements in process technology offering little increase in efficiency, innovations in architecture and circuits are required to maintain the expected performance scaling. The large-scale parallelism and deep storage hierarchy of future machines pose programming challenges. Future programming systems must allow the programmer to express their code in a high-level, target-independent manner and optimize the target-dependent decisions of mapping available parallelism in time and space. This talk will discuss these challenges in more detail and introduce some of the technologies being developed to address them.”