Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems


DK Panda, Ohio State University

In this video from the 2018 Swiss HPC Conference, DK Panda from Ohio State University presents: Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems.

“This talk will focus on challenges in designing HPC, Deep Learning, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models by taking into account support for multi-core systems (KNL and OpenPower), high-performance networks, GPGPUs (including GPUDirect RDMA), and energy-awareness. Features, sample performance numbers and best practices of using MVAPICH2 libraries will be presented.

For the Deep Learning domain, we will focus on popular Deep Learning frameworks (Caffe, CNTK, and TensorFlow) to extract performance and scalability with MVAPICH2-GDR MPI library. Finally, we will outline the challenges in moving these middleware to the Cloud environments.”

Dr. Dhabaleswar K. (DK) Panda is a Professor and Distinguished Scholar of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance networking, InfiniBand, network-based computing, exascale computing, programming models, GPUs and accelerators, high performance file systems and storage, virtualization and cloud computing, and Big Data (Hadoop (HDFS, MapReduce and HBase) and Memcached). He has published over 400 papers in major journals and international conferences related to these research areas.

Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand, Omni-Path, iWARP and RoCE. His research group is currently collaborating with National Laboratories and leading InfiniBand, Omni-Path, iWARP and RoCE companies on designing various subsystems of next generation high-end systems. The MVAPICH (High Performance MPI and MPI+PGAS over InfiniBand, iWARP and RoCE with support for GPGPUs, Xeon Phis and Virtualization) software libraries, developed by his research group, are currently being used by more than 2,850 organizations worldwide (in 85 countries). These software packages have enabled several InfiniBand clusters to get into the latest TOP500 ranking. More than 440,000 downloads of this software have taken place from the project website alone. These software packages are also available with the software stacks for network vendors (InfiniBand, Omni-Path, RoCE, and iWARP), server vendors (OpenHPC), and Linux distributors (such as RedHat and SuSE). This software is currently powering the #1 supercomputer in the world.

See more talks at the Swiss HPC Conference Video Gallery
