Designing Scalable HPC, Deep Learning, Big Data, and Cloud Middleware for Exascale Systems

DK Panda, Ohio State University

In this video from the UK HPC Conference, DK Panda from Ohio State University presents: Designing Scalable HPC, Deep Learning, Big Data, and Cloud Middleware for Exascale Systems.

This talk will focus on challenges in designing HPC, Deep Learning, Big Data, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models, taking into account support for multi-core systems (Xeon, ARM, and OpenPower), high-performance networks, and GPGPUs (including GPUDirect RDMA). Features, sample performance numbers, and best practices for using the MVAPICH2 libraries will be presented. For the Deep Learning domain, we will focus on extracting performance and scalability from popular Deep Learning frameworks (Caffe, CNTK, and TensorFlow) with the MVAPICH2-GDR MPI library. For the Big Data domain, we will focus on high-performance and scalable designs of Spark and Hadoop (including HDFS, MapReduce, RPC, and HBase) and the associated Deep Learning frameworks using native RDMA support for InfiniBand and RoCE. Finally, we will outline the challenges in moving these middleware stacks to the Azure and AWS cloud environments.

DK Panda is a Professor and University Distinguished Scholar of Computer Science and Engineering at the Ohio State University. He has published over 450 papers in the area of high-end computing and networking. The MVAPICH2 (High Performance MPI and PGAS over InfiniBand, Omni-Path, iWARP and RoCE) libraries, designed and developed by his research group, are currently being used by more than 3,025 organizations worldwide (in 89 countries). More than 559,000 downloads of this software have taken place from the project’s site. This software is empowering several InfiniBand clusters (including the 3rd, 5th, 8th, 16th, and 19th ranked ones) in the TOP500 list.

See more talks from the UK HPC Conference

Check out our insideHPC Events Calendar
