Exploiting HPC Technologies for Accelerating Big Data Processing and Associated Deep Learning

DK Panda, Ohio State University

In this video from the 2018 Swiss HPC Conference, DK Panda from Ohio State University presents “Exploiting HPC Technologies for Accelerating Big Data Processing and Associated Deep Learning.”

“This talk will provide an overview of challenges in accelerating Hadoop, Spark, and Memcached on modern HPC clusters. An overview of RDMA-based designs for Hadoop (HDFS, MapReduce, RPC and HBase), Spark, Memcached, Swift, and Kafka using native RDMA support for InfiniBand and RoCE will be presented. Enhanced designs for these components to exploit NVM-based in-memory technology and parallel file systems (such as Lustre) will also be presented. Benefits of these designs on various cluster configurations using the publicly available RDMA-enabled packages from the OSU HiBD project will be shown. Benefits of these stacks to accelerate deep learning frameworks (such as CaffeOnSpark and TensorFlowOnSpark) will be presented.”

DK Panda is a Professor and University Distinguished Scholar of Computer Science and Engineering at the Ohio State University. He has published over 400 papers in high-end computing and networking. The MVAPICH2 libraries (High Performance MPI and PGAS over InfiniBand, Omni-Path, iWARP and RoCE), designed and developed by his research group (http://mvapich.cse.ohio-state.edu), are used by more than 2,875 organizations in 86 countries, with more than 449,000 downloads from the project’s site. The software powers several InfiniBand clusters in the TOP500 list, including those ranked 1st, 9th, 12th, 17th, and 48th. His group’s RDMA packages for Apache Spark, Apache Hadoop and Memcached, together with the OSU HiBD benchmarks (http://hibd.cse.ohio-state.edu), are also publicly available; they are used by more than 275 organizations in 34 countries and have been downloaded more than 25,300 times. A high-performance and scalable version of the Caffe framework is available from https://hidl.cse.ohio-state.edu.

