rdma Archives - Page 2 of 7 - High-Performance Computing News Analysis

Agenda Posted: Exacomm 2019 Workshop at ISC High Performance

June 5, 2019 by Doug Black

“The goal of this workshop is to bring together researchers and software/hardware designers from academia, industry and national laboratories who are involved in creating network-based computing solutions for extreme scale architectures. The objectives of this workshop will be to share the experiences of the members of this community and to learn the opportunities and challenges in the design trends for exascale communication architectures.”

Filed Under: Cloud HPC, Compute, CPUs, GPUs, FPGAs, Events, Exascale, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Network, Research / Education, Resources Tagged With: AWS, Azure, big data, Cray, Exacomm 2019, Fujitsu, InfiniBand, Mellanox, NVMe, rdma, Sandia, Weekly Newsletter Articles

HPC Breaks Through to the Cloud: Why It Matters

April 30, 2019 by staff

In this special guest feature, Scot Schultz from Mellanox writes that researchers are benefiting in a big way from HPC in the Cloud. “HPC has many different advantages depending on the specific use case, but one aspect that these implementations have in common is their use of RDMA-based fabrics to improve compute performance and reduce latency.”

Filed Under: Cloud HPC, Compute, HPC Hardware, HPC Software, Industry Perspectives, Industry Segments, Network, News, Research / Education, Resources Tagged With: InfiniBand, Mellanox, NCI, rdma, RoCE, Vestas

Faster Fabrics Running Against Limits of the Operating System, the Processor, and the I/O Bus

March 26, 2019 by Doug Black

Christopher Lameter from Jump Trading gave this talk at the OpenFabrics Workshop in Austin. “In 2017 we got 100G fabrics, in 2018 200G fabrics and in 2019 it looks like 400G technology may be seeing a considerable amount of adoption. These bandwidths compete with and sometimes are higher than the internal bus speeds of the servers that are connected using these fabrics. I think we need to consider these developments and work on improving fabrics and the associated APIs so that ways to access these features become possible using vendor neutral APIs. It needs to be possible to code in a portable way and not to a vendor specific one.”

Filed Under: Compute, Datacenter, Enterprise HPC, Events, HPC Hardware, HPC Software, Industry Perspectives, Industry Segments, Main Feature, Network, News, Research / Education, Resources, Videos Tagged With: 200G, 400G, jump trading, OFA, OpenFabrics Workshop, rdma, Weekly Newsletter Articles

Accelerating TensorFlow with RDMA for High-Performance Deep Learning

March 20, 2019 by Doug Black

Xiaoyi Lu from Ohio State University gave this talk at the 2019 OpenFabrics Workshop in Austin. “Google’s TensorFlow is one of the most popular Deep Learning (DL) frameworks. We propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMAgRPC design, TensorFlow only needs to run over the gRPC channel and gets the optimal performance.”

Filed Under: Compute, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Machine Learning, Main Feature, Network, News, Research / Education, Resources, Videos Tagged With: OFA, OpenFabrics Workshop, rdma, TensorFlow, Weekly Newsletter Articles

Mellanox HDR 200G InfiniBand Speeds Machine Learning with NVIDIA

March 18, 2019 by Doug Black

Today Mellanox announced that its HDR 200G InfiniBand with the “Scalable Hierarchical Aggregation and Reduction Protocol” (SHARP) technology has set new performance records, doubling deep learning operations performance. The combination of Mellanox In-Network Computing SHARP with NVIDIA V100 Tensor Core GPU technology and the NVIDIA Collective Communications Library (NCCL) delivers leading efficiency and scalability to deep learning and artificial intelligence applications.

Filed Under: CPUs, GPUs, FPGAs, Datacenter, Enterprise HPC, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Machine Learning, Network, News, Research / Education Tagged With: AI, ConnectX-6, CUDA-X, GPUDirect RDMA, HCAs, Mellanox, Mellanox Quantum Switch, nvidia, NVIDIA V100, NVlink, rdma, Weekly Newsletter Articles

Video: Why InfiniBand is the Way Forward for AI and Exascale

March 5, 2019 by Doug Black

In this video, Gilad Shainer from the InfiniBand Trade Association describes how InfiniBand offers the optimal interconnect technology for AI, HPC, and Exascale. “Through AI, you need the biggest pipes in order to move those giant amounts of data in order to create those AI software algorithms. That’s one thing. Latency is important because you need to drive things faster. RDMA is one of the key technologies that enables increasing the efficiency of moving data, reducing CPU overhead. And by the way, now, all of the AI frameworks that exist out there support RDMA as a default element within the framework itself.”

Filed Under: Compute, Datacenter, Enterprise HPC, Exascale, High Performance Analytics, HPC Hardware, HPC Software, Industry Perspectives, Industry Segments, Machine Learning, Main Feature, Network, News, Research / Education, Resources, Videos Tagged With: hpc advisory council, IBTA, InfiniBand, Mellanox, Mellanox SHARP, rdma, Weekly Newsletter Articles

How to Design Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems

March 4, 2019 by Doug Black

DK Panda from Ohio State University gave this talk at the Stanford HPC Conference. “This talk will focus on challenges in designing HPC, Deep Learning, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models, taking into account support for multi-core systems (Xeon, OpenPower, and ARM), high-performance networks, GPGPUs (including GPUDirect RDMA), and energy-awareness.”

Filed Under: Compute, Events, Exascale, HPC Hardware, HPC Software, Industry Segments, Main Feature, Network, News, Parallel Programming, Research / Education, Resources, Videos Tagged With: ARM, DK Panda, InfiniBand, Intel, Middleware, MVAPICH, Ohio State University, rdma, Stanford HPC Conference, Weekly Newsletter Articles

The State of High-Performance Fabrics: A Chat with the OpenFabrics Alliance

February 11, 2019 by staff

In this special guest feature, Paul Grun and Doug Ledford from the OpenFabrics Alliance describe the industry trends in the fabrics space, its state of affairs and emerging applications. “Originally, ‘high-performance fabrics’ were associated with large, exotic HPC machines. But in the modern world, these fabrics, which are based on technologies designed to improve application efficiency, performance, and scalability, are becoming more and more common in the commercial sphere because of the increasing demands being placed on commercial systems.”

Filed Under: Datacenter, Enterprise HPC, Events, HPC Hardware, HPC Software, Industry Perspectives, Industry Segments, Main Feature, Network, News, Research / Education, Resources Tagged With: Cray, InfiniBand, Intel, iWARP, OFA, OmniPath, OpenFabrics Workshop, rdma, RoCE, SELinux, Weekly Newsletter Articles

Agenda Posted for ExaComm 2018 Workshop in Frankfurt

June 13, 2018 by Doug Black

The ExaComm 2018 workshop has posted its speaker agenda. Held in conjunction with ISC 2018, the Fourth International Workshop on Communication Architectures for HPC, Big Data, Deep Learning and Clouds at Extreme Scale takes place June 28 in Frankfurt. “The goal of this workshop is to bring together researchers and software/hardware designers from academia, industry and national laboratories who are involved in creating network-based computing solutions for extreme scale architectures. The objectives of this workshop will be to share the experiences of the members of this community and to learn the opportunities and challenges in the design trends for exascale communication architectures.”

Filed Under: Compute, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Network, News, Research / Education, Resources, Storage Tagged With: big data, Exacomm 2018, Intel Omni Path, ISC 2018, NVlink, NVMe, rdma

Improving Deep Learning scalability on HPE servers with NovuMind: GPU RDMA made easy

June 13, 2018 by Doug Black

Bruno Monnet from HPE gave this talk at the NVIDIA GPU Technology Conference. “Deep Learning demands massive amounts of computational power. That computational power usually involves heterogeneous computation resources, e.g., GPUs and InfiniBand as installed on HPE Apollo. NovuForce deep learning software within the Docker image has been optimized for the latest technology like NVIDIA Pascal GPU and InfiniBand GPUDirect RDMA. This flexibility of the software, combined with the broad range of GPU servers in the HPE portfolio, makes one of the most efficient and scalable solutions.”

Filed Under: Compute, CPUs, GPUs, FPGAs, Enterprise HPC, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Machine Learning, Main Feature, Network, News, Research / Education, Resources, Videos Tagged With: GPU Technology Conference, GPUDirect RDMA, HPE, InfiniBand, NovuMind, nvidia, rdma
