Habana Labs and Grid.ai Collaborate for Training on Gaudi with PyTorch Lightning

April 7, 2022 - The Habana team announced it is collaborating with Grid.ai to make it easier for developers to train on Gaudi processors with PyTorch Lightning without code changes. Grid.ai and PyTorch Lightning make coding neural networks simple. Habana Gaudi makes it cost-efficient to train those networks. The integration of Habana’s SynapseAI software suite with PyTorch Lightning brings the best of both worlds together, enabling greater developer productivity while lowering the cost of model training. PyTorch Lightning 1.6 was released just last week and now supports Habana Gaudi. Additional details are here.

Habana Gaudi’s compute and scaling efficiency brings new levels of price-performance for deep learning training, whether in the cloud, on premises, or both. Gaudi powers the AWS EC2 DL1 instances and Supermicro’s X12 Gaudi AI Training Server, delivering up to 40% better price-performance compared to existing GPU solutions. Habana’s SynapseAI software suite is integrated with the TensorFlow and PyTorch frameworks and is optimized for Gaudi performance, with a focus on computer vision and natural language processing applications. The Habana Developer Site is the main portal for Gaudi developers and contains a variety of resources to get started with Gaudi, including quick start guides, tutorials, videos, reference models, and SynapseAI documentation.

PyTorch Lightning is a lightweight framework built on PyTorch that provides APIs to abstract away the boilerplate code PyTorch users otherwise need to train models. Lightning is designed to enable flexibility and ease of use for AI researchers and data scientists. Once the code is restructured for Lightning, developers can train models on different accelerators without further code changes, using state-of-the-art distributed training mechanisms. Lightning adoption has grown quickly in the last two years, with over 600 contributors, 15K GitHub stars, and 2 million monthly downloads, representing 10x year-over-year growth.

“We’re very excited to work alongside the team at Habana to bring Gaudi processor speed-ups and cost-savings to the PyTorch community without requiring any code changes,” said William Falcon, CEO of Grid.ai and creator of PyTorch Lightning. “Because we work together to optimize models running on Gaudi hardware, Lightning enables users to reap the cost-saving benefits of our partnership without having to be experts in hardware integration.”

The Lightning and Habana teams have a shared philosophy that data scientists and researchers should be able to focus more on the data science and research, and less on the underlying software engineering. As a result of our collaboration, developers can now pair Gaudi’s AI compute with Lightning and benefit from Gaudi’s advantages quickly and easily. Getting started with training on Gaudi takes only a few code changes.
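As an illustration of how small those changes can be, the following is a minimal sketch, not an official example from Habana or Grid.ai. It assumes PyTorch Lightning 1.6 or later, where the Trainer accepts accelerator="hpu" for Gaudi devices. The helper names trainer_kwargs and build_trainer are hypothetical and introduced here only for illustration, and running on Gaudi additionally assumes that Habana's SynapseAI software and PyTorch bridge are installed.

```python
from typing import Any, Dict


def trainer_kwargs(use_gaudi: bool, num_devices: int = 1) -> Dict[str, Any]:
    """Hypothetical helper: pick Trainer arguments for Gaudi or a CPU fallback."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    if use_gaudi:
        # "hpu" is the accelerator name Lightning 1.6 uses for Habana Gaudi.
        return {"accelerator": "hpu", "devices": num_devices}
    # Fallback for machines without Gaudi hardware (assumed default for this sketch).
    return {"accelerator": "cpu", "devices": 1}


def build_trainer(use_gaudi: bool, num_devices: int = 1, max_epochs: int = 1):
    """Hypothetical helper: create a Trainer from the arguments chosen above.

    Requires pytorch_lightning, plus Habana's software stack when use_gaudi is True.
    """
    # Imported lazily so trainer_kwargs can be used without Lightning installed.
    import pytorch_lightning as pl

    return pl.Trainer(max_epochs=max_epochs, **trainer_kwargs(use_gaudi, num_devices))


if __name__ == "__main__":
    # Show the arguments that would be passed to the Trainer for 8 Gaudi devices.
    print(trainer_kwargs(use_gaudi=True, num_devices=8))
```

In this sketch the LightningModule and data code stay untouched; only the arguments passed to the Trainer differ between a Gaudi run and the CPU fallback. A call such as build_trainer(use_gaudi=True, num_devices=8).fit(model) would then start training, where model stands for the user's own LightningModule.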

“We’re thrilled to partner with the team at Grid.ai to enable the Lightning developer community to benefit from the cost efficiency and scalability of Gaudi for deep learning training,” said Sree Ganesan, Head of Software Products, Habana Labs. “Our combined solution provides flexibility, ease of use and high-performance processing, making AI training more accessible and cost effective for developers.”

Riskfuel is a fintech startup that provides real-time valuations and risk sensitivities to companies managing financial portfolios, helping them increase trading accuracy and performance. “We are excited that PyTorch Lightning now has support for Habana Gaudi. We’re looking forward to taking advantage of the productivity that Lightning offers with cost efficient training on Gaudi,” said Maxime Bergeron, R&D Director at Riskfuel.

For more information on deep learning training with Gaudi, please visit developer.habana.ai.