Fast.AI Achieves Record ImageNet Performance with NVIDIA V100 Tensor Core GPUs


A snippet of the Jupyter Notebook comparing different cropping approaches. ‘Center Crop Image’ is the original photo, ‘FastAi rectangular’ is fast.ai’s new method, ‘Imagenet Center’ is the standard approach, and ‘Test Time Augmentation’ is an example from the multi-crop approach.
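
To make that comparison concrete, here is a minimal sketch of the three evaluation strategies using torchvision transforms. The 256/224 sizes and the FiveCrop variant are illustrative assumptions for a generic ImageNet pipeline, not the exact fastai implementation, which sizes the validation rectangle per batch.

    import torch
    from torchvision import transforms

    # 'Imagenet Center' (the standard approach): resize the short side
    # to 256, then take a square 224x224 crop from the middle.
    imagenet_center = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
    ])

    # 'FastAi rectangular' (illustrative): keep the original aspect
    # ratio and evaluate on the full rectangle instead of a square crop.
    fastai_rectangular = transforms.Resize(224)  # short side -> 224, no crop

    # 'Test Time Augmentation' (one common variant): predict on several
    # crops of the same image and average the results.
    def tta_predict(model, image):
        five_crop = transforms.Compose([
            transforms.Resize(256),
            transforms.FiveCrop(224),  # four corners plus the center
            transforms.Lambda(lambda crops: torch.stack(
                [transforms.ToTensor()(c) for c in crops])),
        ])
        batch = five_crop(image)             # shape: (5, 3, 224, 224)
        with torch.no_grad():
            return model(batch).mean(dim=0)  # average the five predictions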

The NVIDIA blog points us to this story on how Fast.ai just set a new deep learning benchmark milestone. Using NVIDIA V100 GPUs on AWS with PyTorch, the team trained ImageNet to 93% accuracy in just 18 minutes.

Fast.ai alumnus Andrew Shaw and Defense Innovation Unit Experimental (DIU) researcher Yaroslav Bulatov achieved the speed record using 128 NVIDIA Tesla V100 Tensor Core GPUs on the Amazon Web Services (AWS) cloud, with the fastai and cuDNN-accelerated PyTorch libraries. The run was roughly 40% faster than the previous record.
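
The article does not include the training code itself, but the general shape of multi-GPU data-parallel training in PyTorch looks like the sketch below. It uses stock DistributedDataParallel with a random stand-in dataset; the resnet50 model, batch size, and learning rate are placeholder assumptions, not the record-setting configuration.

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, TensorDataset, DistributedSampler
    from torchvision.models import resnet50

    def main():
        # One process per GPU; a launcher such as torchrun sets
        # RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = DDP(resnet50().cuda(local_rank), device_ids=[local_rank])
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
        criterion = torch.nn.CrossEntropyLoss()

        # Stand-in for ImageNet: random tensors, just to keep the sketch runnable.
        dataset = TensorDataset(torch.randn(64, 3, 224, 224),
                                torch.randint(0, 1000, (64,)))
        sampler = DistributedSampler(dataset)  # shards the data across processes
        loader = DataLoader(dataset, batch_size=32, sampler=sampler)

        for images, labels in loader:
            images, labels = images.cuda(local_rank), labels.cuda(local_rank)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()   # DDP averages gradients across all GPUs here
            optimizer.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Launched with one process per GPU (for example, torchrun --nproc_per_node=8 train.py on each machine), NCCL handles the gradient all-reduce across nodes.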

“DIU and fast.ai will be releasing software to allow anyone to easily train and monitor their own distributed models on AWS, using the best practices developed in this project,” said Jeremy Howard, a founding researcher at fast.ai. “We entered this competition because we wanted to show that you don’t have to have huge resources to be at the cutting edge of AI research, and we were quite successful in doing so.”

The researchers said they were encouraged by previous speed records achieved on publicly available machines by the AWS team.

“The set of tools developed by fast.ai focused on fast iteration with single-instance experiments, whilst the nexus-scheduler developed by DIU was focused on robustness and multi-machine experiments,” Howard stated.
