Object detection and image search are seen as the next big thing in the search market, and it’s easy to see why. Delivering relevant results when users mouse over objects within photos takes enormous amounts of computation, carried out by algorithms trained to classify, detect and match the images within images.
Visual search at scale is no easy matter — but it’s worth it.
A new white paper from NVIDIA documents how Microsoft’s Bing has used the power of NVIDIA GPUs to make visual search a reality. Bing also used the NVIDIA CUDA profiling toolchain and cuDNN to make the system more cost-effective and efficient.
Back in 2015, Bing introduced image-search capabilities that enable users to draw boxes around sub-images or click on boxes of sub-images already detected by the platform. Then, they can use those images as the basis of a new search.
In search of a solution fast enough to keep up with users and their expectations, Bing made the shift to GPUs. The search giant moved its object detection platform from CPUs to Azure NV-series virtual machines running NVIDIA Tesla M60 GPU accelerators.
With CPUs, it would take months to run updated models on the entire dataset of billions of images after every significant change. With GPUs, this process is now instantaneous.
The results were significant, and they came quickly. The switch cut inference latency by 10X. The company also incorporated the NVIDIA cuDNN GPU-accelerated deep learning library into its code and changed its driver mode from the Windows Display Driver Model (WDDM) to Tesla Compute Cluster (TCC). Together, these changes brought latency down to 40 milliseconds, for a total performance improvement of 60X.
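To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 60X total is measured against the original CPU latency and that the individual speedups multiply together; the article states neither explicitly, so the derived baseline and intermediate numbers are illustrative only and do not come from the report.

```python
# Figures quoted in the article.
final_latency_ms = 40.0      # latency after all optimizations
total_speedup = 60.0         # overall improvement
gpu_switch_speedup = 10.0    # improvement from moving CPUs -> GPUs

# Assumption: the 60X total is relative to the original CPU latency.
implied_cpu_latency_ms = final_latency_ms * total_speedup

# Assumption: speedups compose multiplicatively, so the GPU switch alone
# would leave latency at one tenth of the CPU baseline.
latency_after_gpu_switch_ms = implied_cpu_latency_ms / gpu_switch_speedup

# Whatever is left over would be attributable to cuDNN plus the driver-mode change.
remaining_speedup = total_speedup / gpu_switch_speedup

print(f"Implied CPU baseline:      {implied_cpu_latency_ms:.0f} ms")
print(f"After GPU switch (10X):    {latency_after_gpu_switch_ms:.0f} ms")
print(f"cuDNN + TCC contribution:  {remaining_speedup:.0f}X")
```

Under these assumptions, the original CPU pipeline would have taken roughly 2.4 seconds per inference, and the cuDNN and driver-mode changes together would account for about a further 6X on top of the GPU switch.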
According to the NVIDIA report, to detect more object categories in an image, Bing moved from a two-stage Fast R-CNN process to a one-stage “single shot detection” process, which enabled the system to detect over 80 image categories.
The report covers in detail the image search capabilities and the improvements Bing saw from adopting a range of NVIDIA technologies, including the NVIDIA Tesla M60 GPUs on Azure, NVIDIA Tesla K40s, and NVIDIA cuDNN.
On the development and deployment side, the report says that switching to NVIDIA GPUs has made the Bing team “more agile” and has ramped up innovation as well.
Ultimately, with the ability to process deeper and more complex models, Bing Visual Search can support more categories of detectable objects. And quick updates to back-end models mean more time to spend on the development front.
Visual search could have a huge impact on online retail, as well as the travel and education sectors. For example, with just one click on a beach or destination, a user could be sent directly to the page to book a vacation to that very spot. The possibilities are expansive.
Download the new white paper from NVIDIA that covers the possibilities of object detection and image search, and explores how Bing deployed NVIDIA technology to speed up object detection and deliver pertinent results in real time.