

Podcast: Intel to Ship Neural Network Processor by end of year

Naveen Rao is vice president and general manager of the Artificial Intelligence Products Group at Intel Corporation.

Intel’s Naveen Rao writes that Intel will soon be shipping the world’s first family of processors designed from the ground up for artificial intelligence. As announced today, the new chip will be the company’s first step towards its goal of achieving 100 times greater AI performance by 2020.

The new Intel® Nervana™ Neural Network Processor (NNP) family has been more than three years in the making. The Intel Nervana NNP is a purpose-built architecture for deep learning. The goal of this new architecture is to provide the flexibility needed to support all deep learning primitives while making core hardware components as efficient as possible.

“We designed the Intel Nervana NNP to free us from the limitations imposed by existing hardware, which wasn’t explicitly designed for AI:

  • New memory architecture designed for maximizing utilization of silicon computation. Matrix multiplication and convolutions are a couple of the important primitives at the heart of Deep Learning. These computations are different from general purpose workloads since the operations and data movements are largely known a priori. For this reason, the Intel Nervana NNP does not have a standard cache hierarchy and on-chip memory is managed by software directly. Better memory management enables the chip to achieve high levels of utilization of the massive amount of compute on each die. This translates to achieving faster training time for Deep Learning models.
  • Achieve new level of scalability for AI models. Designed with high speed on- and off-chip interconnects, the Intel Nervana NNP enables massive bi-directional data transfer. A stated design goal was to achieve true model parallelism where neural network parameters are distributed across multiple chips. This makes multiple chips act as one large virtual chip that can accommodate larger models, allowing customers to capture more insight from their data.
  • High degree of numerical parallelism: Flexpoint. Neural network computations on a single chip are largely constrained by power and memory bandwidth. To achieve higher degrees of throughput for neural network workloads, in addition to the above memory innovations, we have invented a new numeric format called Flexpoint. Flexpoint allows scalar computations to be implemented as fixed-point multiplications and additions while allowing for large dynamic range using a shared exponent. Since each circuit is smaller, this results in a vast increase in parallelism on a die while simultaneously decreasing power per computation.
  • Meaningful performance. The current AI revolution is actually a computational evolution. Intel has been at the heart of advancing the limits of computation since the invention of the integrated circuit. We have early partners in industry and research who are walking with us on this journey to make the first commercial neural network processor impactful for every industry. We have a product roadmap that puts us on track to exceed the goal we set last year to achieve a 100x increase in deep learning training performance by 2020.
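
To make the shared-exponent idea behind Flexpoint more concrete, here is a minimal Python sketch of generic block fixed-point arithmetic. It is not Intel's implementation: the article does not give Flexpoint's bit layout or how the exponent is chosen, so the 16-bit mantissa width, the max-based exponent selection, and the function names `encode_shared_exponent`, `decode_shared_exponent`, and `shared_exponent_dot` are all assumptions made for illustration.

```python
import math


def encode_shared_exponent(values, mantissa_bits=16):
    """Quantize floats to signed integer mantissas that share one exponent.

    Assumption: mantissa_bits=16 is an illustrative width, not a documented
    Flexpoint parameter.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    limit = 2 ** (mantissa_bits - 1) - 1
    # Pick the smallest power-of-two scale at which the largest value fits.
    exponent = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** exponent
    # Clamp to guard against floating-point rounding at the range edge.
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exponent


def decode_shared_exponent(mantissas, exponent):
    """Recover approximate floats from mantissas and the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]


def shared_exponent_dot(a_mantissas, a_exponent, b_mantissas, b_exponent):
    """Dot product using integer multiply-accumulate only.

    The two shared exponents are combined once, after the integer sum,
    instead of once per element as in floating point.
    """
    acc = sum(x * y for x, y in zip(a_mantissas, b_mantissas))
    return acc * 2.0 ** (a_exponent + b_exponent)


a = [0.5, -1.25, 3.0]
b = [2.0, 0.25, -0.75]
a_m, a_e = encode_shared_exponent(a)
b_m, b_e = encode_shared_exponent(b)
print("integer mantissas:", a_m, "shared exponent:", a_e)
print("fixed-point dot:", shared_exponent_dot(a_m, a_e, b_m, b_e))
print("float dot:", sum(x * y for x, y in zip(a, b)))
```

In this sketch the inner loop touches only integers, which is the property the quote points to: the multipliers and adders can be simple fixed-point circuits, while the single exponent per block preserves dynamic range.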

Intel Nervana Neural Network Processor

In designing the Intel Nervana NNP family, Intel is once again listening to the silicon for cues on how to make it best for our customers’ newest challenges. Additionally, we are thrilled to have Facebook in close collaboration sharing their technical insights as we bring this new generation of AI hardware to market. Our hope is to open up the possibilities for a new class of AI applications that are limited only by our imaginations.”
