Slidecast: Dell EMC Using Neural Networks to “Read Minds”

In this slidecast, Luke Wilson from Dell EMC describes a case study with McGill University using neural networks to read minds. “If you want to build a better neural network, there is no better model than the human brain. In this project, McGill University was running into bottlenecks using neural networks to reverse-map fMRI images. The team from the Dell EMC HPC & AI Innovation Lab was able to tune the code to run solely on Intel Xeon Scalable processors, rather than porting to the university’s scarce GPU accelerators.”

Deep Learning for Natural Language Processing – Choosing the Right GPU for the Job

In this new whitepaper from our friends over at Exxact Corporation we take a look at the important topic of deep learning for Natural Language Processing (NLP) and choosing the right GPU for the job. Focus is given to the latest developments in neural networks and deep learning systems, in particular a neural network architecture called transformers. Researchers have shown that transformer networks are particularly well suited for parallelization on GPU-based systems.

New Paper: Survey of FPGAs for Convolutional Neural Networks

Sparsh Mittal has just published a new paper on the use of FPGAs for Convolutional Neural Networks. “Deep convolutional neural networks (CNNs) have recently shown very high accuracy in a wide range of cognitive tasks and due to this, they have received significant interest from the researchers. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy-efficiency, computing capabilities and reconfigurability of FPGA make it a promising platform for hardware acceleration of CNNs.”

A Survey of FPGA-based Accelerators for Convolutional Neural Networks

Deep convolutional neural networks (CNNs) have recently shown very high accuracy in a wide range of cognitive tasks and due to this, they have received significant interest from the researchers. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy-efficiency, computing capabilities and reconfigurability of FPGA make it a promising platform for hardware acceleration of CNNs. In this paper, we present a survey of techniques for implementing and optimizing CNN algorithms on FPGA. We organize the works in several categories to bring out their similarities and differences. This paper is expected to be useful for researchers in the area of artificial intelligence, hardware architecture and system-design.

The Machine Learning Potential of a Combined Tech Approach

This is the first in a five-part series from a report exploring the potential of unified deep learning with CPU, GPU and FPGA technologies. This post explores the machine learning potential of taking a combined approach to these technologies.

MIT helps move Neural Nets back to Analog

MIT researchers have developed a special-purpose chip that increases the speed of neural-network computations by three to seven times over its predecessors, while reducing power consumption 94 to 95 percent. “The computation these algorithms do can be simplified to one specific operation, called the dot product. Our approach was, can we implement this dot-product functionality inside the memory so that you don’t need to transfer this data back and forth?”
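To make the quoted point concrete, here is a minimal sketch of how one layer of a neural network reduces to repeated dot products. The function names `dot` and `dense_layer` and the sample numbers are illustrative assumptions, not taken from the MIT work, and the sketch says nothing about how the chip itself computes in memory.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def dense_layer(weights, inputs):
    """Each output node is one dot product of a weight row with the inputs.

    Activation functions and biases are left out to keep the focus on the
    dot product itself.
    """
    return [dot(row, inputs) for row in weights]


# Hypothetical 2-node layer with 3 inputs (made-up values).
weights = [[1, 0, -1],
           [2, 1, 0]]
inputs = [3, 4, 5]
outputs = dense_layer(weights, inputs)
print(outputs)
```

Every call to `dot` reads a full row of weights, which is the data movement between memory and processor that the researchers describe wanting to avoid.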

TensorFlow Deep Learning Optimized for Modern Intel Architectures

Researchers at Google and Intel recently collaborated to extract the maximum performance from Intel® Xeon and Intel® Xeon Phi processors running TensorFlow*, a leading deep learning and machine learning framework. This effort resulted in significant performance gains and leads the way for ensuring similar gains from the next generation of products from Intel. Optimizing Deep Neural Network (DNN) workloads in frameworks such as TensorFlow presents challenges not unlike those encountered with more traditional High Performance Computing applications for science and industry.

Deep Learning Frameworks Get a Performance Benefit from Intel MKL Matrix-Matrix Multiplication

Intel® Math Kernel Library 2017 (Intel® MKL 2017) includes new GEMM kernels that are optimized for various skewed matrix sizes. The new kernels take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and achieve high GEMM performance on multicore and many-core Intel® architectures, particularly for situations arising from deep neural networks.
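As a rough illustration of what a "skewed" GEMM can look like, the sketch below multiplies a short, wide matrix (a small batch of inputs) by a tall weight matrix. The `gemm` function is a plain reference loop written for this example, not Intel MKL code, and the shapes and values are assumptions chosen only to show dimensions that are far from square.

```python
def gemm(a, b):
    """Naive C = A x B for matrices stored as lists of rows.

    A is m x k and B is k x n. This is a reference loop for illustration
    only; optimized libraries block and vectorize this computation.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B must match")
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


# Hypothetical skewed case: 2 samples with 6 features each (2 x 6),
# multiplied by a 6 x 3 weight matrix.
batch = [[1, 2, 3, 4, 5, 6],
         [6, 5, 4, 3, 2, 1]]
weights = [[1, 0, 1],
           [0, 1, 1],
           [1, 0, 1],
           [0, 1, 1],
           [1, 0, 1],
           [0, 1, 1]]
result = gemm(batch, weights)
print(result)
```

In deep learning the batch dimension is often much smaller than the feature dimensions, so the matrices end up very short and wide or tall and narrow, which is presumably the kind of shape the new kernels target.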

Video: The Coming Quantum Computing Revolution

In this video, D-Wave Systems Founder Eric Ladizinsky presents: The Coming Quantum Computing Revolution. “Despite the incredible power of today’s supercomputers, there are many complex computing problems that can’t be addressed by conventional systems. Our need to better understand everything, from the universe to our own DNA, leads us to seek new approaches to answer the most difficult questions. While we are only at the beginning of this journey, quantum computing has the potential to help solve some of the most complex technical, commercial, scientific, and national defense problems that organizations face.”

Steve Oberlin Presents: Accelerating Understanding – Machine Learning & Intelligent Applications

Steve Oberlin from Nvidia presented this talk at The Digital Future conference. “Oberlin will discuss machine learning and neural networks, explore a few advanced applications based on deep learning algorithms, discuss the foundation and architecture of representative algorithms, and illustrate the pivotal role GPU acceleration is playing in this exciting and rapidly expanding field.”