Inference Systems: The 2nd Piece of the Deep Learning Puzzle

We continue our five-part series on the steps to take before launching a machine learning startup. The complete report, available here, covers how to get started, choose a framework, decide which applications and machine learning technology to use, and more. This post explores inference systems and how they apply a trained network's capabilities to new data.

What are they?

Inference systems provide the second piece of the deep learning puzzle: they apply the capabilities a network learned during training to new data.

What do they do?

Deep learning breaks down into two parts: training and inference. Once a neural network has been trained on what to look for, the inference system makes predictions, or 'infers', from new input data to produce results. Netflix's recommendation engine is a prime example of the power of inference.
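
To make the two phases concrete, here is a minimal sketch in PyTorch (an illustrative choice; the report does not prescribe a framework). A toy network is first trained on random stand-in data, then switched into inference mode to make a prediction on new input:

```python
import torch
import torch.nn as nn

# --- Training phase: the network learns its weights from labeled data ---
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x_train = torch.randn(64, 10)         # toy inputs (stand-ins for real data)
y_train = torch.randint(0, 2, (64,))  # toy labels

model.train()
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()                   # gradients are computed only in training
    optimizer.step()

# --- Inference phase: the trained network 'infers' from new input data ---
model.eval()                          # switch layers to inference behavior
with torch.no_grad():                 # no gradients needed for predictions
    prediction = model(torch.randn(1, 10)).argmax(dim=1)
print(prediction)
```

Everything above the `model.eval()` line is training; everything below it is inference, and it is that second half that an inference system is built to run as fast as possible.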

Tech specs and further features

A great example of an inference system is NVIDIA's TensorRT™. This high-performance deep learning inference engine maximizes inference throughput and efficiency, and can take advantage of the fast reduced-precision instructions in Pascal GPUs. TensorRT v2 delivers up to 45x faster inference under 7 ms real-time latency with INT8 precision.
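
For a sense of the workflow, here is a rough sketch of handing a trained model to TensorRT using a recent version of its Python API; note that the TensorRT v2 release described above used an earlier, Caffe-based workflow, that exact calls vary across TensorRT versions, and that model.onnx is a hypothetical file name:

```python
import tensorrt as trt

# Build an optimized TensorRT engine from a (hypothetical) ONNX model file.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:    # hypothetical trained model
    if not parser.parse(f.read()):
        raise RuntimeError("failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # reduced precision; INT8 is also
                                       # available, with a calibration step

serialized_engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:  # engine file, ready for deployment
    f.write(serialized_engine)
```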

How will it help me?

Inference systems will optimize, validate and deploy your trained neural network, however demanding your throughput requirements might be. Multiple network topologies, such as AlexNet or CaffeNet, tend to be supported. In the case of TensorRT, developers can skip performance tuning for inference deployment and instead focus on developing novel AI-powered applications.
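
To make "throughput requirements" concrete, the sketch below (using the same illustrative PyTorch setup as earlier, with a hypothetical toy model) times repeated forward passes on the CPU to estimate average latency and throughput; on a GPU you would also synchronize the device before reading the clock:

```python
import time
import torch
import torch.nn as nn

# Hypothetical toy model standing in for a trained network.
model = nn.Sequential(nn.Linear(224, 512), nn.ReLU(), nn.Linear(512, 1000))
model.eval()

batch = torch.randn(8, 224)            # one batch of stand-in inputs

with torch.no_grad():
    for _ in range(10):                # warm up so one-time setup costs
        model(batch)                   # don't skew the measurement

    n_iters = 100
    start = time.perf_counter()
    for _ in range(n_iters):
        model(batch)
    elapsed = time.perf_counter() - start

latency_ms = elapsed / n_iters * 1000
throughput = n_iters * batch.shape[0] / elapsed
print(f"avg latency: {latency_ms:.2f} ms, "
      f"throughput: {throughput:.0f} samples/s")
```

Quoted latency figures, such as the 7 ms above, are typically per-batch measurements of this kind, taken on the target hardware.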

Deep learning hardware options 

We can’t look at the latest hardware for deep learning without giving a nod to the Dell EMC PowerEdge R730 and R740 servers.

Dell EMC PowerEdge R730

In just 2U of rack space, the PowerEdge R730 server packs a punch, thanks to a combination of powerful processors, large memory, fast storage options and GPU accelerator support. It’s scalable and configurable, enabling you to adapt to virtually any workload. Vital statistics include the Intel Xeon processor E5-2600 v4 product family and up to 24 DIMMs of DDR4 RAM.

Its highly scalable storage features up to 16 x 12Gb SAS drives, while the high-performance 12Gb PowerEdge RAID Controller (PERC9) is an ideal tool for your virtualized environment. Data access can be further boosted by optional SanDisk DAS Cache application acceleration technology.

The equally impressive new PowerEdge R740 offers an ideal balance of accelerator cards, storage and compute resources in a 2U, 2-socket platform. The R740 boasts up to 16 x 2.5″ or 8 x 3.5″ drives and iDRAC9, as well as up to three 300W accelerator cards or six 150W cards. It’s scalable, versatile and can simplify the entire IT lifecycle.

Dell EMC PowerEdge R740

Future articles in the insideHPC guide on launching a machine learning startup will cover additional topics in this series.

You can download the complete report, “insideHPC Special Report: Launch a Machine Learning Startup,” courtesy of Dell EMC and NVIDIA.