
NVIDIA TensorRT 6 Breaks 10 Millisecond Barrier for BERT-Large

Today, NVIDIA released TensorRT 6, which includes new capabilities that dramatically accelerate conversational AI applications, speech recognition, and 3D image segmentation for medical applications, as well as image-based applications in industrial automation. TensorRT is a high-performance deep learning inference optimizer and runtime that delivers low-latency, high-throughput inference for AI applications. “With today’s release, TensorRT continues to expand its set of optimized layers, adds highly requested capabilities for conversational AI applications, and delivers tighter integrations with frameworks to provide an easy path to deploy your applications on NVIDIA GPUs. In TensorRT 6, we’re also releasing new optimizations that deliver inference for BERT-Large in only 5.8 ms on T4 GPUs, making it practical for enterprises to deploy this model in production for the first time.”

Video: NVIDIA Rolls Out TensorRT Hyperscale Platform and New T4 GPU for AI Datacenters

This morning at GTC Japan, NVIDIA CEO Jensen Huang announced a set of new products centered around AI and accelerated computing. Targeting hyperscale datacenters looking to run AI workloads, NVIDIA continues to innovate machine learning technologies at an unprecedented pace. “There is no question that deep learning-powered AI is being deployed around the world, and we’re seeing incredible growth here,” Huang told an audience of more than 4,000 press, partners, academics and technologists gathered on the latest stop in a GTC world tour.

High Performance Inferencing with TensorRT

Chris Gottbrath from NVIDIA gave this talk at SC17 in Denver. “This talk will introduce the TensorRT Programmable Inference Accelerator, which enables high-throughput and low-latency inference on clusters with NVIDIA V100, P100, P4 or P40 GPUs. TensorRT is both an optimizer and a runtime – users provide a trained neural network and can easily create highly efficient inference engines that can be incorporated into larger applications and services.”
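The provide-a-trained-network, get-an-inference-engine workflow described above can be sketched with TensorRT's Python API. This is a minimal, hedged illustration, not NVIDIA's reference code: it assumes the `tensorrt` package is installed and that a trained model has been exported to ONNX (the file name `model.onnx` is illustrative). On a machine without TensorRT, the sketch simply reports that and exits.

```python
"""Hedged sketch: building a TensorRT inference engine from a trained network.

Assumptions (not from the article): the `tensorrt` Python package is
available, and the trained model was exported to ONNX as "model.onnx".
"""


def build_engine(onnx_path: str):
    """Parse an ONNX model and build an optimized TensorRT engine, or
    return None when TensorRT is not available on this machine."""
    try:
        import tensorrt as trt
    except ImportError:
        print("tensorrt is not installed; skipping engine build")
        return None

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Explicit-batch networks are required for ONNX parsing.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse ONNX model")

    # The builder config controls optimizations; FP16 is the kind of
    # reduced-precision mode used on T4-class GPUs.
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)

    # The resulting engine can be serialized, deployed, and wrapped in an
    # execution context inside a larger application or service.
    return builder.build_engine(network, config)


if __name__ == "__main__":
    engine = build_engine("model.onnx")
```

The key design point from the talk is visible in the sketch: optimization (layer fusion, precision selection) happens once at build time, so the runtime path that serves requests only executes the already-optimized engine.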