
Latest Updates to NVIDIA CUDA-X Libraries

Learn what’s new in the latest releases of NVIDIA’s CUDA-X Libraries and NGC.

Neural Modules

NVIDIA Neural Modules is a new open-source toolkit for researchers to build state-of-the-art neural networks for AI-accelerated speech applications. The early release of the toolkit includes:

  • Base modules for automatic speech recognition and natural language processing
  • GPU acceleration with mixed precision and multi-node distributed training
  • PyTorch support

Download Now

TensorRT 6

NVIDIA TensorRT is a platform for high-performance deep learning inference. This version of TensorRT includes:

  • BERT-Large inference in 5.8 ms on T4 GPUs
  • Dynamically shaped inputs to accelerate conversational AI, speech, and image segmentation apps (see the sketch after this list)
  • Dynamic input batch sizes to speed up online apps with fluctuating workloads
  • New layers accelerate 3D image segmentation in healthcare apps
  • Optimizations in 2D image segmentation for industrial defect inspection
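
To make the dynamic-shapes support concrete, here is a minimal Python sketch (not taken from the release notes) of building an engine with an optimization profile. The file name model.onnx, the input tensor name "input", and the shape ranges are placeholders for illustration.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Dynamic shapes require an explicit-batch network definition in TensorRT 6.
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)

# "model.onnx" and the input name "input" below are placeholders.
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB of scratch space for tactic selection

# An optimization profile declares the min/opt/max shapes one engine must cover.
profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 3, 224, 224), (8, 3, 224, 224), (32, 3, 224, 224))
config.add_optimization_profile(profile)

engine = builder.build_engine(network, config)
```

At inference time, the execution context's binding shape is set per request (for example with set_binding_shape), so a single engine can serve the whole declared range of batch sizes and resolutions.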

Get started with the new Jupyter notebooks.

Download Now

cuDNN 7.6

NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. This version of cuDNN includes:

  • Tensor Core-accelerated 3D convolutions for VNet and UNet-3D models (see the sketch after this list)
  • Tensor Core acceleration for multi-head attention forward training and inference
  • Auto-padding for TensorFlow NHWC layout for faster kernel launch times
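
cuDNN is a C library that frameworks call into, so an easy way to exercise the new Tensor Core 3D convolution path is through PyTorch. The sketch below is illustrative rather than an official sample: an FP16 Conv3d on a Volta or Turing GPU dispatches to cuDNN's Tensor Core kernels; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

# Let cuDNN benchmark algorithms so the fastest (Tensor Core) kernel is picked.
torch.backends.cudnn.benchmark = True

# A VNet/UNet-3D-style block; the channel counts here are arbitrary.
conv3d = nn.Conv3d(in_channels=16, out_channels=32,
                   kernel_size=3, padding=1).half().cuda()

# FP16 input in NCDHW layout: (batch, channels, depth, height, width).
x = torch.randn(2, 16, 64, 128, 128, dtype=torch.half, device="cuda")

y = conv3d(x)      # executes cuDNN's Tensor Core accelerated 3D convolution
print(y.shape)     # torch.Size([2, 32, 64, 128, 128])
```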

Download Now

NGC Updates

NGC provides containers, models, and scripts with the latest performance enhancements. This month’s updates include:

GPU-Optimized ASR and NLP Pipelines

  • BERT multi-node training scripts for TensorFlow and PyTorch, using SLURM container orchestration and DeepOps cluster management (see the sketch below)
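
The NGC scripts layer SLURM and DeepOps on top of standard distributed data parallel training. The snippet below is a generic PyTorch sketch of that underlying pattern, not the NGC BERT script itself: the environment variables are those a launcher such as SLURM or torch.distributed.launch conventionally exports, and a small Linear layer stands in for the real BERT model.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# The launcher (SLURM job script, torch.distributed.launch, etc.) exports these,
# along with MASTER_ADDR and MASTER_PORT for rendezvous.
rank = int(os.environ["RANK"])
local_rank = int(os.environ["LOCAL_RANK"])
world_size = int(os.environ["WORLD_SIZE"])

dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 2).cuda(local_rank)  # stand-in for a BERT module
model = DDP(model, device_ids=[local_rank])        # gradients all-reduce across nodes
```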

Deep Learning Framework Updates

  • Native automatic mixed precision (AMP) support in TensorFlow 2.0 and MXNet 1.5 (see the sketch below)
  • Additional support in PyTorch and MXNet for 3D convolutions, grouped convolutions, and depthwise separable convolutions
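
As a concrete example of the framework updates, the sketch below enables MXNet 1.5's automatic mixed precision with a single call and runs a small Gluon block built from a depthwise separable convolution. It is an illustrative snippet rather than an NGC script; the layer sizes are arbitrary.

```python
import mxnet as mx
from mxnet.contrib import amp
from mxnet.gluon import nn

# One call patches MXNet operators to run in FP16 where it is numerically safe.
amp.init()

# Depthwise separable convolution: a per-channel 3x3 conv followed by a 1x1 conv.
net = nn.HybridSequential()
net.add(nn.Conv2D(channels=32, kernel_size=3, padding=1, groups=32),  # depthwise
        nn.Conv2D(channels=64, kernel_size=1))                        # pointwise
net.initialize(ctx=mx.gpu(0))
net.hybridize()

x = mx.nd.random.uniform(shape=(8, 32, 56, 56), ctx=mx.gpu(0))
y = net(x)
print(y.shape)  # (8, 64, 56, 56)
```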

TensorRT Inference Server

NVIDIA TensorRT Inference Server is an open-source inference microservice that lets you serve deep learning models in production while maximizing GPU utilization. This version of TensorRT Inference Server includes:

  • Support for deploying native PyTorch models without extra conversion (see the client sketch after this list)
  • Support for deploying native ONNX models without extra conversion
  • A Model Control API for dynamically loading and unloading models
  • Support for hosting models in an AWS S3 model repository
  • A C++ library version of TensorRT Inference Server that bypasses the gRPC/HTTP interface
  • Shared-memory support for storing inputs and outputs locally to reduce memory overhead
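
As an illustration of serving a native PyTorch model, the sketch below uses the v1 tensorrtserver Python client to send one batched request. The server URL, the model name resnet50_libtorch, and the tensor names INPUT__0/OUTPUT__0 are assumptions for this example; substitute the names from your model's configuration.

```python
import numpy as np
from tensorrtserver.api import InferContext, ProtocolType

# URL, model name, and tensor names are placeholders for this sketch.
url = "localhost:8000"
model_name = "resnet50_libtorch"  # a TorchScript model already in the repository

ctx = InferContext(url, ProtocolType.HTTP, model_name)

# A batch of two NCHW float32 images.
batch = [np.random.rand(3, 224, 224).astype(np.float32) for _ in range(2)]
result = ctx.run({"INPUT__0": batch},
                 {"OUTPUT__0": InferContext.ResultFormat.RAW},
                 batch_size=2)

print(result["OUTPUT__0"][0].shape)  # raw output tensor for the first image
```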

Download Now

Refer to each package’s release notes for additional information.
