NVIDIA Releases Updates to CUDA-X AI Libraries

NVIDIA CUDA-X AI is a collection of deep learning libraries for researchers and software developers building high-performance GPU-accelerated applications for conversational AI, recommendation systems, and computer vision. CUDA-X AI libraries deliver world-leading performance for both training and inference across industry benchmarks such as MLPerf.

Learn what’s new in the latest releases of CUDA-X AI libraries and NGC.

Refer to each package’s release notes in the documentation for additional information.

cuDNN 8.1

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. For the full list of changes in this release, refer to the cuDNN release notes.

TensorRT 7.2

NVIDIA TensorRT is a platform for high-performance deep learning inference. This version of TensorRT includes:

  • New debugging tools – ONNX GraphSurgeon, Polygraphy, and the PyTorch Quantization Toolkit
  • Support for Python 3.8 

In addition, this version includes several bug fixes and documentation improvements.

Triton Inference Server 2.6

Triton is open-source inference serving software designed to maximize performance and simplify production deployment at scale. This version of Triton includes:

  • Alpha version of the Windows build, which supports gRPC and the TensorRT backend
  • Initial release of Model Analyzer, a tool that helps users select the model configuration that maximizes Triton performance
  • Support for Ubuntu 20.04 – Triton now supports the latest version of Ubuntu, which comes with additional security updates
  • Native support in DeepStream – Triton on DeepStream can run inference on video analytics workflows at the edge or in the cloud with Kubernetes
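As a minimal illustration of how a client talks to Triton, the sketch below builds a request body for Triton's v2 HTTP/REST inference protocol (POST /v2/models/&lt;model&gt;/infer). The input name "INPUT0" and the shape are placeholders for illustration; actually sending the request requires a running Triton server with a matching model.

```python
import json

def build_infer_request(input_name, data, datatype="FP32"):
    """Build a JSON body for Triton's v2 HTTP inference endpoint.

    The body would be POSTed to /v2/models/<model_name>/infer on a
    running Triton server (HTTP port 8000 by default).
    """
    return {
        "inputs": [
            {
                "name": input_name,       # must match the model's input name
                "shape": [1, len(data)],  # batch of one
                "datatype": datatype,
                "data": data,
            }
        ]
    }

# "INPUT0" is a hypothetical input name used only for this sketch.
body = build_infer_request("INPUT0", [0.1, 0.2, 0.3])
print(json.dumps(body))
```

The same protocol is what the official tritonclient Python packages wrap; building the body by hand just makes the wire format visible.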

NGC Container Registry

NGC is the hub for GPU-optimized AI/ML/HPC application containers, models, and SDKs; it simplifies software development and deployment so users can achieve faster time to solution. This month’s updates include:

  • NGC catalog in the AWS Marketplace – users can now pull the software directly from the AWS portal
  • Containers for latest versions of NVIDIA AI software including Triton Inference Server, TensorRT, and deep learning frameworks such as PyTorch

DALI 0.30

The NVIDIA Data Loading Library (DALI) is a portable, open-source GPU-accelerated library for decoding and augmenting images and videos to accelerate deep learning applications. For the full list of changes in this release, refer to the DALI release notes.

nvJPEG2000 0.1

nvJPEG2000 is a new library for GPU-accelerated JPEG2000 image decoding. This version of nvJPEG2000 includes:

  • Support for Linux and Windows operating systems
  • Up to 4x faster lossless decoding with the 5/3 wavelet transform and up to 7x faster lossy decoding with the 9/7 wavelet transform
  • Support for bitstreams with multiple tiles
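The 5/3 and 9/7 wavelets mentioned above are JPEG2000's two transform filters: the reversible 5/3 filter used for lossless coding and the irreversible 9/7 filter used for lossy coding. To sketch why the 5/3 transform is lossless, here is one level of a 1-D 5/3 lifting transform in plain Python (not nvJPEG2000 code, and with simplified edge extension): because it uses only integer arithmetic, the inverse reconstructs the input exactly.

```python
def fwd_53(x):
    """One level of a reversible 5/3 lifting transform (1-D, even length).

    Returns (lowpass, highpass). Integer arithmetic only, so inv_53
    below reconstructs the input exactly -- the property that makes
    the 5/3 wavelet suitable for lossless coding.
    """
    assert len(x) % 2 == 0 and len(x) >= 2
    even, odd = x[0::2], x[1::2]
    edge = lambda e, i: e[min(i, len(e) - 1)]  # simple edge extension
    # Predict step: high-pass = odd sample minus neighbour prediction
    d = [odd[i] - (even[i] + edge(even, i + 1)) // 2 for i in range(len(odd))]
    # Update step: low-pass = even sample plus high-pass correction
    s = [even[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(len(even))]
    return s, d

def inv_53(s, d):
    """Exact inverse of fwd_53: undo the update, then the predict step."""
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(len(s))]
    edge = lambda e, i: e[min(i, len(e) - 1)]
    odd = [d[i] + (even[i] + edge(even, i + 1)) // 2 for i in range(len(d))]
    out = []
    for a, b in zip(even, odd):
        out.extend([a, b])
    return out

x = [10, 12, 14, 11, 9, 8, 20, 21]
s, d = fwd_53(x)
print(inv_53(s, d) == x)  # perfect reconstruction
```

The 9/7 filter, by contrast, uses floating-point lifting coefficients, so its transform is not exactly invertible after quantization, which is why it appears only on the lossy path.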

About Brad Nemire

Brad Nemire leads the Developer Communications team at NVIDIA, focused on evangelizing amazing GPU-accelerated applications. Prior to joining NVIDIA, he worked on the Developer Relations team at Arm. Brad graduated from San Diego State University.