Latest Updates to NVIDIA CUDA-X AI Libraries

Learn what’s new in the latest releases of NVIDIA’s CUDA-X AI libraries and NGC. For more information on NVIDIA’s developer tools, join live webinars, training, and Connect with the Experts sessions, available now through GTC Digital.

NVIDIA Collective Communications Library 2.6

NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs. Highlights in this version include:

  • Up to 2x peak bandwidth with in-network AllReduce operations using SHARP v2
  • InfiniBand adaptive routing, which reroutes traffic to relieve congested ports
  • Topology support for AMD, Arm, PCI Gen4, and InfiniBand HDR
  • Improved topology detection and automatic speed detection for PCI links and NICs
Download Now
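To make the AllReduce bullet concrete: an AllReduce leaves every rank holding the element-wise sum of all ranks' buffers. The pure-Python sketch below shows only the classic ring communication pattern, not NCCL's actual API; NCCL runs these exchanges between GPUs, and with SHARP v2 the reduction step is offloaded into the network switches instead.

```python
# Illustrative ring AllReduce (sum): each of n "ranks" starts with its own
# vector; afterwards every rank holds the element-wise sum of all vectors.
def ring_allreduce(buffers):
    n = len(buffers)                 # number of ranks
    chunk = len(buffers[0]) // n     # assume length divisible by n

    # Phase 1: reduce-scatter. After n-1 steps, rank r holds the fully
    # reduced values for chunk (r + 1) % n.
    for step in range(n - 1):
        for r in range(n):
            src_chunk = (r - step) % n
            dst = (r + 1) % n
            lo = src_chunk * chunk
            for i in range(lo, lo + chunk):
                buffers[dst][i] += buffers[r][i]

    # Phase 2: all-gather. Each rank circulates its reduced chunk around the
    # ring so every rank ends up with the complete summed vector.
    for step in range(n - 1):
        for r in range(n):
            src_chunk = (r + 1 - step) % n
            dst = (r + 1) % n
            lo = src_chunk * chunk
            for i in range(lo, lo + chunk):
                buffers[dst][i] = buffers[r][i]
    return buffers

ranks = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400], [0, 0, 0, 0]]
ring_allreduce(ranks)
# every rank now holds [111, 222, 333, 444]
```

Because each rank only ever exchanges data with its ring neighbor, the algorithm's bandwidth cost stays constant as the number of ranks grows, which is why ring-based collectives scale well across nodes.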

NVIDIA Triton Inference Server 20.03

NVIDIA Triton Inference Server, formerly TensorRT Inference Server, is open-source inference serving software for deploying deep learning models in production with maximum GPU utilization. This version includes:

  • Per-request prioritization and request timeouts/drops via queuing policies in the dynamic batching scheduler
  • Experimental Python client and server support for the community-standard GRPC inferencing API
  • Support for large ONNX models that store weights across separate files
  • Support for ONNX Runtime optimization levels via model configuration settings
  • Support for running Triton on older, unsupported GPUs via the --min-supported-compute-capability flag
Download Now
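The queuing-policy and ONNX Runtime settings above are both driven by the model's config.pbtxt. The sketch below is a hedged illustration only: the field names follow Triton's model-configuration schema, but the priority levels, timeout, and optimization level are invented example values, so consult the Triton model-configuration documentation for the authoritative schema.

```
# config.pbtxt -- illustrative example values only
dynamic_batching {
  priority_levels: 2                        # enable per-request prioritization
  default_priority_level: 2
  default_queue_policy {
    timeout_action: REJECT                  # drop requests that queue too long
    default_timeout_microseconds: 100000
    allow_timeout_override: true
  }
}
optimization {
  graph { level: 1 }                        # ONNX Runtime optimization level
}
```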

Deep Learning Profiler 0.10

Deep Learning Profiler (DLProf) is a profiling tool for visualizing GPU utilization, which operations can use Tensor Cores, and Tensor Core usage during execution. This experimental version includes:

  • Integration with TensorBoard to visualize results
  • Expert System Recommendations
  • Support for profiling with user defined NVTX markers
Try Deep Learning Profiler in the NGC TensorFlow container

Deep Learning Frameworks and Models in NGC 20.03

NVIDIA provides ready-to-run containers with GPU-accelerated frameworks that include the required CUDA and CUDA-X libraries. NGC also hosts optimized models, performance benchmarks, and the training scripts used to achieve them.

For details on features, bug releases and version compatibility, refer to release notes in documentation for containers.

NGC Repository

DALI 0.20

NVIDIA Data Loading Library (DALI) is a portable, open-source library for GPU-accelerated decoding and augmentation of images and video in deep learning applications. This version includes:

  • Optimizations for common speech processing and augmentation operators, including spectrogram, mel filter bank, and MFCC, which can accelerate ASR models such as Jasper and RNN-T
Download Now
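For readers unfamiliar with these operators, the spectrogram-to-mel-filter-bank stage computes standard audio features. The NumPy sketch below illustrates the math only; DALI exposes these computations as GPU-accelerated pipeline operators with their own Python API, and the sample rate, FFT size, hop, and filter count here are typical ASR values chosen for illustration.

```python
import numpy as np

def log_mel_spectrogram(signal, sample_rate=16000, n_fft=512, hop=160, n_mels=64):
    # Power spectrogram via short-time FFT with a Hann window.
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (frames, n_fft//2 + 1)

    # Triangular mel filter bank: mel-spaced center frequencies mapped to FFT bins.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sample_rate / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, left:center] = (np.arange(left, center) - left) / max(center - left, 1)
        fbank[m - 1, center:right] = (right - np.arange(center, right)) / max(right - center, 1)

    # Apply the filter bank and take the log (small epsilon avoids log(0)).
    return np.log(power @ fbank.T + 1e-10)             # (frames, n_mels)

# One second of a 440 Hz tone sampled at 16 kHz:
t = np.arange(16000) / 16000.0
feats = log_mel_spectrogram(np.sin(2 * np.pi * 440 * t))
print(feats.shape)   # (97, 64)
```

An MFCC stage would simply apply a discrete cosine transform to these log-mel features; in an ASR pipeline such as Jasper's, this per-utterance feature extraction is exactly the work DALI moves onto the GPU.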

Refer to each package’s release notes in documentation for additional information.

(Originally published on March 31, 2020)