What’s New for AI Containers on NVIDIA GPU Cloud

NGC is a single source for researchers seeking access to deep learning frameworks, HPC applications, and visualization tools essential for their scientific workflows.

This article summarizes the most important improvements to our AI Framework Containers from the last three releases: 19.01 (published today), 18.12, and 18.11.

TensorFlow Highlights

PyTorch Highlights

MXNet Highlights

  • Latest versions: MXNet 1.4.0.rc0, NCCL 2.3.7, cuDNN 7.4.2, DALI 0.6 Beta, TensorRT 5.0.2, Horovod 0.15.1, Amazon Labs Sockeye 1.18.61, ONNX exporter 0.1
  • Tensor Core Examples
    • An implementation of ResNet50 is included in the container examples directory. The ResNet50 v1.5 model is a modified version of the original ResNet50 v1 model.
  • Performance improvements
    • Added the MXNET_EXEC_ENABLE_ADDTO environment variable, which, when set to 1, increases performance for some networks.
    • Increased performance of the BatchNorm and BatchNorm+ReLU operators with FP16 and the NHWC data format.
    • Increased performance when training with small batch sizes.
    • Improved speed of metrics computation during training, especially when using the TopKAccuracy metric.
    • Added fused BatchNormAddRelu operator to the MXNet Symbol package (accessible via mx.sym.BatchNormAddRelu).
  • Added Horovod support for multi-GPU and multi-node training
    • Added support for multi-node training via Horovod integration. To use it, specify horovod as the KVStore type.
    • Added the MXNET_UPDATE_ON_KVSTORE environment variable, which controls whether parameters are updated using KVStore (default is 1 for KVStore device and 0 for KVStore horovod).
    • Added aggregation of SGD updates, which increases performance when update on KVStore is disabled.
  • Updated examples
    • Improved handling of the float32 data type in examples/image-classification/train_imagenet_runner.
    • Added resnet-v1b as a possible network in the train_imagenet_runner script.
  • Profiling
    • Enabled NVIDIA Tools Extension SDK (NVTX) instrumentation.
  • More details: 19.01 release notes, 18.12 release notes, 18.11 release notes
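
The two environment variables listed above can be set from a launcher script before training starts. The Python sketch below is illustrative only: the helper name kvstore_env and its parameters are hypothetical, and the values simply mirror the defaults stated in the list (1 for KVStore device, 0 for KVStore horovod). It does not import MXNet; setting the variables before MXNet is imported is assumed here as the conservative choice.

```python
import os

# Defaults for MXNET_UPDATE_ON_KVSTORE as described in the release highlights.
UPDATE_ON_KVSTORE_DEFAULTS = {"device": "1", "horovod": "0"}


def kvstore_env(kvstore_type, enable_addto=True):
    """Build environment settings for a KVStore type (hypothetical helper)."""
    if kvstore_type not in UPDATE_ON_KVSTORE_DEFAULTS:
        # Only the two types named in the highlights are covered by this sketch.
        raise ValueError(f"no documented default for KVStore type: {kvstore_type}")
    env = {"MXNET_UPDATE_ON_KVSTORE": UPDATE_ON_KVSTORE_DEFAULTS[kvstore_type]}
    if enable_addto:
        # Setting this to 1 is reported to help some networks, not all of them.
        env["MXNET_EXEC_ENABLE_ADDTO"] = "1"
    return env


if __name__ == "__main__":
    settings = kvstore_env("horovod")
    os.environ.update(settings)  # apply before importing mxnet
    for name in sorted(settings):
        print(f"{name}={settings[name]}")
```

The KVStore type itself still has to be selected in the training script; this sketch only prepares the process environment.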

NGC features the latest AI frameworks, tuned, tested, and certified by NVIDIA for use on cloud providers with the latest NVIDIA GPUs, so you can accelerate your applications. The updated and optimized AI containers are available today!
