What’s New for AI Containers on NVIDIA GPU Cloud

NVIDIA GPU Cloud (NGC) is a single source for researchers seeking access to deep learning frameworks, HPC applications, and visualization tools essential for their scientific workflows.

This article summarizes the most important improvements to our AI Framework Containers from the last three releases: 19.01 (published today), 18.12, and 18.11.

TensorFlow Highlights

    • Added OpenSeq2Seq’s custom pre-built CTC decoder for the DeepSpeech2, wav2letter, and Jasper models.
    • NCCL: The tensorflow.contrib.nccl module has been moved into core as tensorflow.python.ops.nccl_ops. User scripts may need to be updated accordingly. No changes are required for Horovod users. For an example of using Horovod, refer to the nvidia-examples/cnn/ directory inside the container.
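For scripts that must run on both older and newer containers, one option is to resolve the NCCL ops module at import time. The following is a minimal sketch, not an official migration tool: the helper name import_first_available and the constant NCCL_MODULE_CANDIDATES are hypothetical, and only the two module paths come from the note above.

```python
import importlib

# Module paths taken from the release note, newest location first.
# Which one exists depends on the TensorFlow build inside the container.
NCCL_MODULE_CANDIDATES = (
    "tensorflow.python.ops.nccl_ops",
    "tensorflow.contrib.nccl",
)


def import_first_available(candidates):
    """Return the first importable module from `candidates`, or None.

    Hypothetical helper: it only tries each dotted path in order and
    skips the ones that raise ImportError.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None
```

A script could then call import_first_available(NCCL_MODULE_CANDIDATES) once and use the returned module wherever it previously used tensorflow.contrib.nccl, handling the None case if neither path is present.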

PyTorch Highlights

    • Performance improvement for PyTorch native batch normalization.
    • Mixed precision SoftMax enabling FP16 inputs, FP32 computations and FP32 outputs.
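To make the mixed precision SoftMax contract concrete, here is a small standard-library emulation of "FP16 in, FP32 compute, FP32 out". It is only an illustration of the numeric behavior, not the PyTorch kernel: the function names are hypothetical, and FP32 arithmetic is approximated by rounding each intermediate result to single precision with the struct module.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def softmax_fp16_in_fp32_out(logits):
    """Emulated softmax: inputs stored as FP16, math and outputs in FP32."""
    x = [to_fp16(v) for v in logits]        # FP16 inputs
    m = max(x)                              # subtract the max for stability
    exps = [to_fp32(math.exp(to_fp32(v - m))) for v in x]
    total = 0.0
    for e in exps:                          # accumulate the sum in FP32
        total = to_fp32(total + e)
    return [to_fp32(e / total) for e in exps]   # FP32 outputs


probs = softmax_fp16_in_fp32_out([1.0, 2.0, 3.0])
print(probs)
```

Keeping the exponentials and their sum in FP32 avoids the range and precision limits of FP16 for the intermediate values, while the inputs still occupy half the memory.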

MXNet Highlights

    • Tensor Core Examples
        • An implementation of ResNet50 v1.5, a modified version of the original ResNet50 v1 model, is included in the container examples directory.
    • Performance improvements
        • Added MXNET_EXEC_ENABLE_ADDTO environment variable, which when set to 1 increases performance for some networks.
        • Increased performance of Batchnorm and Batchnorm+Relu operators in FP16 and NHWC data format.
        • Increased performance when training with small batch sizes.
        • Improved speed of metrics computation during training, especially when using the TopKAccuracy metric.
        • Added fused BatchNormAddRelu operator to the MXNet Symbol package (accessible via mx.sym.BatchNormAddRelu).
    • Added Horovod support for multi-GPU and multi-node
        • Added multi-node support via Horovod integration. Currently, you can enable it by specifying the horovod KVStore type.
        • Added MXNET_UPDATE_ON_KVSTORE environment variable, which controls whether to update parameters using KVStore (default is 1 for KVStore device and 0 for KVStore horovod).
        • Added aggregation of SGD updates, which increases performance when update on KVStore is disabled.
    • Updated examples
        • Improved handling of the float32 data type in examples/image-classification/train_imagenet_runner.
        • Added resnet-v1b as a possible network in the train_imagenet_runner script.
    • Profiling
        • Enabled NVIDIA Tools Extension SDK (NVTX) instrumentation.
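The two MXNet environment variables listed above can be prepared from Python before a training process starts. The sketch below is an assumption-laden illustration: the helper names and parameters are hypothetical, and only the variable names, the value 1 for MXNET_EXEC_ENABLE_ADDTO, and the stated defaults for the device and horovod KVStore types come from the notes above.

```python
import os

# Defaults for MXNET_UPDATE_ON_KVSTORE as stated in the release notes.
# Other KVStore types are not covered there, so they are left out here.
DOCUMENTED_DEFAULTS = {"device": "1", "horovod": "0"}


def build_mxnet_env(enable_addto=False, update_on_kvstore=None, base_env=None):
    """Return a copy of the environment with the MXNet variables applied.

    Hypothetical helper. `update_on_kvstore=None` leaves the variable
    unset so that the documented default for the KVStore type applies.
    """
    env = dict(os.environ if base_env is None else base_env)
    if enable_addto:
        env["MXNET_EXEC_ENABLE_ADDTO"] = "1"
    if update_on_kvstore is not None:
        env["MXNET_UPDATE_ON_KVSTORE"] = "1" if update_on_kvstore else "0"
    return env


def effective_update_on_kvstore(env, kvstore_type):
    """Report whether parameters would be updated on the KVStore.

    Raises KeyError for KVStore types whose default is not documented above.
    """
    default = DOCUMENTED_DEFAULTS[kvstore_type]
    return env.get("MXNET_UPDATE_ON_KVSTORE", default) == "1"


env = build_mxnet_env(enable_addto=True, base_env={})
print(env)
print(effective_update_on_kvstore(env, "horovod"))
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching a training script. Whether MXNET_EXEC_ENABLE_ADDTO helps depends on the network, so it is worth measuring with and without it.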

NGC features the latest AI frameworks, tuned, tested, and certified by NVIDIA for use on cloud providers with the latest NVIDIA GPUs, so you can accelerate your applications. The updated and optimized AI containers are available today!
