White Paper: NVIDIA DGX-1 with Tesla V100

A new technical white paper provides an in-depth look at the hardware and software technologies inside the NVIDIA DGX-1, billed as the fastest platform for deep learning training.

The demand for deep learning performance is rapidly growing. Facebook CTO Mike Schroepfer noted that:

  • Facebook’s deployed neural networks process more than 6 million predictions per second;
  • 25% of Facebook engineers are now using AI and machine learning APIs and infrastructure;
  • Facebook has deployed more than 40 PFLOP/s of GPU capability in-house to support deep learning across its organization.

As neural networks get deeper and more complex, they become dramatically more accurate. However, training these higher-accuracy networks takes much more computation time, and their complexity increases prediction latency.

To meet this growing demand for performance, NVIDIA created the DGX-1. The system can be deployed quickly for plug-and-play use by deep learning researchers and data scientists.

The architecture of DGX-1 draws on NVIDIA’s experience in the field of high-performance computing as well as knowledge gained from optimizing deep learning frameworks on NVIDIA GPUs with every major cloud service provider and multiple Fortune 1000 companies.

Powerful Tesla Volta GPUs and the high-performance NVLink interconnect are just part of the DGX-1 story. NVIDIA DGX-1 with Tesla V100 GPUs trains convolutional neural networks up to 3.1x faster than DGX-1 with previous-generation Tesla P100 GPUs (see the figure below). The NVLink GPU interconnect improves recurrent neural network training performance by up to 1.5x compared to the slower PCIe interconnect.

Figure: Best throughput achievable on each platform. DGX-1 (P100) uses FP32; DGX-1 (V100) uses mixed precision (FP16 and FP32). Measured with deep learning framework containers version 17.11.
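
The caption does not spell out how FP16 and FP32 are combined, so the following is only a minimal sketch of one general idea behind mixed precision: keeping inputs in FP16 while accumulating in FP32. It emulates both formats in software with Python's standard `struct` module, and the helper names `round_to` and `accumulate` are hypothetical. It does not reflect how DGX-1 or any framework actually implements mixed precision.

```python
import struct

# struct format codes: "e" is IEEE 754 half precision, "f" is single precision.
FP16, FP32 = "<e", "<f"

def round_to(x, fmt):
    """Round a Python float to the nearest value representable in `fmt`."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def accumulate(values, acc_fmt):
    """Sum FP16 inputs, storing the running total in `acc_fmt` after each add."""
    total = 0.0
    for v in values:
        total = round_to(total + round_to(v, FP16), acc_fmt)
    return total

values = [1.0] * 4096

# Pure FP16: above 2048 the spacing between FP16 values is 2, so adding 1
# rounds back to 2048 and the sum stops growing.
fp16_total = accumulate(values, FP16)

# Mixed precision (assumed scheme): FP16 inputs, FP32 accumulator.
mixed_total = accumulate(values, FP32)

print(fp16_total, mixed_total)  # 2048.0 4096.0
```

The FP16-only sum stalls once the running total exceeds what half precision can resolve, while the FP32 accumulator returns the exact result. This is one reason mixed-precision schemes generally pair low-precision inputs with higher-precision accumulation.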

Further productivity and performance benefits come from the fact that DGX-1 is an integrated system: data scientists and AI researchers can deploy deep learning frameworks and applications on it with minimal setup effort.

The NVIDIA DGX-1 accelerates widely used deep learning frameworks and software such as NVCaffe, Caffe2, Microsoft Cognitive Toolkit, MXNet, PyTorch, TensorFlow, Theano, Torch, and TensorRT. This DGX-1 deep learning software stack, combined with Tesla V100 and NVLink, enables NVIDIA DGX-1 to outperform similar off-the-shelf systems.
