Scaling Keras Model Training to Multiple GPUs

Keras is a powerful deep learning meta-framework that sits on top of existing frameworks such as TensorFlow and Theano. Keras is highly productive for developers; defining a model often takes 50% less code than the native APIs of the underlying frameworks require. This productivity has made it very popular as a university and MOOC teaching tool, and as a rapid prototyping platform for applied researchers and developers.

Unfortunately, Keras is quite slow for single-GPU training and inference, regardless of the backend. It is also hard to make it use multiple GPUs without breaking its framework-independent abstraction.

Can this be improved, leveraging Keras’s high-level API, while still achieving good single-GPU performance and multi-GPU scaling? It turns out that the answer is yes, thanks to the MXNet backend for Keras, and MXNet’s efficient data pipeline. Last week, the MXNet community introduced a release candidate for MXNet v0.11.0 with support for Keras v1.2.

Figure: ResNet-50 training throughput (images per second), comparing Keras with the MXNet backend (green bars) to a native MXNet implementation (blue bars).

In a new NVIDIA Developer Blog post, Marek Kolodziej shows how to use Keras with the MXNet backend to achieve high performance and excellent multi-GPU scaling. As a motivating example, he shows how to build a fast and scalable ResNet-50 model in Keras.

Read more >