Introducing NVIDIA Isaac Gym: End-to-End Reinforcement Learning for Robotics

For several years, NVIDIA’s research teams have been working on using GPU technology to accelerate reinforcement learning (RL). Building on that research, NVIDIA is pleased to announce a preview release of Isaac Gym – NVIDIA’s physics simulation environment for reinforcement learning research. RL-based training becomes more accessible because tasks that once required thousands of CPU cores can instead be trained on a single GPU.

A cube manipulation task trained by Isaac Gym on a single A100 and rendered in Omniverse

RL has become one of the most promising research areas in machine learning and has demonstrated great potential for solving complex problems. RL-based systems have achieved superhuman performance in very challenging tasks, ranging from classic strategy games such as Go and chess to real-time computer games like StarCraft and Dota 2.

RL-based approaches also hold promise for robotics applications, such as solving a Rubik’s Cube or learning locomotion by imitating animals.

Isaac Gym and NVIDIA GPUs, a reinforcement learning supercomputer 

Until now, most RL robotics researchers had to rely on clusters of CPU cores for the physically accurate simulations needed to train RL algorithms. In one of the better-known projects, the OpenAI team used almost 30,000 CPU cores (920 computers with 32 cores each) to train their robot on the Rubik’s Cube task.

In a similar task, Learning Dexterous In-Hand Manipulation, OpenAI used a cluster of 384 systems with 6,144 CPU cores plus 8 Volta V100 GPUs, and required close to 30 hours of training to achieve its best results. This in-hand cube orientation task is a challenging dexterous manipulation problem, with complex physics and dynamics, many contacts, and a high-dimensional continuous control space.

Isaac Gym includes an example of this cube manipulation task so that researchers can recreate the OpenAI experiment. The example supports training both recurrent and feed-forward neural networks, as well as domain randomization of physics properties, which helps with sim-to-real transfer. With Isaac Gym, researchers can reach the same level of success as OpenAI’s supercomputer on a single A100 GPU in about 10 hours!
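To put the figures quoted above side by side, here is a small arithmetic sketch. It uses only the numbers stated in the text; the variable names are illustrative, and the wall-clock ratio compares the quoted training times only, not hardware cost or energy.

```python
# Figures quoted in the text above.
rubiks_machines = 920
rubiks_cores_per_machine = 32
rubiks_total_cores = rubiks_machines * rubiks_cores_per_machine

inhand_machines = 384
inhand_total_cores = 6144
inhand_cores_per_machine = inhand_total_cores // inhand_machines

cluster_hours = 30      # "close to 30 hours" on the CPU cluster plus 8 V100s
single_gpu_hours = 10   # "about 10 hours" on a single A100
wall_clock_ratio = cluster_hours / single_gpu_hours

print(rubiks_total_cores)        # 29440, i.e. "almost 30,000"
print(inhand_cores_per_machine)  # 16
print(wall_clock_ratio)          # 3.0
```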

End-to-End GPU RL

Isaac Gym achieves these results by leveraging NVIDIA’s PhysX GPU-accelerated simulation engine, which allows it to rapidly gather the large volume of experience data that robotics RL requires.

In addition to fast physics simulations, Isaac Gym also enables observation and reward calculations to take place on the GPU, thereby avoiding significant performance bottlenecks. In particular, costly data transfers between the GPU and the CPU are eliminated.

Implemented this way, Isaac Gym enables a complete end-to-end GPU RL pipeline.

Isaac Gym

Isaac Gym provides a basic API for creating and populating a scene with robots and objects, with support for loading data from the URDF and MJCF file formats. Each environment can be duplicated as many times as needed, and the copies are simulated simultaneously without interacting with one another.

Isaac Gym provides a PyTorch tensor-based API for accessing the results of the physics simulation. RL observation and reward calculations can then be written against these tensors and built with the PyTorch JIT runtime system, which dynamically compiles the Python code for these calculations into CUDA code that runs on the GPU.
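As a rough illustration of the JIT-compiled reward idea, the sketch below scripts a toy reward function with `torch.jit.script`. The function name, its arguments, the reward formula, and the tensor shapes are all assumptions made for this example; they are not taken from Isaac Gym's bundled tasks.

```python
import torch

@torch.jit.script
def compute_reward(object_pos: torch.Tensor,
                   goal_pos: torch.Tensor,
                   actions: torch.Tensor,
                   action_penalty_scale: float) -> torch.Tensor:
    # Distance from each environment's object to its goal: shape (num_envs,).
    dist = torch.norm(object_pos - goal_pos, p=2, dim=-1)
    # Penalize large actions (hypothetical shaping term).
    action_penalty = torch.sum(actions * actions, dim=-1)
    return -dist - action_penalty_scale * action_penalty

# Falls back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

num_envs = 4  # tiny batch for illustration only
object_pos = torch.zeros(num_envs, 3, device=device)
goal_pos = torch.ones(num_envs, 3, device=device)
actions = torch.zeros(num_envs, 20, device=device)

# One reward per environment, computed on the same device as the inputs.
reward = compute_reward(object_pos, goal_pos, actions, 0.01)
```

Because every input and output is a tensor with a leading per-environment dimension, the same function works unchanged whether the batch holds four environments or many thousands.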

Observation tensors can be used as inputs to a policy inference network, and the resulting action tensors can be fed directly back into the physics system. Rollouts of observation, reward, and action buffers can stay on the GPU for the entire learning process, eliminating the need to copy data back to the CPU.
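The loop described above can be sketched in plain PyTorch. Everything here is a stand-in: `toy_step` is a hypothetical placeholder for the simulator, the linear policy is untrained, and the sizes are arbitrary. The point is only that the buffers are allocated once on the device and no step moves data to the host.

```python
import torch

# Falls back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_envs, obs_dim, act_dim, horizon = 1024, 8, 2, 16

# Untrained stand-in for a policy inference network.
policy = torch.nn.Linear(obs_dim, act_dim).to(device)

def toy_step(state, actions):
    # Hypothetical dynamics standing in for the physics engine:
    # nudge the first act_dim state entries by the action.
    new_state = state.clone()
    new_state[:, :act_dim] += 0.01 * actions
    reward = -new_state.norm(dim=-1)
    return new_state, reward

# Rollout buffers are allocated once, directly on the device.
obs_buf = torch.zeros(horizon, num_envs, obs_dim, device=device)
act_buf = torch.zeros(horizon, num_envs, act_dim, device=device)
rew_buf = torch.zeros(horizon, num_envs, device=device)

state = torch.randn(num_envs, obs_dim, device=device)
with torch.no_grad():
    for t in range(horizon):
        obs_buf[t] = state
        actions = torch.tanh(policy(state))
        act_buf[t] = actions
        state, reward = toy_step(state, actions)
        rew_buf[t] = reward
```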

This setup permits tens of thousands of simultaneous environments on a single GPU, allowing researchers to run experiments locally on their desktops that previously required an entire data center.

Isaac Gym also includes a basic Proximal Policy Optimization (PPO) implementation and a straightforward RL task system, but users may substitute alternative task systems or RL algorithms as desired. Although the included examples use PyTorch, users should also be able to integrate with TensorFlow-based RL systems with some further customization.
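For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate loss. It is not the implementation shipped with Isaac Gym; the function name, argument names, and the default clip range of 0.2 are assumptions chosen for illustration.

```python
import torch

def ppo_clipped_loss(new_log_prob, old_log_prob, advantages, clip_eps=0.2):
    # Probability ratio between the updated policy and the policy
    # that collected the data.
    ratio = torch.exp(new_log_prob - old_log_prob)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate it so the
    # result can be minimized with a standard optimizer.
    return -torch.min(unclipped, clipped).mean()

# Two toy samples: ratios of 1.5 and 0.5 with advantages +1 and -1.
old_log_prob = torch.zeros(2)
new_log_prob = torch.log(torch.tensor([1.5, 0.5]))
advantages = torch.tensor([1.0, -1.0])
loss = ppo_clipped_loss(new_log_prob, old_log_prob, advantages)
```

In the toy call, the first ratio is clipped from 1.5 down to 1.2 and the second is clipped from 0.5 up to 0.8, which shows how the objective limits how far a single update can move the policy.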

Some additional features of Isaac Gym include:

  • Support for a variety of environment sensors – position, velocity, force, torque, etc.
  • Runtime domain randomization of physics parameters
  • Jacobian / inverse kinematics support
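To make the domain randomization item more concrete, here is a minimal sketch that draws a different set of physics parameters for each parallel environment. The parameter names and ranges are hypothetical, and applying the sampled values to a simulator is left out because that step depends on the simulator's API.

```python
import torch

def sample_physics_params(num_envs, device):
    # Hypothetical ranges; each environment gets its own draw.
    friction = torch.empty(num_envs, device=device).uniform_(0.7, 1.3)
    mass_scale = torch.empty(num_envs, device=device).uniform_(0.5, 1.5)
    return {"friction": friction, "mass_scale": mass_scale}

# Falls back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
params = sample_physics_params(num_envs=4096, device=device)
```

Resampling values like these during training exposes the policy to a range of plausible physics, which is what helps it transfer from simulation to a real robot.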

Research Results    

NVIDIA’s research team has been applying Isaac Gym to a wide variety of projects. You can take a sneak peek at some of these below, and stay tuned for more details on these projects.

Get Started Today

Are you a researcher or academic interested in RL for robotics applications? Please download and try Isaac Gym.

Future Plans

The core functionality of Isaac Gym will be made available as part of the NVIDIA Omniverse Platform and NVIDIA’s Isaac Sim, a robotics simulation platform built on Omniverse. Until then, we are making this standalone preview release available to researchers and academics to show the possibilities of end-to-end GPU-based RL and to help accelerate your work in this area.