AI Learns to Play Dota 2 with Human Precision

Developers at the California-based non-profit OpenAI announced today that "OpenAI Five," a team of five deep-learning neural networks, beat amateur human teams at the popular battle-arena game Dota 2.

“OpenAI Five plays 180 years worth of games against itself every day, learning via self-play. It trains using a scaled-up version of Proximal Policy Optimization running on 256 GPUs and 128,000 CPU cores,” the team wrote in a blog post. “Using a separate LSTM for each hero [the characters the AI or players control] and no human data, it learns recognizable strategies. This indicates that reinforcement learning can yield long-term planning with large but achievable scale — without fundamental advances, contrary to our own expectations upon starting the project.”

Using 256 NVIDIA Tesla P100 GPUs on Google Cloud, the team trained their neural networks entirely through self-play rather than on recorded human games. The system plays the equivalent of 180 years of gameplay against itself every day, or 900 years per day when each of the five AI players is counted separately.
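The self-play idea described above can be illustrated with a toy sketch. This is not OpenAI's code: OpenAI Five uses PPO and one LSTM per hero at massive scale, while the example below just pits two copies of the same tiny policy against each other in a made-up symmetric game and nudges the shared parameters with a REINFORCE-style policy-gradient step. The game, payoffs, and hyperparameters are all invented for illustration.

```python
import math
import random

# Toy self-play sketch (illustrative only, not OpenAI Five's system).
# Two copies of one policy play a made-up symmetric game; the shared
# parameters are updated so that whichever copy wins reinforces its
# own action. The agent's only opponent is itself.

ACTIONS = ["attack", "defend", "farm"]
# Invented payoffs for (my_action, opponent_action); ties pay zero.
PAYOFF = {
    ("attack", "defend"): -1, ("defend", "attack"): 1,
    ("attack", "farm"): 1,    ("farm", "attack"): -1,
    ("defend", "farm"): -1,   ("farm", "defend"): 1,
}

def softmax(logits):
    """Turn raw scores into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs, rng):
    """Draw an action index according to the probabilities."""
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def self_play_train(episodes=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]  # shared policy parameters, start random/zero
    for _ in range(episodes):
        probs = softmax(logits)
        a = sample(probs, rng)  # copy 1 of the policy
        b = sample(probs, rng)  # copy 2 of the same policy
        reward = PAYOFF.get((ACTIONS[a], ACTIONS[b]), 0)
        # REINFORCE-style update: copy 1 gets +reward, copy 2 gets -reward.
        for i in range(3):
            grad_a = (1.0 if i == a else 0.0) - probs[i]
            grad_b = (1.0 if i == b else 0.0) - probs[i]
            logits[i] += lr * (reward * grad_a - reward * grad_b)
    return softmax(logits)

final_policy = self_play_train()
```

Because the payoff structure here is cyclic (like rock-paper-scissors), self-play pushes the policy toward a mixed strategy rather than a single dominant action, which is the same dynamic that lets a real self-play system keep discovering counters to its own behavior.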

Dota 2 has been in development for over a decade and continues to receive updates every two weeks, meaning the game environment changes frequently.

The developers say their goal is to beat a team of top professional human gamers in August at The International, an annual Dota 2 eSports tournament.

“We may not succeed: Dota 2 is one of the most popular and complex esports games in the world, with creative and motivated professionals who train year-round to earn part of Dota’s annual $40M prize pool (the largest of any esports game),” the team explained.

The work builds on the 1v1 bot the team developed and released last year. Both systems learn entirely from self-play: they start with random parameters and do not use search or bootstrap from human replays, the team said.
