Games have become a widely adopted way of benchmarking advances in artificial intelligence. Since Deep Blue beat Garry Kasparov at chess in 1997, AI has made many remarkable advances. Machines have topped the best humans at most games held up as measures of human intellect, including chess, Scrabble, Othello, and even Jeopardy!
A few years ago we witnessed Google’s AlphaGo defeating a world champion at Go, a 2,500-year-old game that is exponentially more complex than chess and requires intelligence and strategy rather than simply trying every possible combination of moves. It was an impressive demonstration of how far artificial intelligence had come, especially in the past few years. Still, it remained questionable whether this really counts as intelligence and whether games are the right way to measure it.
Time for AI to become the best Dota 2 team
It’s 2018 now, and we are witnessing another human champion defeated by an AI agent, this time in Dota 2. A few months ago, OpenAI, a non-profit, San Francisco-based AI research company backed by Elon Musk, Reid Hoffman, and Peter Thiel, announced an official match between its Dota 2-playing AI agent and a team of former Dota 2 pros. Not chess, not Othello or Go, but one of the world’s most popular online strategy games: Valve’s Dota 2. The announcement immediately caught the attention of both the AI community and the Dota player community.
But this was not the first game for OpenAI’s Dota player. In its first generations it was restricted to 1-vs.-1 matches (which are supposed to be less complex), and it was tested many times before moving on to 5-vs.-5. In June this year, it managed to beat five teams of amateur players in the 5-vs.-5 setting, including one made up of Valve employees. The challenge to beat a team of pros was issued right after this success.
To make the game manageable for the AI, a few low-impact constraints were introduced, such as a narrower pool of heroes to choose from and invincible item-delivery couriers. These constraints are said to have little effect on the measure of OpenAI Five’s success.
On August 5th, OpenAI Five (as the AI agent is called) defeated the team of former pros in a best-of-three series, far more easily than expected! It won the first game without losing a single tower to the opponent, which is truly remarkable. Moreover, its strategic decisions showed intelligence and good team play.
How did they do it?
OpenAI Five is actually a set of five single-layer, 1,024-unit long short-term memory (LSTM) recurrent neural networks, one assigned to each hero (in the 5-vs.-5 setting). The fact that it is not a super-complex multi-module system but just a set of simple recurrent neural networks makes it even more impressive.
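To give a feel for how simple a single LSTM step is, here is a minimal sketch of a standard LSTM cell in plain Python. It is only an illustration of the textbook equations, not OpenAI’s code: the function names (`lstm_step`, `init_lstm`, `run_demo`), the random initialization, and the tiny sizes in the demo are all assumptions made for readability. A real 1,024-unit network would use an optimized tensor library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell (hypothetical helper).

    W has shape (4*hidden, input), U has shape (4*hidden, hidden) and
    b has length 4*hidden. The four blocks are, in order: input gate,
    forget gate, output gate, candidate cell values.
    """
    n = len(h_prev)
    z = [wx + uh + bias
         for wx, uh, bias in zip(matvec(W, x), matvec(U, h_prev), b)]
    i = [sigmoid(v) for v in z[0:n]]            # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]    # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]  # candidate values
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_lstm(input_size, hidden_size, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    scale = 1.0 / math.sqrt(hidden_size)

    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]

    return (mat(4 * hidden_size, input_size),
            mat(4 * hidden_size, hidden_size),
            [0.0] * (4 * hidden_size))


def run_demo(input_size=8, hidden_size=4, steps=5, seed=0):
    """Feed a few random 'observations' through the cell, carrying state."""
    rng = random.Random(seed)
    W, U, b = init_lstm(input_size, hidden_size, rng)
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(input_size)]
        h, c = lstm_step(x, h, c, W, U, b)
    return h, c
```

The hidden state `h` and cell state `c` are what let the network remember earlier observations: they are passed from one step to the next, so each decision can depend on what happened before.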
“People used to think that this kind of thing was impossible using today’s deep learning,” said Greg Brockman, co-founder and CTO of OpenAI. A simple setup of five networks that do not even communicate with each other has defeated a team of human masters. Even though the heroes (in fact, the networks) don’t communicate, there is still noticeable team chemistry. The teamwork is controlled by a hyperparameter that can be set from 0 to 1 and weights how much each hero cares about its own individual reward compared to the average reward of the whole team. In this way, OpenAI Five is able to come up with its own strategies.
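One plausible way to read the teamwork hyperparameter is as a linear blend between a hero’s own reward and the team average. The sketch below assumes exactly that; the function name `blend_rewards`, the parameter name `team_spirit`, and the linear form itself are illustrative assumptions, not OpenAI’s published code.

```python
def blend_rewards(individual_rewards, team_spirit):
    """Blend each hero's own reward with the team's average reward.

    team_spirit = 0.0 -> every hero cares only about its own reward.
    team_spirit = 1.0 -> every hero receives the team average.
    The linear blend is an assumption made for illustration.
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_mean = sum(individual_rewards) / len(individual_rewards)
    return [(1.0 - team_spirit) * r + team_spirit * team_mean
            for r in individual_rewards]


# One hero earns a reward of 1.0, the other four earn nothing.
rewards = [1.0, 0.0, 0.0, 0.0, 0.0]
selfish = blend_rewards(rewards, 0.0)  # only the first hero is rewarded
shared = blend_rewards(rewards, 1.0)   # everyone gets the team average
```

With a value in between, the hero that earned the reward still gets the most, but its teammates are rewarded too, which is what nudges the networks toward cooperating without ever talking to each other.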
Every four frames, each network receives an input describing the state of the game. Each input consists of 20,000 mostly floating-point numbers that encode vital information such as the location and health of visible units, giving the agent access to the same knowledge a human team can have. On average, its reaction time is faster than a human’s, which gives OpenAI’s agents a slight advantage.
The networks were trained with OpenAI’s Proximal Policy Optimization and self-play. They “play” 180 years’ worth of games every day: 80 percent against their current selves and 20 percent against past versions of themselves. To make all of this possible, OpenAI used 256 Nvidia Tesla P100 graphics cards and 128,000 processor cores on Google Cloud Platform.
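The 80/20 split between current and past opponents can be pictured as a simple random choice made before each game. The sketch below is an assumption about how such sampling could look, not OpenAI’s training code: `pick_opponent`, `past_fraction`, and the idea of picking uniformly among stored snapshots are all hypothetical.

```python
import random


def pick_opponent(current_policy, past_policies, rng, past_fraction=0.2):
    """Choose the opponent for one self-play game.

    With probability past_fraction, play against a randomly chosen past
    snapshot; otherwise play against the current policy. Uniform choice
    among snapshots is an assumption made for illustration.
    """
    if past_policies and rng.random() < past_fraction:
        return rng.choice(past_policies)
    return current_policy


# Policies are represented by plain labels here; in real training they
# would be sets of network weights.
rng = random.Random(0)
snapshots = ["v1", "v2", "v3"]
opponents = [pick_opponent("current", snapshots, rng) for _ in range(10000)]
past_share = sum(1 for o in opponents if o != "current") / len(opponents)
```

Mixing in past versions is a common way to keep a self-play agent from forgetting how to beat strategies it has already seen.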
OpenAI announced that it is working on improving OpenAI Five further and that it will compete at The International, the biggest international Dota 2 tournament, where teams can win millions of dollars. OpenAI Five will have a chance to become the champion of the famous Dota 2 and to show once again how far artificial intelligence has advanced.