Mastering the game of Go with deep neural networks and tree search
Published in: Nature (London), 2016-01, Vol. 529 (7587), pp. 484-489
Main Authors: David Silver et al.
Format: Article
Language: English
Summary: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
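The search described in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: `policy_network` and `value_network` below are hypothetical uniform/random stand-ins for the trained networks, the game is an abstract placeholder, and the selection rule is a generic PUCT-style formula of the kind used in such searches. It only shows how policy priors guide selection and how value-network evaluations replace random rollouts.

```python
import math
import random

# Hypothetical stand-ins for the trained networks. The real policy network
# maps board features to move probabilities; the real value network predicts
# the outcome of the game from a position.
def policy_network(state, moves):
    # Placeholder: uniform prior over legal moves.
    return {m: 1.0 / len(moves) for m in moves}

def value_network(state):
    # Placeholder: random evaluation in [-1, 1].
    return random.uniform(-1.0, 1.0)

class Node:
    def __init__(self, prior):
        self.prior = prior      # P(s, a), supplied by the policy network
        self.visits = 0         # N(s, a)
        self.value_sum = 0.0    # W(s, a)
        self.children = {}      # move -> Node

    def q(self):
        # Mean action value Q(s, a).
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.0):
    # PUCT-style rule: exploit Q while exploring in proportion to the
    # policy prior, decayed by visit count.
    total = sum(ch.visits for ch in node.children.values())
    def score(item):
        _, ch = item
        u = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + u
    return max(node.children.items(), key=score)

def mcts(root_state, legal_moves, apply_move, n_simulations=100):
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        node, state, path = root, root_state, [root]
        # Selection: descend the tree until reaching a leaf.
        while node.children:
            move, node = select_child(node)
            state = apply_move(state, move)
            path.append(node)
        # Expansion: attach children with policy-network priors.
        moves = legal_moves(state)
        if moves:
            priors = policy_network(state, moves)
            for m in moves:
                node.children[m] = Node(prior=priors[m])
        # Evaluation: value network instead of a full random rollout.
        v = value_network(state)
        # Backup: propagate the evaluation along the visited path.
        for n in path:
            n.visits += 1
            n.value_sum += v
    # Play the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

For example, on a toy "game" whose state is an integer that any of three moves increments, `mcts(0, lambda s: [0, 1, 2] if s < 3 else [], lambda s, m: s + 1)` returns one of the three root moves after allocating visits by the rule above.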
A computer Go program based on deep neural networks defeats a human professional player to achieve one of the grand challenges of artificial intelligence.
AlphaGo computer beats Go champion
The victory in 1997 of the chess-playing computer Deep Blue in a six-game series against the then world champion Garry Kasparov was seen as a significant milestone in the development of artificial intelligence. An even greater challenge remained: the ancient game of Go. Despite decades of refinement, until recently the strongest computers were still playing Go at the level of human amateurs. Enter AlphaGo. Developed by Google DeepMind, this program uses deep neural networks to mimic expert players, and further improves its performance by learning from games played against itself. AlphaGo achieved a 99.8% win rate against the strongest other Go programs, and defeated the reigning European champion Fan Hui 5-0 in a tournament match. This is the first time that a computer program has defeated a human professional player on a full-sized 19 × 19 board in even games with no handicap.
ISSN: 0028-0836, 1476-4687
DOI: 10.1038/nature16961