AlphaGo: Program that plays the board game Go

Introduction

In this blog, we discuss AlphaGo, a computer program that plays the board game Go. It was created by DeepMind Technologies, a subsidiary of Google (now Alphabet Inc.). Over time, AlphaGo evolved into progressively stronger versions, including one that competed under the name Master. After AlphaGo Master stopped competing, it was succeeded by the even stronger AlphaGo Zero, which was entirely self-taught, with no learning from human games. AlphaGo Zero was later generalized into a program called AlphaZero, which could play chess and shogi in addition to Go. AlphaZero was in turn succeeded by MuZero, a program that learns without being explicitly taught the rules.

History

David Silver of DeepMind states that the AlphaGo research project began in 2014 to test how well a neural network could compete at Go. AlphaGo represents a substantial improvement over earlier Go programs. Running on a single computer, it played 500 games against rival Go programs such as Crazy Stone and Zen, winning all but one. In similar tests, AlphaGo running on many computers won 77% of its games against the single-machine version and all 500 games against other Go programs. As of October 2015, the distributed version used 1,202 CPUs and 176 GPUs.

 

In October 2015, the distributed version of AlphaGo defeated Fan Hui, the European Go champion and a 2-dan professional (out of a possible 9 dan), five games to zero. This was the first time a computer program had defeated a professional human Go player on a full-sized board without a handicap.

 

Related coverage: "Artificial intelligence: Google's AlphaGo beats Go master Lee Se-dol" (BBC News).

Algorithm

As of 2016, AlphaGo's system combines machine learning and tree search techniques with extensive training, from both human and computer play. It uses a Monte Carlo tree search guided by a "value network" and a "policy network", both implemented as deep neural networks. Before the input is fed to the neural networks, a small amount of game-specific feature-detection pre-processing is applied to it (for instance, to indicate whether a move matches a nakade pattern). The networks themselves are 12-layer convolutional neural networks trained with reinforcement learning.
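To make the search step concrete, below is a minimal Python sketch of the PUCT-style selection rule that this kind of policy-guided Monte Carlo tree search relies on: each candidate move's score combines its mean value estimate with an exploration bonus weighted by the policy network's prior. The `Node` class and names such as `prior`, `visit_count`, and `c_puct` are illustrative assumptions, not AlphaGo's actual implementation.

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s, a): move probability from the policy network
        self.visit_count = 0      # N(s, a): how often this move was explored
        self.value_sum = 0.0      # sum of value estimates backed up through this move
        self.children = {}        # move -> Node

    def q_value(self):
        # Mean action value Q(s, a); treat unvisited moves as neutral (0).
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def select_child(node, c_puct=1.0):
    """Return the (move, child) pair maximizing Q(s, a) + U(s, a).

    U(s, a) is an exploration bonus proportional to the policy prior that
    shrinks as the child accumulates visits, so the search starts where the
    policy network looks and gradually trusts its own value estimates.
    """
    total_visits = sum(child.visit_count for child in node.children.values())

    def puct_score(child):
        u = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visit_count)
        return child.q_value() + u

    return max(node.children.items(), key=lambda item: puct_score(item[1]))
```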

 

The system's neural networks were initially bootstrapped from human gameplay. AlphaGo was first trained to imitate human play using a database of around 30 million moves drawn from recorded games between skilled players. Once it had reached a certain level of proficiency, it was trained further by playing large numbers of games against other instances of itself, using reinforcement learning to improve its performance.
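The two training stages can be sketched as follows. This is a hedged illustration in PyTorch, not AlphaGo's actual code: `policy_net`, `optimizer`, and the tensors `positions`, `expert_moves`, `states`, `moves`, and `outcome` are all assumed placeholders. Stage one is plain supervised imitation of expert moves; stage two is a REINFORCE-style policy-gradient update driven by the self-play game result.

```python
import torch
import torch.nn.functional as F

def supervised_step(policy_net, optimizer, positions, expert_moves):
    """Stage 1: imitate human play by maximizing the log-probability
    of the expert's recorded move in each position."""
    logits = policy_net(positions)                 # (batch, num_moves)
    loss = F.cross_entropy(logits, expert_moves)   # imitation loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def reinforce_step(policy_net, optimizer, states, moves, outcome):
    """Stage 2: policy-gradient update from one self-play game.
    `outcome` is +1 if this network's side won and -1 if it lost, so
    moves from winning games become more likely, losing ones less."""
    logits = policy_net(states)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, moves.unsqueeze(1)).squeeze(1)
    loss = -(outcome * chosen).mean()              # REINFORCE loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```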


AlphaGo Zero

AlphaGo Zero is a version trained without any human game data, yet it is stronger than every earlier version, including those that defeated human champions. Training purely by playing against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days, winning 100 games to 0; it matched AlphaGo Master in 21 days and exceeded all previous versions in 40 days. DeepMind later generalized AlphaGo Zero's methodology into a single AlphaZero algorithm, which it reported reached a superhuman level of play in chess, shogi, and Go within 24 hours, defeating the world-champion programs Stockfish and Elmo as well as the 3-day version of AlphaGo Zero.
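The heart of this human-data-free approach is the self-play loop: the network guides a Monte Carlo tree search, the search's visit counts become the training target for the policy head, and the final game result becomes the target for the value head. Below is a minimal sketch under those assumptions; the `game`, `run_mcts`, and `replay_buffer` interfaces are hypothetical placeholders, not DeepMind's code.

```python
import random

def self_play_game(game, run_mcts, replay_buffer):
    """Play one game of self-play and store training examples.

    Assumed interfaces (hypothetical, for illustration only):
      game          -- exposes is_over(), state(), play(move), result(),
                       where result() is +1/-1 from the first player's view
      run_mcts(g)   -- returns {move: visit_count} for the current position
      replay_buffer -- a list-like store of (state, pi, z) training examples
    """
    history = []
    while not game.is_over():
        # The search acts as a policy improver: its visit counts are a
        # sharper move distribution than the raw network policy.
        counts = run_mcts(game)
        total = sum(counts.values())
        pi = {move: n / total for move, n in counts.items()}
        history.append((game.state(), pi))
        # Sample the next move in proportion to visit counts.
        moves, weights = zip(*pi.items())
        game.play(random.choices(moves, weights=weights)[0])
    z = game.result()
    for state, pi in history:
        # pi trains the policy head, z trains the value head.
        replay_buffer.append((state, pi, z))
        z = -z  # players alternate, so flip the outcome's sign each ply
```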

