In 1992, Gerald Tesauro at IBM’s Thomas J. Watson Research Center developed TD-Gammon, a groundbreaking computer backgammon program that combined an artificial neural network with temporal-difference learning. Trained entirely through self-play, TD-Gammon reached world-class strength and introduced novel strategies that human experts later adopted. By 1993, version 2.1 had played 1.5 million training games and was nearly on par with the top human players. Its success demonstrated the potential of reinforcement learning combined with neural networks, influencing later systems such as AlphaGo.
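The temporal-difference idea behind TD-Gammon can be shown in miniature. The sketch below is not TD-Gammon (which trained a neural network on backgammon positions); it is a minimal tabular TD(0) learner on the classic five-state random-walk task, where the true state values are 1/6 through 5/6. The function name and parameters are illustrative, not from any TD-Gammon source.

```python
import random

# TD(0) on a toy random walk: states 0..6, with 0 and 6 terminal.
# Reward is 1 only on reaching state 6. True values of states 1..5
# are 1/6, 2/6, ..., 5/6. The update rule is:
#   V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
def td0_random_walk(episodes=5000, alpha=0.1, gamma=1.0, seed=0):
    rng = random.Random(seed)
    V = [0.0] * 7  # value estimates; terminal states stay at 0
    for _ in range(episodes):
        s = 3  # every episode starts in the middle state
        while s not in (0, 6):
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == 6 else 0.0
            # Move V(s) toward the one-step bootstrapped target
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V

values = td0_random_walk()
print([round(v, 2) for v in values[1:6]])  # approaches 1/6 .. 5/6
```

TD-Gammon applied the same bootstrapped update, but used a neural network instead of a table so it could generalize across backgammon's enormous state space.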

Why it matters: TD-Gammon’s achievement marked a significant milestone in AI, showing that complex games could be mastered through self-play and reinforcement learning. It not only improved backgammon strategy but also paved the way for advancements in machine learning and game AI.

TD-Gammon’s legacy is evident in the continued use of reinforcement learning in modern AI systems.