The Technology Behind Backgammon

The Technology Behind Backgammon
The die is cast as artificial intelligence takes online backgammon to a new level

Backgammon is the oldest known board game, with a history spanning thousands of years.

It is ever-popular in the Middle East and Asia, but in the rest of the world it swings in and out of fashion.

In the last century it enjoyed something of a revival in the 1920s, when the doubling cube was introduced to increase the skill element and enhance the game’s gambling potential, but the decade it had the most worldwide impact was the 1970s.

Big-money tournaments were set up, a world championship was established, backgammon clubs sprouted up everywhere (there were even backgammon restaurants) and a whole publishing industry was founded to satisfy the demand for books and magazines about the game.

Although interest waned in the 1980s, it is on the rise again and there’s talk of backgammon following in the footsteps of poker as the next big legal online gambling craze.

Celebrity players include Nicole Kidman and Tobey Maguire, and many of the top poker stars are fans of the game, including Gus Hansen, the Poker World Tour champion.

Such is the popularity of the online game that on the leading backgammon servers, where you can play for fun or real money, there are now monthly tournaments with guaranteed prizes of $100,000.

Chips with everything

Much of the renewed interest in the game stems from its connection with computing technology: not only because backgammon is being packaged as a hot new form of online gaming or because a PC is a tireless and always-available opponent, but because backgammon has become one of the success stories of research into artificial intelligence (AI).

Self-taught backgammon-playing computer programs are so good that they have overturned many of the assumptions previously held about how the game should be played and which are the best moves, especially in the opening phases of the game.

Any player who trains by regularly playing against a backgammon program, using moves suggested by the program itself, can learn to play at levels that were previously attained by only a few top players.

Unlike the top chess-playing computers and computer programs, which tend to use brute-force tree searching and rely on massive amounts of computing power in a hybrid hardware/software platform, any PC can run a world-class backgammon program.

Hydra, which is the world’s most powerful chess-playing computer and has never been beaten, is a cluster computer based on 32 x 3.06GHz Xeon processors, each paired with its own FPGA (Field Programmable Gate Array) device and capable of evaluating 200 million moves per second.

It is able to look 18 moves ahead, which is six more than IBM’s Big Blue. In contrast, the top backgammon programs, which are based on neural net techniques, will run happily on a bog-standard Pentium under Windows 95.

Learning by doing

Backgammon-playing programs are entirely self taught using the technique of reinforcement learning, which is one of the most promising avenues of research into artificial intelligence.

In essence, it’s no different from the way we train dogs, dishing out a reward when the animal does what we want it to and withholding the reward when it doesn’t.

In AI, the trainee or agent, is a neural network, and it is the aim of the agent to maximise the total amount of reward it receives. An agent won’t respond at all to a biscuit or a pat on the head, so the reward is a numerical one based on the agent’s most recent action.

What the agent must try to do is maximise the cumulative reward it receives for succeeding in its goal and not get hung up on the immediate reward for making one good decision.

The goal in any game is, of course, to win and it’s easy to score the outcome by awarding +1 points for a win, -1 points for losing and 0 points for a draw or uncompleted game.

An agent needs to understand the environment in which it operates and must know when it has achieved its goal, but in the case of backgammon this is incredibly simple.

A backgammon board is basically a one-dimensional race track split into 24 segments, with opponents racing in opposite directions and all checkers moving identically. A draw is impossible and the goal has been achieved when the agent gets all its checkers round the track before its opponent.

Timing is the key

The mode of reinforcement learning that has been so successful in teaching computers to play backgammon is called temporal difference (TD) learning, which is based on the differences between temporally successive predictions.

Each move by each notional player in the game (the computer plays both sides) is regarded as a time step, and there is a heuristic reward signal sent to the agent after each step and at the end of each game.

The agent learns to predict the best move by adjusting the prediction at each time step to make it more closely match the prediction at the next time step. It is the difference between successive predictions which is the only measure of error, and the program is never explicitly instructed as to what is the best move.

Gerald Tesauro, an IBM researcher, is responsible for pioneering TD techniques with backgammon. His program, TD-Gammon, was developed after abandoning experiments with a supervised learning program called Neurogammon, in which the good and bad moves were hard-coded.

Neurogammon never reached an expert level of play, whereas TD-Gammon went on improving for 1,500,000 games and became a world-class player. Readers with long memories may recall that a version of TD-Gammon was included in the 1996 Family Funpak for OS2/Warp.

The next commercially available neural net program came in 1998, in the form of Fredrik Dahl’s Jellyfish, and this was soon followed by Olivier Egger’s Snowie. The current version of Snowie is regarded as the state of the art in terms of its playing skills and analysis tools, and it is priced accordingly.

However, there is a free alternative in the form of GNU Backgammon. This was the brainchild of Gary Wong, who by 1999 had drawn on the work of Tesauro and others to produce a neural net backgammon player called Costello.

He donated his code to the GNU Project, and GNU Backgammon (as it became known) is still under development. It plays an extremely strong game and has not stopped learning.

A version of it plays on the First Internet Backgammon Server (FIBS), where it ranks in the top 20 of over 6,000 players.

All three programs provide a way of doing what no human player is easily able to do, which is to compute the equity of games and matches while they are in progress.

Human players can estimate their positions within a game by counting the points remaining for each player and applying their experience and gut instinct, whereas a neural net program can compute the equity of a game to several decimal places.

The simplest type of equity is game equity (sometimes called settlement equity), which is a value that represents what a game would be worth to you if you and your opponent decided to stop playing and settle with each other according to the current state of play.

If your equity is reported as 0.62 and the stake is $1, your opponent should pay you .62 to end the game.

Backgammon websites use neural net programs to compute the equity of games (and settle them) when connection or hardware problems make it impossible to play the games out.

When playing against a neural net backgammon program in tutorial mode, you are warned if you are about to make, or have made, a bad move.

You are presented with a table of all the possible moves for the current dice throw, with a computation of what the game equity would be after each move.

Often the differences are mere thousandths of a point, but sometimes (as when you’ve completely failed to see an obvious play), the difference in equity can be the difference between a losing position and a winning one.

In a sense, the equity table is a visual representation of how the program evaluates the temporal differences between moves.

For anyone with an interest in puzzles and mental arithmetic, the mathematics of backgammon form part of the attraction.

All good backgammon players know the chances of throwing any specific combination of numbers with two dice, along with a host of other probabilities involving the chances of being hit by pieces at various distances, and of being able to re-enter the board against a set number of occupied points.

Probability also governs when you should double your opponent, and whether you should accept the offer of a double yourself. The basic rule of thumb is that you should accept a double if you have at least a 25 per cent chance of winning, but it is notoriously difficult to compute your current position with this degree of accuracy.

However, when playing against a computer opponent in tutorial mode, it can tell you your precise chances of winning at any point in a game, all of which helps an aspiring player to recognise potential winning positions in subsequent games against human opponents.

It’s only a game…

Backgammon involves luck (the dice) and skill (moving and doubling), and it is generally thought to be fairer than poker because there are no hidden elements such as hole cards: both players’ pieces are visible to both players at all times.

Although a skilled player will beat an amateur three times out of four, an amateur can still win if the dice roll with them.

This is unlike chess and other games of pure skill where a class player will always wipe the floor with an amateur opponent – and it’s for this reason that backgammon players of all abilities tend to mix. This results in games that are more interesting to play because they feature a wider range of playing styles.

If backgammon really is set to follow in the steps of poker, now’s the time to start honing your skills before everybody else jumps on the bandwagon.

It needn’t cost you a penny if you practise on one of the many free servers, and you’ll never be short of playing partners because there are human and robot opponents online at all hours of the day, willing to play with you at any level.

Best of luck!

by Paul Wardley Personal Computer World

by

Paul Wardley
Personal Computer World