Fermat's Library | Some Studies In Machine Learning Using the Game of Checkers annotated/explained version.

### TL;DR In 1959 Arthur Lee Samuel coined the term machine lear...

Arthur Lee Samuel was an American computer scientist and pioneer in...

Important characteristics to take into account when choosing the pr...

Samuel used a **look ahead search algorithm, also known as backtrac...

Minimax is a recursive algorithm used in decision making and game t...

### IBM 704 The IBM 704 Data Processing System was a large-scale c...

In two-player sequential games, **a ply is one move/turn taken by o...

### What is Rote Learning? One of the learning techniques used b...

### Sense of direction The rote learning method produced a slow bu...

### What is learning by generalization? > ***"An obvious way to ...

Samuel's checkers algorithm using the generalization learning metho...

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

207

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

208

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

209

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

210

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

211

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

212

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

213

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

214

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

215

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

216

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

217

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

218

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

219

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

220

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

221

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

222

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

223

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

224

VOL. 3, NO. 3, 1959, REPRINT

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000 A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

225

226

IBM J. RES. DEVELOP. VOL. 44 NO. 1/2 JANUARY/MARCH 2000A. L. SAMUEL

VOL. 3, NO. 3, 1959, REPRINT

Discussion

In two-player sequential games, **a ply is one move/turn taken by one of the players.** I When using a ply of 3 would the machine would be looking ahed thus creating a tree that would result if 3 moves were to be played. ### IBM 704 The IBM 704 Data Processing System was a large-scale computer designed for engineering and scientific calculations. In the 1950s it was considered as ***"pretty much the only computer that could handle complex math".*** It could execute up to 12,000 floating-point additions per second. IBM sold 123 type 704 systems between 1955 and 1960. [Learn more here: 704 Data Processing System](https://www.ibm.com/ibm/history/exhibits/mainframe/mainframe_PP704.html) !["704 picture"](http://www.columbia.edu/cu/computinghistory/704-llnl.jpg) Samuel's checkers algorithm using the generalization learning method attained **"better-than-average" play so that good amateur opponents characterized the machine as "tricky but beatable".** This version of the algorithm was able to develop a good mid-game but remained weak in opening and ending parts of the game. ### Sense of direction The rote learning method produced a slow but steady learning rate that was most effective in the opening and ending part of the game. The mid-game tended to be slow because the machine **lacked a sense of direction - it did not know how to find the fastest way to win**. In order to improve to solve this issue Samuel gave it a "a sense of direction" so that: > "If the program is now faced with a choice of board positions whose scores differ only by the ply number, it will automatically make the most advantageous choice, choosing a low ply alternative if winning and a high ply alternative if losing" He did this by decreasing a position's value a small amount each time it was backed up a level during the minimax analysis. This method was essential to successful learning. Important characteristics to take into account when choosing the problem: 1. The activity must not be deterministic. 2. There must exist a definite goal to the problem. 3. The rules must be known and clearly stated. 4. We need certain amount of activity related knowledge. 5. The machine should be able to play against humans. ### What is learning by generalization? > ***"An obvious way to decrease the amount of storage needed to utilize past experience is to generalize on the basis of experience and to save only the generalizations."*** Learning by generalization meant modifying the evaluation function based on previous games played by the program. Here is a description of how it worked: - make the program play against other versions itself and against human opponents - version A adjusts its coefficients after each move whereas version B uses the same evaluation polynomial for the duration of any  game - If version A wins, we set version B = version A. Otherwise,  randomize version A and repeat.   This was groundbreaking achievement, it showed that a a program could learn by playing against previous versions of itself. ### What is Rote Learning? One of the learning techniques used by Samuel was **rote learning**. Rote learning consists of saving all of the board positions encountered during the game together with the computed evaluation scores until a specific depth. This is a method of learning by memorizing. If a board position had been encountered before the depth of the search tree would be amplified thanks to the cached results. After 53,000 board positions saved Samuel noted that rote learning: - was a slow but continuous learning rate that was most effective in the opening and end-game phases of the game - the mid-game was bad due to a lack of **sense of direction**, the machine does not know the fastest way to win. Thanks to rote learning is program became a **"better-than-average novice" after learning from many games against itself, a variety of human opponents, and from book games in a supervised learning mode.** Minimax is a recursive algorithm used in decision making and game theory to find the optimal move for a player. This algorithm is used in two player turn-based games such as checkers. The two players are called **maximizer and minimizer. The maximizer tries to get the highest score possible while the minimizer tries to get the lowest score possible.** !["checker tree moves"](https://i.imgur.com/whAkug2.png) Figure: Circles represent the moves of the maximizing player, and squares represent the moves of the minimizing player. The values inside the circles and squares represent the value of the minimax algorithm. The red arrows represent the chosen move, the numbers on the left represent the tree depth and the blue arrow the chosen move. Learn more here: [Minimax Algorithm in Game Theory](https://www.geeksforgeeks.org/minimax-algorithm-in-game-theory-set-1-introduction/) Arthur Lee Samuel was an American computer scientist and pioneer in the field of artificial intelligence. In 1959 he coined the term **machine learning to denote the "field of study that gives computers the ability to learn without being explicitly programmed”.** His Checkers-playing Program, the subject of this paper was among the world's first successful self-learning programs [Learn more here: Arthur Lee Samuel](https://history.computer.org/pioneers/samuel.html) !["Samuel playing checkers"](https://i.imgur.com/3MHdV9s.png) Samuel used a **look ahead search algorithm, also known as backtracking algorithms**. He implemented heuristic search methods to determine how to expand the search tree and when to stop searching. The terminal board positions of each search were evaluated and given a score using a linear function. Working backward through the search tree from the terminal positions, each position was given the score of the position that would result from the best move. Samuel's machine would always try to maximize the score, while the opponent would always try to minimize it. This work was inspired by the suggestions made by Claude Shannon a few years earlier. **Shannon suggested that in two-player games, the minimax algorithm determines the best move.** Minimax is a recursive algorithm that chooses the optimal moves based on the assumption that both players play optimally. Minimax tries to model what human players do when they play: "If I make this move, then my opponent can only make these moves, then I will make this move,..." ### TL;DR In 1959 Arthur Lee Samuel coined the term machine learning as the ***"field of study that gives computers the ability to learn without being explicitly programmed".*** This is his seminal paper originally published in 1959 where Samuel sets out to build a program that can learn to play the game of checkers. Checkers is an extremely complex game - as a matter of fact the game has roughly 500 billion billion possible positions - that using a brute force only approach to solve it is not satisfactory. Samuel’s program was based on Claude Shannon’s minimax strategy to find the best move from a given current position. In this paper he describes how a machine could look ahead > "by evaluating the resulting board positions much as a human player might do". and that > "at our present stage of knowledge, the only practical approach, even with the help of the digital computer, will be through the development of heuristics which tend to ape human behavior." The computer started out losing but eventually ended up beating Samuel. Samuel's checkers-playing program ***is widely recognized as a significant achievement in artificial intelligence and machine learning.***

Comments

Products

Project