
In Nim, there are a limited number of optimal moves for a given board configuration. If you don’t play one of them, you hand control to your opponent, who can win by playing nothing but optimal moves. And again, the optimal moves can be determined by evaluating a mathematical parity function.
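To make the parity idea concrete, here is a minimal sketch of how optimal moves can be found. It assumes the standard "normal play" rule, in which the player who takes the last object wins; the article does not say which rule set the researchers used. The parity function is taken here to be the nim-sum, the bitwise XOR of the row sizes, which amounts to checking the parity of each binary digit across all rows. The function names `nim_sum` and `optimal_moves` are illustrative and not taken from the researchers' code.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise XOR of all row sizes (the parity of each binary digit)."""
    return reduce(xor, heaps, 0)


def optimal_moves(heaps):
    """Return (row_index, new_size) pairs that leave a nim-sum of zero.

    Assumes normal play (taking the last object wins). An empty list
    means the position is already lost against perfect play.
    """
    total = nim_sum(heaps)
    moves = []
    for i, h in enumerate(heaps):
        target = h ^ total  # row size that would zero out the nim-sum
        if target < h:      # a legal move can only remove objects
            moves.append((i, target))
    return moves


print(optimal_moves([1, 3, 5]))  # [(2, 2)]: shrink the row of 5 down to 2
print(optimal_moves([1, 2, 3]))  # []: nim-sum is already 0
```

Any move outside the returned list leaves a nonzero nim-sum, which is exactly the situation in which the opponent can take control.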
So there are reasons to think that the training process that works for chess might not be effective for Nim. The surprise is how bad it actually is. Zhou and Riis found that, for a Nim board with five rows, the AI improved fairly quickly and was still improving after 500 training iterations. Adding just one more row caused the rate of improvement to slow dramatically. For the seven-row board, performance gains had largely stopped by the time the AI had played 500 times.
To better characterize the problem, the researchers replaced the subsystem that suggests potential moves with one that operates randomly. On the seven-row Nim board, the performance of the trained and randomized versions was indistinguishable over 500 training iterations. Essentially, once the board got big enough, the system couldn’t learn from observing game outcomes. In the initial state of the seven-row configuration, there are three potential moves that are consistent with an eventual win. However, when the system’s trained move evaluator was asked to examine all potential moves, it rated each as roughly equivalent.
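The count of winning opening moves can be checked with a few lines of code, under two assumptions that the text here does not state: that a board with n rows holds 1, 3, 5, ..., 2n-1 objects, and that the player who takes the last object wins. The helper `winning_openings` is a hypothetical name used only for this illustration.

```python
from functools import reduce
from operator import xor


def winning_openings(num_rows):
    """List (row_index, current_size, new_size) for each winning first move.

    Assumed layout: row i holds 2*i + 1 objects (1, 3, 5, ...).
    Assumed rule: normal play, so a move wins if it leaves a zero XOR.
    """
    heaps = [2 * i + 1 for i in range(num_rows)]
    total = reduce(xor, heaps, 0)
    return [
        (row, h, h ^ total)
        for row, h in enumerate(heaps)
        if (h ^ total) < h  # only moves that remove objects are legal
    ]


for n in (5, 6, 7):
    print(n, winning_openings(n))
```

With that assumed layout, the seven-row board gives exactly three winning openings (reducing the rows of 9, 11, and 13 to 6, 4, and 2, respectively), which agrees with the figure quoted above. Every other opening move hands the advantage to the opponent, so an evaluator that rates all moves as roughly equal has not captured the distinction.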
The researchers concluded that Nim requires players to learn the parity function to play effectively, and that the training procedure that works so well for chess and Go is unable to do so.
Not just Nim
One way to look at the result is that Nim (and, by extension, all impartial games) is just weird. But Zhou and Riis also found some signs that similar problems might arise in chess-playing AIs trained in this way. They identified several “wrong” chess moves, ones that missed a mating attack or threw away the endgame, that were initially rated highly by the AI’s board evaluator. Only because the software looked several moves ahead along a number of additional branches was it able to avoid these blunders.




