Building an Intelligent Tic-Tac-Toe in Swift 3.0 (Part 1)

(Source Code in Swift 3.0 at BitBucket)

The Only Winning Move is NOT to Play

It inspired a generation, and here I am living off that inspiration…

WarGames was a film about a young hacker who, while trying to change his school grades, accidentally breaks into a US Air Force supercomputer and gets one keystroke away from starting WW3! The computer system that controls the missiles and which shows signs of a machine learning type intelligence runs a Tic-Tac-Toe simulation until it observes that when playing perfectly, no player can win and therefore “Thermonuclear War” is futile (obvs!).

Over the last year I have become very interested in Machine Learning. It is still a new field and comprehensive resources are relatively scarce. I am not one to follow recipes and “black-box” type libraries as I feel that one needs to build something from the ground up, at least once in their career, in order to have a deep knowledge of the subject.

In this tutorial I will create a game of Tic-Tac-Toe and explore a couple of AI algorithms that can be used to program an intelligent opponent for our player. My goal is to demonstrate Reinforcement Learning by first going through the better-known method of Minimax. This is the first part of a 2-part tutorial: the second part focuses on the actual RL technique, while this one goes through the groundwork of building the game and programming it with old-school AI.

If you feel comfortable with Swift 3.0 and board games in general, please skip to the second part directly.

The Game of Tic-Tac-Toe

Tic-Tac-Toe (or Noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.

(source: Wikipedia).

The reasons why I chose this game as a test case are pretty obvious: it is one of the best-known games of its kind, the rules are super simple and there are well-known AI solutions against which one can measure their own. More specifically, it is one of the few games so easily solved by a conventional Minimax strategy that one could program a perfect AI player even on a relatively weak portable device (like the iPhone). This of course means that no further research needs to be performed, but the purpose of this article is to introduce new concepts rather than beat existing ones. What I also found interesting is that, since it is easy to create a perfect AI player (one that either wins or forces a draw, making the actual gameplay pretty boring), it is fun to spice things up by creating a player that learns the game, becoming more and more difficult to beat as play progresses. That is something people would actually want to play, rather than admire for a few seconds and then put down.

Building the Board Model

Hitting two birds with one stone, we will take Swift 3.0 out for a spin, explore certain aspects of the language and build a self-learning algorithm referencing the most reliable resources I can find. What I want to focus on in terms of the language itself is its power of simplicity: when used correctly, it can easily feel like one is writing half the code one otherwise would in Objective-C or any other popular language.

First thing we need is a model for the 3×3 board itself, and for that we need a way to represent an individual cell. We could simply create a Cell class, but this tutorial is also about simplicity. So the way I think about it is this: what data is a cell actually holding? …and the answer is a “mark”, which is either an X or an O, or is Empty. So, why not create a Mark enum, which is more concise, and use that to represent cells? (Don’t answer, it’s a rhetorical question.)
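The original listing is not reproduced on this page; here is my sketch of such a Mark enum, together with the == overload discussed below (case names and the exact overload are my reconstruction):

```swift
// A sketch of the Mark enum; case names are my assumptions.
enum Mark: String {
    case X = "X"
    case O = "O"
    case Empty = " "

    // The opposite mark, handy when switching turns.
    var opposite: Mark {
        switch self {
        case .X: return .O
        case .O: return .X
        case .Empty: return .Empty
        }
    }
}

// Comparison via the String raw value, as described below.
func ==(lhs: Mark, rhs: Mark) -> Bool {
    return lhs.rawValue == rhs.rawValue
}
```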

What is unique about Swift is its flexibility and mixing of different paradigms. True, not everyone enjoys that freedom, as they feel it might make code unreadable for certain members of the team, but I love it! So here we have an enum which covers the 3 possibilities for a cell, plus the ability to get the opposite mark from a given one. This will come in handy later.

Notice the == operator overload. Swift allows us to define operators for custom types, and we need to do this since the compiler cannot know what we mean by equality on non-primitive values. Since we have conveniently defined the Mark enum to be backed by a String, we can compare the raw values.

Subscripting in Swift

The second question that naturally arises is: how are we going to represent the board? The board is a collection of cells, so we need to hold them in some kind of array. The first thought that might come to mind is a 2-dimensional array like:
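For instance (the Mark enum is repeated in minimal form so the snippet stands alone):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment

// A 3×3 grid as an array of arrays: workable, but more indexing ceremony.
var grid = [[Mark]](repeating: [Mark](repeating: .Empty, count: 3), count: 3)
grid[0][2] = .X
```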

But we can go simpler. We can hold everything in a 1-dimensional array and use subscripts to calculate the correct offset for a given row and column.
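A sketch of the flat board with a (row, col) subscript; type and property names are my assumptions:

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment

struct Board {
    // One flat array for all nine cells, filled with the Empty value.
    var data = [Mark](repeating: .Empty, count: 9)

    // Maps a (row, col) pair onto an offset into the flat array.
    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }
}
```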

Here we use the repeating: initializer of Array to create the 3×3 grid in one line. Be careful: this works with values rather than references. In other words, it is only because enum values are copied that every cell ends up independent.

The subscript method calculates the offset and either returns or sets the correct cell. By using this method we can turn any object into an array of sorts and use the bracket notation as follows:
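With the supporting types repeated compactly for self-containment, the bracket access reads:

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }
}

var board = Board()
board[0, 2] = .X  // bracket notation via the subscript above
```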

This will set the last column of the first row to an “X”.

Classes vs Structures

Also note that we have changed the class into a struct. This is a really crucial point with Swift that must be understood before attempting to do anything with the language. I suggest you do your own research, as there are many different articles that explain the differences and you should find the one that uses the right wording for you (one good example is this). In a very small nutshell, structs are value types while classes are reference types. This is different from C++, where both can be either references or values depending on how the variables are declared; in C++ structs feel like a legacy feature and are not really different from classes apart from some minor things. In Swift, they are a feature built into the language to allow various programming paradigms to coexist and be used at the developer’s discretion. If we define our Board as a class, we can change its state (i.e. its public variables) and every variable pointing to it will see the altered state. If we define it as a struct, then changing one of its variables will create a new copy of the object and reassign it to the variable. This is the reason why the variable must be defined as a var rather than a let.
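The original snippet is not shown; a minimal Person example (names are my assumptions) illustrating the point:

```swift
// Value semantics in a nutshell: copying a struct copies its data.
struct Person {
    var name: String
}

let p1 = Person(name: "Ada")
var p2 = p1          // p2 starts as a copy of p1
p2.name = "Grace"    // mutating p2 re-assigns a fresh copy; p1 is untouched
```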

When the name variable is changed, a new copy of the Person object with the altered value is created and re-assigned to p2. This reassignment obliges p2 to be a variable rather than a constant.

Playing a Turn

We make a move on a board by changing the data variable where we hold our cells. Since the board is copied every time one of its variables is changed, Swift obliges us to mark such methods with the mutating keyword. This seemed annoying at first, until I realised what a difference it really makes!
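A sketch of the play method and the Coordinate typealias (the exact signature is my assumption; supporting types are repeated so the snippet compiles on its own):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment

// A labelled tuple for board positions.
typealias Coordinate = (row: Int, col: Int)

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)

    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }

    // mutating, because assigning into data replaces the whole struct value.
    mutating func play(_ mark: Mark, at coordinate: Coordinate) {
        self[coordinate.row, coordinate.col] = mark
    }
}

var board = Board()
board.play(.X, at: (row: 0, col: 2))
```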

Above we also introduce a typealias to hold a tuple of two Ints as a Coordinate, another cool feature of Swift. The ability to label tuple members, as we have done with row and col, makes them all the more powerful, since for the most part they will look like value objects with their properties accessible by name.

Printing it Out

At this moment, it might be beneficial to have a way of printing out the state of our board, and what better way than to use a simple form of ASCII art? Those of you used to Objective-C know that the root object (NSObject) has a description method that one can override to return whatever they wish. In Swift, you can still inherit from NSObject and make use of that method, but the built-in way to do it for a plain object is by implementing the CustomStringConvertible protocol. I always prefer to do this in an extension, since this piece of code rarely interacts with any other, and even the protocol name is too long and ugly to be included in the head of the class declaration.

When setting the X at [0,2] as in code above and printing it out we should get:
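The author’s exact rendering is not shown; a minimal sketch, assuming rows joined with pipes and separated by dashed lines (supporting types repeated for self-containment):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }
}

// Rendering lives in an extension, away from the model code.
extension Board: CustomStringConvertible {
    var description: String {
        var rows = [String]()
        for row in 0..<3 {
            let cells = (0..<3).map { self[row, $0].rawValue }
            rows.append(cells.joined(separator: "|"))
        }
        return rows.joined(separator: "\n-+-+-\n")
    }
}

var board = Board()
board[0, 2] = .X
print(board)
// Prints:
//  | |X
// -+-+-
//  | |
// -+-+-
//  | |
```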

Expanding the Board

One more function we need to add to our Board, and for reasons that will become obvious in a minute, is the ability to get back a list of possible moves and the boards that result from them. We call this act expanding a board.
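A sketch of that expansion (the method name and return type are my assumptions; supporting code is repeated so the snippet stands alone):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
typealias Coordinate = (row: Int, col: Int)

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }
    mutating func play(_ mark: Mark, at coordinate: Coordinate) {
        self[coordinate.row, coordinate.col] = mark
    }

    // Every legal move for `mark`, paired with the board it produces.
    func expanded(for mark: Mark) -> [(move: Coordinate, board: Board)] {
        var results = [(move: Coordinate, board: Board)]()
        for row in 0..<3 {
            for col in 0..<3 where self[row, col] == .Empty {
                var board = self  // still the same value until mutated
                board.play(mark, at: (row: row, col: col))  // mutating copies here
                results.append((move: (row: row, col: col), board: board))
            }
        }
        return results
    }
}
```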

So here we simply check every cell: if it is not empty, skip it; else play it and append the result to the list. Here is where the difference between structures and classes becomes crucial. When we define var board = self, the variable still holds the same value as self. When the board executes the play method defined above, and because it is a mutating function, it will create a new copy independent of self and append that to the list. In this way, the next iteration will have the original board, held by self, intact.

Who Won?

Lastly, we need to be able to figure out if the game has finished, who has won it, and with what pattern. By that I mean: which 3 cells, a, b and c, did the winner complete with his mark? To capture this I have created another structure:
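A sketch of that structure (name and fields are my assumptions based on the description; I use flat offsets into the 9-cell array):

```swift
// Holds the three cell offsets that make up a winning line.
struct WinPattern {
    let a: Int
    let b: Int
    let c: Int
}

let diagonal = WinPattern(a: 0, b: 4, c: 8)  // e.g. the top-left diagonal
```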

and a list of the possible winning patterns:
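There are eight such lines; a sketch of the list (the structure is repeated in compact form so the snippet stands alone):

```swift
struct WinPattern { let a: Int; let b: Int; let c: Int }  // repeated for self-containment

// The eight winning lines as flat offsets into the 9-cell array:
// three rows, three columns and the two diagonals.
let patterns = [
    WinPattern(a: 0, b: 1, c: 2), WinPattern(a: 3, b: 4, c: 5), WinPattern(a: 6, b: 7, c: 8),
    WinPattern(a: 0, b: 3, c: 6), WinPattern(a: 1, b: 4, c: 7), WinPattern(a: 2, b: 5, c: 8),
    WinPattern(a: 0, b: 4, c: 8), WinPattern(a: 2, b: 4, c: 6)
]
```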

We will also have a State enum with 3 possible values: Playing, Victory and Draw. I took advantage of another cool feature of Swift: associated values. These allow us to attach values when instantiating an enum case and extract them later in our switch statement.
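A sketch of the enum and of extracting the associated values in a switch (supporting types repeated compactly so the snippet stands alone):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
struct WinPattern { let a: Int; let b: Int; let c: Int }  // repeated for self-containment

// Associated values let Victory carry the winner's mark and the winning pattern.
enum State {
    case Playing
    case Victory(Mark, WinPattern)
    case Draw
}

// Extracting the associated values later, in a switch:
let state = State.Victory(.X, WinPattern(a: 0, b: 1, c: 2))
switch state {
case .Victory(let mark, let pattern):
    print("\(mark.rawValue) won on cells \(pattern.a), \(pattern.b), \(pattern.c)")
case .Playing, .Draw:
    break
}
// prints "X won on cells 0, 1, 2"
```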

When returning a victory, the method that determines the state can pass along the mark of the player who won and the pattern with which they won.

Finally, everything will come together in a get-only computed variable called state:
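My sketch of that property, with the earlier pieces repeated compactly so the snippet compiles on its own:

```swift
// (Mark, WinPattern, patterns and State repeated from earlier sections.)
enum Mark: String { case X = "X", O = "O", Empty = " " }
struct WinPattern { let a: Int; let b: Int; let c: Int }
enum State {
    case Playing
    case Victory(Mark, WinPattern)
    case Draw
}
let patterns = [
    WinPattern(a: 0, b: 1, c: 2), WinPattern(a: 3, b: 4, c: 5), WinPattern(a: 6, b: 7, c: 8),
    WinPattern(a: 0, b: 3, c: 6), WinPattern(a: 1, b: 4, c: 7), WinPattern(a: 2, b: 5, c: 8),
    WinPattern(a: 0, b: 4, c: 8), WinPattern(a: 2, b: 4, c: 6)
]

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)

    // Get-only computed property: victory beats draw beats still-playing.
    var state: State {
        for pattern in patterns {
            let mark = data[pattern.a]
            if mark != .Empty && data[pattern.b] == mark && data[pattern.c] == mark {
                return .Victory(mark, pattern)
            }
        }
        return data.contains(.Empty) ? .Playing : .Draw
    }
}
```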

If any of the patterns has 3 of the same mark in a row, return Victory; otherwise, if there are no more empty cells, it is a tie, so return Draw; else the game is still going.

Artificial Intelligence

Here is where it gets interesting. We will attempt to make the opponent intelligent by adding a few known techniques, culminating in the Reinforcement Learning approach that was promised in the beginning. For that we will create a Player object which will contain the algorithm of choice. In this way we can experiment by having different methods play against each other and observe the results.

I have created a convenience extension on Array to return a random element if the count is more than 0 (see source code).
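The article defers to its source code; my own sketch of such an extension (the property name is an assumption):

```swift
import Foundation  // for arc4random_uniform

// nil when the array is empty, otherwise a uniformly random element.
extension Array {
    var random: Element? {
        guard count > 0 else { return nil }
        return self[Int(arc4random_uniform(UInt32(count)))]
    }
}
```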

The Player class has only one option: Random, which is fine for the moment. Let’s create an opponent based on this in our main ViewController.
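A self-contained sketch of that wiring; the Board and a minimal random Player are repeated so the snippet compiles, and the real ViewController of course also updates the UI:

```swift
import Foundation

enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
typealias Coordinate = (row: Int, col: Int)

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    subscript(row: Int, col: Int) -> Mark {
        get { return data[row * 3 + col] }
        set { data[row * 3 + col] = newValue }
    }
    mutating func play(_ mark: Mark, at c: Coordinate) { self[c.row, c.col] = mark }
    func expanded(for mark: Mark) -> [(move: Coordinate, board: Board)] {
        var results = [(move: Coordinate, board: Board)]()
        for row in 0..<3 {
            for col in 0..<3 where self[row, col] == .Empty {
                var board = self
                board.play(mark, at: (row: row, col: col))
                results.append((move: (row: row, col: col), board: board))
            }
        }
        return results
    }
}

// A Player whose only option, for now, is a random move.
class Player {
    let mark: Mark
    init(mark: Mark) { self.mark = mark }

    func choose(board: Board) -> Coordinate? {
        let options = board.expanded(for: mark)
        guard options.count > 0 else { return nil }
        return options[Int(arc4random_uniform(UInt32(options.count)))].move
    }
}

// In the ViewController, after the human has played:
var board = Board()
let opponent = Player(mark: .O)
if let move = opponent.choose(board: board) {
    board.play(opponent.mark, at: move)  // play the move returned by the player
} else {
    // nil means the expansion was empty, so the game is finished.
}
```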

If there is no choice returned, the game is finished because the player does not have any moves (the expansion returned an empty array). Else, play the move returned by the player.

Adding Intelligence through the Strategy Pattern

In order to add more methods to the player class, it is preferable to use the Strategy design pattern, which will allow us to abstract the implementation of each AI algorithm into its own class. First I define the superclass.
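A sketch of that superclass (whether the base returns nil or traps is my choice; supporting types are minimal stand-ins):

```swift
enum Mark: String { case X = "X", O = "O", Empty = " " }        // repeated for self-containment
typealias Coordinate = (row: Int, col: Int)
struct Board { var data = [Mark](repeating: .Empty, count: 9) }  // minimal stand-in

// The superclass only holds the mark; subclasses supply the actual choice.
class Strategy {
    let mark: Mark
    init(mark: Mark) {
        self.mark = mark
    }

    func choose(board: Board) -> Coordinate? {
        return nil  // each concrete strategy overrides this
    }
}
```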

It encapsulates the player’s mark and has only one method: choose. Now, the random choice implementation can be moved away from the Player and into its own class.

The player will instantiate a strategy object depending on its intelligence type and call it to get its next move.
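A sketch of the RandomStrategy subclass and the Player wiring (the Intelligence enum and helper names are my assumptions; supporting types repeated compactly):

```swift
import Foundation

enum Mark: String { case X = "X", O = "O", Empty = " " }  // repeated for self-containment
typealias Coordinate = (row: Int, col: Int)

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    func emptyCoordinates() -> [Coordinate] {
        var result = [Coordinate]()
        for i in 0..<9 where data[i] == .Empty {
            result.append((row: i / 3, col: i % 3))
        }
        return result
    }
}

class Strategy {
    let mark: Mark
    init(mark: Mark) { self.mark = mark }
    func choose(board: Board) -> Coordinate? { return nil }  // overridden below
}

// The random choice, now in its own Strategy subclass.
class RandomStrategy: Strategy {
    override func choose(board: Board) -> Coordinate? {
        let options = board.emptyCoordinates()
        guard options.count > 0 else { return nil }
        return options[Int(arc4random_uniform(UInt32(options.count)))]
    }
}

enum Intelligence { case Random }  // Basic and Minimax cases come later

// The Player picks a strategy from its intelligence type and delegates to it.
class Player {
    let strategy: Strategy
    init(mark: Mark, intelligence: Intelligence) {
        switch intelligence {
        case .Random: strategy = RandomStrategy(mark: mark)
        }
    }
    func choose(board: Board) -> Coordinate? {
        return strategy.choose(board: board)
    }
}
```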

Basic Strategy

Our first approach is the so-called “rules-based” one. What that means is that we are hardcoding some rules that the agent will follow. It is effectively a set of conditions (or if-else statements, if you prefer) that captures what needs to be done in any situation.

If you look into the source code for BasicStrategy you will see that what it essentially does is first check and see whether there is a move in which the player using the strategy wins, and take that if it exists.

Failing to find one, it will check whether there is a move in which the opponent wins, and block it. This means the method will avoid “stupid” mistakes like not filling in an X when there are already two in the same row and the remaining cell is empty.
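The real BasicStrategy lives in the article’s source code; here is my compact, self-contained sketch of the same rules, as a free function over simplified stand-in types: take an immediate win if one exists, otherwise block the opponent’s immediate win, otherwise take any free cell.

```swift
enum Mark: String {
    case X = "X", O = "O", Empty = " "
    var opposite: Mark { return self == .X ? .O : (self == .O ? .X : .Empty) }
}

// The eight winning lines as flat offsets.
let lines = [[0, 1, 2], [3, 4, 5], [6, 7, 8],
             [0, 3, 6], [1, 4, 7], [2, 5, 8],
             [0, 4, 8], [2, 4, 6]]

struct Board {
    var data = [Mark](repeating: .Empty, count: 9)
    func winner() -> Mark {
        for line in lines {
            let m = data[line[0]]
            if m != .Empty && data[line[1]] == m && data[line[2]] == m { return m }
        }
        return .Empty
    }
}

func basicChoice(on board: Board, for mark: Mark) -> Int? {
    let empty = (0..<9).filter { board.data[$0] == .Empty }
    // 1. Win now if we can.
    for cell in empty {
        var next = board
        next.data[cell] = mark
        if next.winner() == mark { return cell }
    }
    // 2. Otherwise block the opponent's winning move.
    for cell in empty {
        var next = board
        next.data[cell] = mark.opposite
        if next.winner() == mark.opposite { return cell }
    }
    // 3. Otherwise any free cell will do.
    return empty.first
}
```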

Minimax Strategy

Since the point of this article is Reinforcement Learning, and since there are excellent resources out there explaining the use of Minimax specifically for Tic-Tac-Toe, I will delegate any explanation directly to them or my source code.

The basic idea is that we expand the current board and loop over each possible move returned. For each one, we call the same function recursively, thus performing a depth-first search on the game tree. A leaf node is one where a victory or a draw is reached; in either case we return a Result object that holds a score and a coordinate. Eventually the values bubble up (in the peculiar minimax way) and the best score will be returned by the strategy.
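For reference, my compact, self-contained sketch of that recursion; a (score, move) tuple plays the role of the Result object, and the board is a flat [Mark] array with cell offsets as moves:

```swift
enum Mark: String {
    case X = "X", O = "O", Empty = " "
    var opposite: Mark { return self == .X ? .O : (self == .O ? .X : .Empty) }
}

let lines = [[0, 1, 2], [3, 4, 5], [6, 7, 8],
             [0, 3, 6], [1, 4, 7], [2, 5, 8],
             [0, 4, 8], [2, 4, 6]]

func winner(of board: [Mark]) -> Mark {
    for line in lines {
        let m = board[line[0]]
        if m != .Empty && board[line[1]] == m && board[line[2]] == m { return m }
    }
    return .Empty
}

// Returns (score, move): +1 if `player` can force a win, -1 a loss, 0 a draw.
func minimax(board: [Mark], player: Mark, current: Mark) -> (score: Int, move: Int) {
    let w = winner(of: board)
    if w != .Empty { return (w == player ? 1 : -1, -1) }  // leaf: somebody won
    let empty = (0..<9).filter { board[$0] == .Empty }
    if empty.isEmpty { return (0, -1) }                   // leaf: a draw

    var best: (score: Int, move: Int)? = nil
    for cell in empty {
        var next = board
        next[cell] = current  // expand this move
        let result = minimax(board: next, player: player, current: current.opposite)
        // Maximise on our turns, minimise on the opponent's.
        let better = (current == player) ? result.score > (best?.score ?? Int.min)
                                         : result.score < (best?.score ?? Int.max)
        if better { best = (result.score, cell) }
    }
    return best!
}
```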
