1-Hot Encoding: The Chess Programmer’s Secret Weapon (Ep.1)

Building My Second Chess Engine Using Python

This article is also available to read on Medium

Image generated using DALL-E by OpenAI

This is not my first rodeo.

I’ve built a chess engine before, but the results were… let’s say… underwhelming.

Don’t get me wrong, the program worked; it was able to play a full game of chess, and for that, I give it credit. But let’s face it, it was terrible at chess.

It took minutes to produce a move, even at a low search depth. And then, the moves it would output were random and unpredictable: sometimes it would force you into a neat mate-in-3, other times it would sacrifice a rook and then lose the game entirely.

I recently decided it’s time for round 2, and I thought I’d document my journey and share some things I learn in the process. There are so many different problems that arise, and so many different approaches to solving them. Of all the programming rabbit holes I’ve gotten lost down, chess programming is definitely one of my favourites.

[Note: The rest of this article assumes a basic level of chess understanding. Familiarity with the rules, piece movement and board notation is recommended (including knowledge of en passant, castling and different draw conditions such as repetition of moves). Check out chess.com for a comprehensive guide to the rules.]

What is a Chess Engine?

Put simply, a chess engine is a program that, given a valid chess position, will determine the ‘best’ move that can be made. There are several components that make up an engine including, a board representation, move generation algorithms, a recursive search algorithm, and an evaluation function.

The chessboard representation is perhaps the most important part. Since it will be accessed and manipulated so many times, creating a compact data structure, and some efficient methods to manipulate it are vital to the success of our engine.

Board Representation

A board representation encodes a chess position. It needs to record which piece is where, which player is next to move, and if anything can be taken by en passant (as this is not always obvious by the position of the pieces). There are two main approaches to board representations: a board-centric view, and a piece-centric-view.

  • A board-centric view stores a list or set containing each board square, and associates with it an identifier for any piece that occupies it.
  • A piece-centric view approaches this the other way around. Instead, we allocate a memory location to each piece, and in it, store the square that the piece resides on.

These structures differ in the way they can be accessed. With a board-centric view, it’s quite efficient to find the piece located on a square, but it’s less straightforward to find the square given a specific piece. The opposite is true for a piece-centric structure

For this project, I will be using a piece-centric model known as a bitboard.

Bitboards

Python
A 64-bit binary number formatted in an 8x8 grid            
        
71776119061217280 =     
   
--- least significant bit
          |
          v                          
     (0b) 0 0 0 0 0 0 0 0 
          0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0
          1 1 1 1 1 1 1 1
          0 0 0 0 0 0 0 0
                        ^
--- most significant bit

A bitboard is just a 64-bit binary number. We use one bit of this number to represent each square on the board. The bit will be a 1 if the square contains a piece, and a 0 otherwise. Hence, this is a variation of 1-hot encoding.

Not only can this be used to store the locations of pieces, but it is also a tool that allows us to easily compute possible moves; use the locations of pieces in calculations (without knowing their location) and access, modify and create other bitboards. And, because it’s just a binary number, we can use all of the standard bitwise logical and arithmetic operations to modify it.

Since we can visually format the binary string into a 8×8 grid (as shown above) it’s easy to see how this bitboard can be used as a mask. For example, the board above is a mask indicating the starting position of each of the white pawns. Instead of recording a number for the location of each piece, we use this optical picture to indicate its location. This mask can be used to target specific functions and operations to work only on specific squares.

There’s a lot of advantages to this format, largely due to the use of bitwise and arithmetic operations that give us very efficient ways to manipulate it. Let’s start by build some tools to help us work with bitboards:

[Note: we will be using a lower case bfor bitboards, and a capital Bfor the whole chess board (which is just a collection of bitboards)]

Python
# used to visualise a bitboard
def print_bitboard(b): 

  # convert to binary string, pad with leading zeros to length of 64
  b = bin(b)[2:].zfill(64) 
 
  print ("\n      A B C D E F G H ")
  print ("    __________________ ")

  for i in range(8):
    # need to print upside down and back to front to ensure correct formatting
    temp = [' ', 8-i, "|"] + [*(b[-8:])[::-1]]
    
    # replace 0 with · for readability
    temp = [i if i!='0' else '·' for i in temp] 
    
    print(*temp, sep=" ") 
    b = b[:-8]  

  print("\n")
Python
# The bitboard coresponding to the white pawns

      A B C D E F G H 
    __________________       
  8 | · · · · · · · ·
  7 | · · · · · · · ·
  6 | · · · · · · · ·
  5 | · · · · · · · ·
  4 | · · · · · · · ·
  3 | · · · · · · · ·
  2 | 1 1 1 1 1 1 1 1
  1 | · · · · · · · ·
Python
# eumerate chess board square names
a8, b8, c8, d8, e8, f8, g8, h8, \
a7, b7, c7, d7, e7, f7, g7, h7, \
a6, b6, c6, d6, e6, f6, g6, h6, \
a5, b5, c5, d5, e5, f5, g5, h5, \
a4, b4, c4, d4, e4, f4, g4, h4, \
a3, b3, c3, d3, e3, f3, g3, h3, \
a2, b2, c2, d2, e2, f2, g2, h2, \
a1, b1, c1, d1, e1, f1, g1, h1 = range(64)

def get_bit(b, square):
  return 1 if (b & (1<<square)) else 0

def set_bit(b, square):
   return b | (1<<square)

def remove_bit(b, square):
  return b ^ ( 1<<square if get_bit(b, square) else 0 )

def ctz(b): # count trailing zeros
  return (b & -b).bit_length() - 1

Let’s examine these tools to understand how they work.

First, the enumeration of the chessboard squares allows us to access them by name. The constant f6 will refer to the corresponding bit in a bitboard. This is done purely for convenience.

Next, consider the operation, 1<<square . It will perform a bit-shift on the binary value 1. For example, 1<<4 will give us the binary string: 10000. We can then use this value as a mask, allowing us to target specific squares on the chess board.

So, for a given bitboard, b , performing b & (1<<e4) will get the intersection between b and the mask (which only contains a bit in position e4 ). This operation returns true if b contains a 1 in this cell, and false otherwise, allowing us to get the contents of the square e4 in b.

A similar procedure is used to add bits to a specific location on a bitboard, except this time by using a bitwise OR, | to get the union of two bitboards.

To remove a bit from a bitboard, we can use a bitwise XOR, ^ . However, in this case, we need to check the contents of the square first; if the square is empty, we should instead XOR with 0 which will have no effect, leaving the square empty.

The last method we create is ctz to Count the Trailing Zeros in our bitboard, that is, the number of zeros after the least significant bit. This will allow us to extract one bit at a time from the board. To see why this works, we take a look at an example with some binary numbers.

Python
# operation: (b & -b).bit_length() - 1

        0 0 1 1 0 1 0 0 = 52
        0 1 0 0 1 1 0 0 = -52  (flip bits and add 1)
   ---------------------------
   AND: 0 0 0 0 0 1 0 0 
---

The length of the resulting string, 100 , is 3. Subtracting 1 gives 2, which is the index of the least significant bit in the original number, 52. (Counting right to left as is convention)

Now we can build the whole board structure, and create a method to visualise it.

A board just consists of a collection of bitboards. We have one for each type of piece of each colour, one for all of the pieces of each colour, and one for all of the pieces together. We’ll store these in an array, and index it, again for convenience, by the name and colour of the pieces it represents.

Python
def new_board():
  return = [ 71776119061217280, 65280,     # pawns (white, black)
             9295429630892703744, 129,     # rooks
             4755801206503243776, 66,      # knights
             2594073385365405696, 36,      # bishops
             576460752303423488, 8,        # queens
             1152921504606846976, 16,      # kings
             18446462598732840960, 65535,  # white, black pieces
             18446462598732840960 | 65535, # all board pieces
           ]

# enumerate array indices for convenience
Wpawn, Bpawn, Wrook, Brook, Wknight, Bknight, Wbishop, Bbishop, Wqueen,\
 Bqueen, Wking, Bking, white, black, all = range(15)


# enumerate piece names to match the index shown above:
piece_names = ['P', 'p', 'R', 'r', 'N', 'n', 'B', 'b', 'Q', 'q', 'K', 'k']


def print_board(B):
    output = [' ' for i in range(64)] 
    
    # scan the first 12 bitboards for pieces
    for i in range(12): 
        b = B[i]
        
        # repeatedly find and remove least significant bit from bitboard 
        while b: 
            square = ctz(b)
            b = remove_bit(b, square)
            output[square] = piece_names[i] # add piece to output string
    
    
    print ("\n      A B C D E F G H ")
    print ("    __________________ ")
    
    for i in range(8):
        # need to print upside down and back to front to 
        # ensure correct formatting
        temp = [' ', 8-i, "|"] + output[:8]
        print(*temp, sep=" ") 
        output = output[8:] 
    
    print ("\n")

The bitboards corresponding to the white, black, and all pieces are just the union of their respective pieces’ bitboards. As such, whenever we make a change to one of the piece’s bitboards, we may need to update those three to match. We’ll create a function to automate this.

Python
def update_bitboards(B):
    # union of other bitboards
    B[white] = B[Wpawn] | B[Wrook] | B[Wknight] | B[Wbishop] | B[Wking] | B[Wqueen]
    B[black] = B[Bpawn] | B[Brook] | B[Bknight] | B[Bbishop] | B[Bking] | B[Bqueen]
    B[all] = B[white] | B[black]

Now we can finally set up a starting board, make some moves, and visualise the result.

Python
### TESTING ###

B = new_board()

# 'e4'
set_bit(B, Wpawn, e4)
remove_bit(B, Wpawn, e2)

# 'e5'
set_bit(B, Bpawn, e5)
remove_bit(B, Bpawn, e7)

update_bitboards(B)
print_board(B)
Python
      A B C D E F G H     # By convention, we use capital letters to represent
    __________________    # white pieces, and lower case letters for black
  8 | r n b q k b n r     # pieces. We also use n/N for a knight, since
  7 | p p p p   p p p     # k/K is reserved for the kings.
  6 |                     
  5 |         p           # This is the chessboard as seen from 
  4 |         P           # white's perspective
  3 |                
  2 | P P P P   P P P
  1 | R N B Q K B N R

Advantages of the Bitboard

The value of bitboards becomes apparent when we start to tackle move generation.

Take a rook for example. It can move as far as desired along its rank and file until it reaches another piece. The classic approach would be to scan the board in each direction to check which squares are empty, which are occupied, and by what colour of piece. This is quite a time consuming process, especially given the number of pieces we need to calculate moves for, and the number of times moves are generated.

Using a bitboard, we can use a technique known as ‘Magic Bitboards’ where we use bitboard multiplication with a ‘magic number’ to work like a hash function. We can then precompute possible moves for various different positions, and use this computation to quickly look them up. This will be explained in more detail in the next article in this series.

For non-sliding pieces (knight, pawn, king) whose movements are somewhat more limited (as they can’t move as far as they want in any direction) we can also use bitboards to simplify the problem. By precomputing possible move masks, and then performing bitwise operations with those masks and the locations of allied pieces we can quickly see where each of those pieces is able to move to.


This is part 1 in a series of walkthroughs documenting my journey in developing a chess engine. If you’ve found this useful, insightful, or even vaguely interesting, please consider following to be notified of future updates. Thanks for reading 🙂


References and Reading

  • www.chessprogramming.org has some extremely thorough and in-depth tutorials and explanations of many different chess programming topics – I would highly recommend.
  • For interactive chess tutorials, www.chess.com is the perfect starting point to gain familiarity with the rules and flow of the game.

#Chess #Python #AI # Algorithms #ChessEngine #ProgrammingProject

Scroll to Top