Unlocking sudoku’s secrets

Sara Logsdon looks to graph theory and abstract algebra for help on the puzzle page

post

Sudoku has long captivated puzzle enthusiasts worldwide with its logical challenges and addictive nature. While it may seem like a simple game of numbers, beneath the surface lies a fascinating connection to the realm of mathematics. Graph theory and abstract algebra both play a crucial role in unravelling the intricacies of sudoku. Sudoku puzzles consist of a $9 \times 9$ grid, divided into nine $3 \times 3$ sub-grids called regions. The objective is to fill the grid with numbers from 1 to 9, ensuring that each row, column, and region contains every digit exactly once.

A vertex colouring problem

In graph theory, a graph is a mathematical structure that comprises a set of vertices, or nodes, connected by edges. An area rich in mathematical questions, graph theory has a long history of famous thought-provoking problems. One of the most well-known of these problems is the vertex colouring problem: Given an undirected graph $G = (V, E)$, where $V$ represents the set of vertices and $E$ represents the set of edges, can one find an assignment of colours to each vertex in $V$ satisfying the condition that no two adjacent vertices (connected by an edge) have the same colour?

While sudoku may appear as a grid of numbers, we can view it through the lens of graph theory. Interestingly, we can represent a sudoku puzzle as a graph, where each of the $81$ cells in the sudoku grid corresponds to a vertex in the graph. We label the vertices with ordered pairs $(x, y),$ $x$ and $y$ being integers from $1$ to $9$. We then join two distinct vertices $(x, y)$ and $(x’, y’)$ by an edge if and only if any of these conditions apply:

  • $x = x’$ (the two cells are in the same row),
  • $y = y’$ (same column), or
  • $\displaystyle\left\lceil \frac{x}{3} \right\rceil = \left\lceil \frac{x’}{3} \right\rceil$ and $\displaystyle\left\lceil \frac{y}{3} \right\rceil = \left\lceil \frac{y’}{3} \right\rceil$   (same $3\times3$ region).

So the top-left region of the sudoku board would be connected like this:

We then propose the vertex colouring problem: Does there exist a 9-colouring of our sudoku graph? That is, can we colour each vertex in the graph, using no more than $9$ colours, in such a way that no vertices which are connected by an edge end up the same colour?

It turns out that this question is equivalent to: does there exist a solution to the original sudoku board? By applying vertex colouring algorithms, such as the greedy algorithm and/or backtracking, we can systematically assign colours (numbers) to the vertices (cells) to find a valid solution to any sudoku board, so long as one exists.

The greedy algorithm and backtracking

The purpose of the greedy algorithm is to produce a vertex colouring of a graph. The algorithm assigns colours to the vertices of the graph by iteratively selecting an uncoloured vertex and assigning it the smallest possible colour that does not conflict with its neighbouring nodes. This means in sudoku, the greedy algorithm takes a sudoku board and systematically assigns numbers (colours) to cells (vertices), ensuring that no conflicts arise within the rows, columns or regions. Backtracking is a systematic search algorithm that explores the solution space by making choices and undoing them when they lead to contradictions or dead ends. We will combine the greedy algorithm with backtracking, particularly when the greedy algorithm fails to find a solution. In sudoku, this will involve filling in cells with numbers, testing their validity, and if a contradiction is detected, `backtracking’ to the previous state to explore alternative choices until a valid solution is found or all possibilities are exhausted. The procedure for combining the greedy algorithm and backtracking in sudoku follows:

  1. Start with the initial board and select the first uncoloured cell.
  2. Assign the smallest possible number to the selected cell.
  3. Move to the next uncoloured cell and assign the next smallest number.
  4. Continue this process, moving from left to right and top to bottom, assigning the smallest valid number to each uncoloured cell.
  5. If a conflict arises, indicating an invalid placement, backtrack to the previous cell and select the next available number.
  6. Repeat the process until the entire grid is filled or until no valid number can be placed.

The algorithm in action

Step 1: Suppose our initial board looks like this. We select the first uncoloured cell:

Steps 2–4: Greedy algorithm. Note that $8$ was entered in $(1, 7)$ instead of $4$, $5$ or $6$ because although $4$ is the lowest available number, $8$ is the lowest available valid number.

Step 5.1: Conflict. The only remaining available numbers for $(1,8)$ are already present in column $8$.

Steps 5.2–6: Backtrack. Return to cell $(1,7)$. Enter the next lowest available valid number, $9$. Resume greedy algorithm.

By continuing this same algorithm through all rows, we can reach a solution to our sudoku board!

Other researchers seem to have been taken by the same curiosity as me and have learnt more about sudoku from graph theory—Joshua Cooper and Anna Kirkpatrick linked minimal sets to minimal fair puzzles, and Michael Haythorpe connected Hamiltonian cycles to sudoku puzzles of different sizes.

Graph theory is one way to unlock sudoku puzzles mathematically. But sudoku is really just solving a puzzle subject to a number of constraints. That sounds like something algebra could help us with. And it can! Algebraic geometry possesses a similarly beautiful application to sudoku.

Gröbner bases

If we could write the sudoku problem as a system of polynomials, then solving the system would allow us to complete the puzzle. Solving simultaneous equations might seem like an opportunity for Gauss–Jordan elimination, but this method only applies to linear systems and we will see shortly that ours is not. Moreover, we require our solutions to be the integers 1–9. A more general approach lies in Gröbner bases.

A Gröbner basis for a system of polynomials is a new system of polynomials with the same solutions as the original, but which is easier to solve. For example, the system \begin{align*} x+y-3 &= 0,\\ xy-2&=0 \end{align*} is coupled, as both equations involve both variables. However, computing a Gröbner basis of this system yields \begin{align*} y^2-3y+2 &= 0,\\ x+y-3&=0, \end{align*} where the first equation involves only $y$. We can solve this first equation and then substitute to find $x$, making the system easier to solve. In order to understand how Gröbner bases work, we will need to learn some abstract algebra terminology. A polynomial ring is a set of polynomials in a certain number of variables. For our purposes, the coefficients of polynomials will come from the field $\mathbb{Q}$ of rational numbers. An ideal is a subset $I$ of elements from a ring $R$ that forms an additive group and has the property that \[x \in R \text{ and } y\in I \implies xy \in I \text{ and }yx \in I.\] An ideal can be generated by a set of polynomials just like a vector space can be spanned by a set of vectors. For example, if \[ I= \{af+bg : a,b \in R\}, \] then $f$ and $g$ generate $I$, and we can write \[I= \langle f,g \rangle.\] A term ordering is a method for choosing the order in which elements are placed. The term ordering that we’ll use here is the lexicographical term ordering, abbreviated $\operatorname{Lex}$. This is the ordering you’d expect in a dictionary. For example, if we have variables $x$, $y$ and $z$ then we order the variables as \[x>y>z.\] When we concatenate variables, we look at the first variable in each element and compare, then the second, and so on. For example, \begin{align*} \operatorname{Lex}(xy) > \operatorname{Lex}(yz), \\ \text{and }\operatorname{Lex}(x^2) > \operatorname{Lex}(xy). \end{align*} Given a term ordering, the leading term of a polynomial $f$ is the term which is greatest in the ordering and will be denoted $\operatorname{lt}(f).$ Given any set $S$ of polynomials in a polynomial ring, we can then define the leading term ideal of $S$ to be the ideal $\operatorname{lt}(S)$ generated by the leading terms of the polynomials in $S.$ That is, \[\operatorname{lt}(S)= \langle \operatorname{lt}(f) : f \in S \rangle.\] Note that the leading term ideal of a set of polynomials is not always equal to the leading term ideal of the ideal generated by that set of polynomials. For Gröbner bases we use degree reverse lexicographic order. That means if $S=\{x,x+1\}$, $\operatorname{lt}(S)= \langle x\rangle$, but if $I= \langle x,x+1\rangle$, then \[x+1-x= 1 \in I,\] so $\operatorname{lt}(I)=\langle 1 \rangle =R$.

When $\operatorname{lt}(S)=\operatorname{lt}(I),$ where $I=\langle S \rangle,$ we say that $S$ is a Gröbner basis for the ideal $I.$ In other words, a set of nonzero polynomials \[G={g_1,g_2,\dots,g_t } \subset I\] is called a Gröbner basis for $I$ if and only if $\operatorname{lt}(G)= \operatorname{lt}(I).$

Gröbner bases are nice because they simplify problems. For example, if $G$ is a Gröbner basis for an ideal $I$, then we can use simple reduction to determine whether or not a given polynomial $f$ is in the ideal $I$.

A well-known theorem in Gröbner basis theory explains that the Gröbner basis puts a given system into triangular form. For example, if $G=\{g_1,\dots,g_s\}$ is a Gröbner basis in variables $x_1, \dots x_n$ ($n\leq s$), the theorem says that we can order the polynomials $g_1,\dots,g_s$ so that $g_1$ only involves the smallest variable $x_1$, $g_2$ involves only $x_1$ and $x_2$ and has leading term involving only $x_2$, and so on. This makes solutions possible through quick, easy substitution.

Bruno Buchberger introduced Gröbner
bases in his 1965 PhD thesis and named
them after his supervisor, Wolfgang
Gröbner

A process known as Buchberger’s algorithm computes the Gröbner basis for a given ideal and term order. The algorithm uses the multivariate division algorithm and least common multiples. This algorithm also guarantees the existence of a Gröbner basis for a given ideal and term order.

Gröbner basis in sudoku

This is useful for the problem of sudoku. We can:

  1. Create a system of polynomial equations to represent our sudoku problem
  2. Use Buchberger’s algorithm to compute a Gröbner basis for the ideal generated by the polynomials in our system
  3. If the puzzle has a unique solution, the Gröbner basis will consist of 81 linear polynomials, from which we can read off the solutions to our sudoku puzzle

We can represent the constraints given in the rules of sudoku as a system of polynomial equations. We introduce 81 variables, $x_0,\dots,x_{80},$ one for each cell in the sudoku. Formally, we are now dealing with the polynomial ring $\mathbb{Q}[x_0,\dots,x_{80}].$ We wish to encode the conditions on the sudoku as a set of polynomials in some subfield $F \subset \mathbb{Q}[x_0,\dots,x_{80}].$ Cells can only take on whole number values from 1 to 9, so can we define the following polynomials for all $i=0,\dots,80$: \[(x_i-1)(x_i-2)\cdots(x_i-9)=0.\]

Next, we define polynomials to represent the condition that each number can only be used once in each column, each row and each region. This can be done by defining that the sum of all columns/rows and of each region should be 45 and product should be $9! = \text{362,880}$. With these conditions we have no duplicate numbers. So if $\{x_{k_1},\dots,x_{k_9}\}$ are all in the same row or column, \[\sum_{n=1}^{9} x_{k_n} = 45 \quad\text{and}\quad \prod_{n=1}^{9} x_{k_n} = \text{362,880}.\] Finally, since some of the cells $x_j$ in the grid are already filled out with a number $a_j,$ we define new polynomials for each of these cells, \[x_j-a_j=0.\] These polynomials give a system of 135 equations, plus one for each known cell. We can solve this system by applying Buchberger’s algorithm to compute a Gröbner basis and from there, read off the solution to the initial grid.

An example: shidoku

As we’ve seen, the algorithm is quite long. So instead of using working through an example with a sudoku board, we will use a shidoku board. Shidoku is a variation of sudoku which uses a $4 \times 4$ grid, divided into four $2 \times 2$ regions.

Introduce 16 variables $x_0,\dots,x_{15}$ to represent the 16 cells. Each cell can only take on the value $1$, $2$, $3$ or $4$, so for $i=0,1,\dots,15$, \[(x_i-1)(x_i-2)(x_i-3)(x_i-4)=0.\] Similar to before, the sum and product of all numbers in a column, row or region must be $1+2+3+4=10$ and $1\times 2 \times 3 \times 4=24,$ respectively. With these conditions we have no duplicate numbers.

Let $\{x_i,x_j,x_k,x_l\}$ represent a set of cells that make up a row, column or $2 \times 2$ region, so \begin{align*} x_i+x_j+x_k+x_l-10&=0 \\ \text{and}\quad x_i x_j x_k x_l-24&=0. \end{align*}

This gives us a total of 40 polynomial equations.

We then add any values which are already given. Let’s suppose we have the board below with the given variable assignments:

 

 

 

 

 

 

We therefore need to add the equations \begin{align*} a-1=0, \quad f-2=0, \\ i-2=0, \quad k-3=0, \\ p-4=0.\hspace{8mm} \end{align*} All that remains is to give our equations to a computer algebra package—Matlab has the gbasis function—in order to perform Buchberger’s algorithm and compute the Gröbner basis for this system.

In doing so, it returns: \begin{align*} a-1&=0, & b-3&=0, & c-4&=0, & d-2&=0,\\ e-4&=0, & f-2&=0, & g-1&=0, & h-3&=0,\\ i-2&=0, & j-4&=0, & k-3&=0, & l-1&=0,\\ m-3&=0, & n-1&=0, & o-2&=0, & p-4 &= 0. \end{align*} This basis consists of a system of linear polynomials from which we can easily read off the solutions to the shidoku board:

Sudoku intertwines powerfully with the realm of both graph theory and abstract algebra. The algorithms we learn in our courses provide valuable insights and strategies that can be applied to conquer sudoku puzzles. So, the next time you pick up a sudoku puzzle, remember this beautiful layer of graphs and polynomial equations that lies beneath its surface.

Sara is an undergraduate student at the University of Georgia, interested in algebra. Outside of math (sic), she enjoys hiking and playing board games (her favourite is Splendor).

More from Chalkdust