A mathematical view of voting systems

Why voting systems can never be fair

post

Soon after I began my undergraduate degree, Barack Obama won the 56th United States presidential election. The next day, my pure maths tutor asked me if I had followed it closely, saying that elections really interested him. I was surprised to hear this: surely the only maths involved was adding up the number of votes? In reality, voting systems hold considerable interest for mathematicians, and there are several mathematical results and theorems concerning electoral processes. The main thing that I like about the language of mathematics is that it allows us to make extremely precise statements without ambiguity and, as you’ll see, we can make precise mathematical statements about voting systems—with some surprising results.

Transitive preferences and voting systems

We define an electorate $E$ to be a vector of voters $\left(v_{1},\dots,v_{n}\right)$. Each voter $v_{i}$ has a ranking (with ties allowed) of the vector of candidates $C=\left(c_{1},\dots,c_{m}\right)$. We write $x\succ_{i}y$ to mean that voter $v_{i}$ prefers $x$ to $y$. The ranking is transitive, meaning that if $x\succ_{i}y$ and $y\succ_{i}z$ then we must have $x\succ_{i}z$. An interesting property of the votes is that if a majority of voters prefer $x$ to $y$, and a majority prefer $y$ to $z$, it does not necessarily follow that a majority prefer $x$ to $z$. A simple example with three voters is $x\succ_{1}y\succ_{1}z$, $z\succ_{2}x\succ_{2}y$ and $y\succ_{3}z\succ_{3}x$. I’ll use the notation “$\succ$”, without the subscript $i$, to refer to the preferences of a group of voters.

A voting system is an algorithm that takes the voters’ preferences and outputs a relation between candidates, which we denote by “$\succ_{E}$”. So $c_{1}\succ_{E}c_{2}$ would mean that the voting system rated $c_{1}$ as preferable to $c_{2}$. The relation $\succ_{E}$ is also transitive. Formally, if $T(C)$ denotes the set of total orderings of $C$ then a voting system is a function $F:T(C)^{n}\rightarrow T(C)$, where $n$ is the number of voters.

In the case of two candidates, May’s theorem (1952) provides some good news. Consider a voting system that satisfies the following sensible conditions:

  • Each voter is given the same weight. (For any permutation $\sigma$ of the voters, $F(E)=F(\sigma(E))$.)
  • If we relabel the candidates then the result must be similarly relabelled. (For any permutation $\tau$ of the $m$ candidates, if $C$ is permuted to become $\tau(C)$ then $F (E)$ becomes $\tau(F(E))$.)
  • If $c_{1}\succ_{E}c_{2}$ or $c_{1}\approx_{E}c_{2}$ (the latter representing no preference between $c_{1}$ and $c_{2}$) and $c_{1}$ increases in popularity then $c_{1}$ must win. (If a voter $v_{i}$ changes their ranking from $c_{2}\succ_{i}c_{1}$ to $c_{1}\approx_{i}c_{2}$ or $c_{1}\succ_{i}c_{2}$, or from $c_{1}\approx_{i}c_{2}$ to $c_{1}\succ_{i}c_{2}$, then $c_{1}\succ_{E}c_{2}$).

May’s theorem tells us that the only voting system satisfying these properties is plurality, where each voter casts a vote for their favourite outcome and whichever outcome is more popular is chosen. So in the case where there are only two options, there is a procedure that accurately reflects the wishes of the electorate.

If there are more than two alternatives then Arrow’s theorem (1950) provides a negative counterpart to May’s theorem. Consider the following criteria for a voting system:

  1. If every voter prefers candidate $x$ to candidate $y$ then $x$ must be ranked higher than $y$ by the voting system ($\forall E\ \, \forall x, y \; (\forall i\ x\succ_{i}y)\implies x\succ_{E}y$). If every voter is agreed on the ranking of two candidates, it is essential that the voting system reflects this.
  2. Suppose that $x$ and $y$ are candidates and that $x\succ_{E}y$. If some of the voters change their preferences for the other candidates (eg candidate $z$ becomes more popular), but each voter preserves
    their preference between $x$ and $y$, then $x\succ_{E}y$ must still hold. Consider a group that wants to rank candidates for a job position. They evaluate all the candidates and agree that they prefer candidate $x$ to $y$. If they then receive new information about $z$ (but no new information about $x$ and $y$), it seems strange that they should reconsider their group preference $x\succ_{E}y$.
  3. There is no dictator, defined as a single voter $v_{i}$ who can always determine the election ($\forall E\ \forall x,y \; \nexists v_{i}\ (x\succ_{i}y\implies x\succ_{E}y)$). This is an essential property for a voting system to have!

No voting system exists that satisfies the three criteria.

Perhaps surprisingly, if there are more than two candidates then Arrow’s theorem shows that no voting system exists that satisfies all three criteria. Proofs of this theorem usually use contradiction, assuming that the system satisfies the first two conditions and showing that it cannot satisfy the third. Most systems that are used in the world satisfy conditions 1 and 3 but do not satisfy condition 2. Condition 2 is known as independence of irrelevant alternatives.

Plurality (AKA first-past-the-post)

uk_ballot

A British ballot paper, using the plurality, or the first-past-the-post, voting systems


Plurality is a very common voting system. Each voter chooses their favourite candidate, and whichever candidate receives the most votes is declared the winner. Problems with this system are well known, and I will give an example of how it violates independence of irrelevant alternatives. Consider an election with three candidates $x,y$, and $z$ where the voters can be split into three groups, and their preferences are shown in Table 1. Under plurality, candidate $x$ would win. But if nine or more voters change their preference from $x\succ z\succ y$ to $z\succ x\succ y$, then the result is a victory for $y$. This is in clear violation of independence of irrelevant alternatives since these voters kept their preference $x\succ y$, with the situation being exactly as that described in the definition.

This is also a violation of the majority loser criterion, which is as follows: if a majority of voters prefers every other candidate over a given candidate, then the given candidate must not win. Once the nine voters switch their support, there is still a majority that prefers both $x$ to $y$ and $z$ to $y$.

Number of voters Preferences
110 $x\succ z\succ y$
102 $y\succ x\succ z$
8 $z\succ x\succ y$
Table 1: Voters’ preferences.

In Table 1 above, candidates $x$ and $z$ are clones, meaning that no voter ranks any other candidate to be between (or equal to) $x$ and $z$. This definition expands to a multi-member clone set, a strict subset of the candidates with the property that no voter ranks any candidate outside the clone set to be between (or equal to) any members of the clone set. The independence of clones condition requires that removing a clone from a clone set must not improve or harm the ranking of any candidate outside the clone set. Plurality violates this condition because $z$’s removal would hinder candidate $y$. There is, therefore, a simple way to affect the outcome of a plurality election in your favour without having to convince anyone else to support you. If you introduce a clone of an opponent then the vote for your opponent may split between your opponent and their clone, meaning that you require fewer votes to win. In practice, this fact is well known and some people in British elections do not vote for their preferred candidate because they do not want to split the vote against the party they dislike.

Alternative vote

On 5 May 2011, the UK voted to keep the plurality system for parliamentary elections, rejecting the alternative vote. The alternative vote is defined as follows: voters rank the candidates, and the initial results are based on each voter’s first-preference votes. If a candidate receives more than half of first-preference votes then that candidate wins. Otherwise, the candidate with the fewest first-preference votes is eliminated. Votes for the eliminated candidate are added to the totals of the remaining candidates based on who is ranked next on each vote. If no remaining candidates are ranked on a vote, the voter is assumed to be indifferent between them and the vote is discarded. This process continues until one candidate wins by obtaining more than half the votes.

An Australian ballot paper, using the alternative vote system.

An Australian ballot paper, using the alternative vote system.

The alternative vote is, by Arrow’s theorem, not independent of irrelevant alternatives, but it is clone-proof, unlike plurality voting. Consider the example above where 9 voters changed their preference from $x\succ \!z\succ \!y$ to $z\succ \!x\succ \!y$. If we run an alternative vote election here, no candidate would have a majority of first-preference votes, so candidate $z$ would be eliminated. Candidate $z$’s 17 votes would then be given to $x$, giving $x$ a majority.

The alternative vote satisfies the majority loser condition (if a majority of voters prefer every other candidate over a given candidate, then the given candidate cannot win), whereas plurality does not.

So is the alternative vote a better system than plurality? It does have some nice properties that plurality does not have, but the
opposite is also true.

In order for a voting system to be consistent, we require that if we arbitrarily break the electorate into two parts and both parts of the electorate declare the same winner, then an election on the whole electorate should declare the same winner as the sub-electorates. Plurality is a consistent voting system but the alternative vote is not. Consider the example of two sets of voters and their amalgamation in Table 2, from Woodall (1994). For electorate (a), $z$ is eliminated so $x$ gains 0 votes and $y$ gains 3, making $y$ the winner. For electorate (b), $x$ is eliminated so $y$ gains 3 votes and $z$ gains 0, making $y$ the winner again. In the election with the merged electorate (a) + (b), $y$ is eliminated, so $x$ gains 0 votes and $z$ gains 8 votes, making $z$ the winner.

Table 2: Voter preferences in two electorates.
Voter preference Voters (a) Voters (b) Voters (a)+(b)
$x\succ \!y\succ \!z$ 6 3 9
$y\succ \!z\succ \!x$ 4 4 8
$z\succ \!y\succ \!x$ 3 6 9
How to steal an election by gerrymandering

How to steal an election by gerrymandering

Consistency is an important property for protecting against gerrymandering of electoral districts: if two areas both support the same candidate using the alternative vote, it is possible to merge the two areas and have the newly created area elect a different candidate.

Table 3: An electorate showing that the alternative vote is not monotonic.
Voter preference Voters Voters’
$z\succ \!x\succ \!y\succ \!w$ 5 5
$y\succ \!z\succ \!x\succ \!w$ 4 4
$y\succ \!x\succ \!z\succ \!w$ 2 0
$x\succ \!y\succ \!z\succ \!w$ 6 8
$w\succ \!z\succ \!x\succ \!y$ 9 9

A voting system satisfies the monotonicity criterion if the following holds: given an electorate $E$, if some voters improve their ranking of $x$ without changing the rankings of other candidates to create a new electorate $E^{\prime}$, then for any $y$, $x\succ_{E}y\implies x\succ_{E^{\prime}}y$, ie candidate $x$ cannot be harmed if more voters favour $x$ than before. As with consistency, plurality satisfies the monotonicity criterion but the alternative vote does not. Consider the following example from Johnson (2005), shown in Table 3. Using the original electorate, $z$ is eliminated in the first round and $z$’s 5 votes pass to $x$, so $x$ has 11 votes. Then $y$ is eliminated and all 6 of $y$’s votes go to $x$ because $z$ is eliminated. So $x$ now has 17 votes and $w$ has 9, so $x$ wins a substantial victory with an electoral preference of $x\succ_{E}w\succ_{E}y\succ_{E}z$. Suppose the two voters with ranking $y\succ \!x\succ \!z\succ \!w$ now change their ranking to $x\succ \!y\succ \!z\succ \!w$, making $x$ even more popular than before. Then under the new electorate (denoted Voters$^{\prime}$ in Table 3), $y$ is eliminated in the first round and $y$’s 4 votes pass to $z$, so $z$ has 9 votes. Then $x$ is eliminated and all 8 of $x$’s votes go to $z$. So the final result is that $w$ has 9 votes and $z$ has 17, and the new electorate preference is $z\succ_{E^{\prime}}w\succ_{E^{\prime}}x\succ_{E^{\prime}}y$. As a result of $x$’s increased popularity, $x$ has been demoted from first choice to third choice!

A related but different property to monotonicity is the participation property. Under the alternative vote, it is possible for a voter to harm their preferred candidate by voting, whilst this is not possible with plurality.

Conclusion

This is only a very brief introduction to this area, and there is much more to be said. What I find engaging is that mathematical devices and constructions such as theorems and counterexamples can be applied to voting systems in order to compare them. We can say with complete confidence that no preferential voting system will ever be devised that satisfies Arrow’s three criteria, and we can also make statements about all possible elections run under a voting system, such as “the alternative vote is clone-proof”.

Many different voting systems are in use around the world, and it is a mathematical certainty that every electoral system lacks some desirable characteristics, so when discussing which system is the “best” it can only ever be a debate about which criteria people think are most important, and that may depend heavily on the context of the election.

References and further reading

Alex did a statistics PhD at Imperial College London, studying change point detection. He now works as a quantitative analyst at G-Research.

More from Chalkdust