Boyer-Moore Majority Vote

Problem

Consider the problem of identifying the element that appears the majority of times in a sequence. Such an element, i.e. majority element, must occur > 50% of the time in the sequence.

Example

In the sequence [A, B, A, C, A, A, B], the majority element is A since it appears 4 out of 7 times (> 50%).

Algorithm analysis

Pseudocode

From the associated Wikipedia article, we find the following pseudocode.

Initialize an element m
Initialize a counter c ← 0
for each element x of the input sequence:
    if c = 0, do 
        m ← x
        c ← 1
    else if m = x, do
        c ← c + 1
    else do
        c ← c - 1
return m

Interpretation

  • Let S denote the sequence of elements, which has length N. Let $I_M$ denote the set of all indices in S where M appears, i.e. $I_M = {i : S[i] = M, i = 0, 1, \dots, N - 1 }$ if we use 0-indexing.

  • We have a single placeholder variable m, that when a majority element M does exist, is guaranteed to equal M after iterating through the whole sequence using the algorithm.

  • Since M makes up >50% of all elements in the sequence, there exists a 1-to-1 matching between each non-M element to a unique element equal to M, with leftover unmatched elements still equal to M.

  • Therefore, the last increment operation performed on counter c must (1) either be done by some M element or (2) decremented by some non-M element, but leaves counter positive. This keeps the current value of m still set to M.


This site uses Just the Docs, a documentation theme for Jekyll.