[EN] Divide, Conquer and Binary Search

Problem

Suppose we have two arrays a[1], a[2], ..., a[n] and b[1], b[2], ..., b[m], and a predicate P(a, b).

For each index i (1 ≤ i ≤ n) we want to find the least j (1 ≤ j ≤ m) such that P(a[i], b[j]) is true, or record m+1 if no such j exists.
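
For example, a and b could be arrays of integers with P(a[i], b[j]) meaning b[j] ≥ a[i]: for each a[i], find the first element of b at least as large. This instance is purely illustrative (it is not part of the problem statement), but we will reuse it in the C++ sketches below.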

Naive Solution

With only the information we have, we can't do much. The best solution available so far is brute force: iterate over every i, then over every j, until we find a match.

fn search(i) {
    solution[i] = m+1
    for j in 1..m {
        if P(a[i], b[j]) {
            solution[i] = j
            break
        }
    }
}

for i in 1..n {
    search(i)
}

This is O(n * m).
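
To make the baseline concrete, here is a minimal runnable C++ version of the naive search, instantiated with the illustrative b[j] >= a[i] predicate from above (the function name solve_naive and the predicate choice are mine):

#include <vector>
using namespace std;

// solution[i] = least 1-based j with P(a[i], b[j]), or m+1 if none.
// P(x, y) is taken to be "y >= x" purely for illustration.
vector<int> solve_naive(const vector<int>& a, const vector<int>& b) {
    int n = a.size(), m = b.size();
    vector<int> solution(n, m + 1);
    for (int i = 0; i < n; i++) {
        for (int j = 1; j <= m; j++) {
            if (b[j - 1] >= a[i]) {  // P(a[i], b[j])
                solution[i] = j;
                break;
            }
        }
    }
    return solution;
}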

Additional constraints

Let's introduce a construct that will lead to a faster solution. Consider this definition:

P'(a, B) = { true if there is some b in B such that P(a, b)
           { false otherwise

Let's say we can evaluate P' in O(1), given some O(|B|) precomputation.
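
For the running b[j] >= a[i] instance, such a P' is easy to construct (a sketch under that assumption; p_prime stands in for P'): the precomputation is just the maximum of B, and P' becomes a single comparison.

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

// precompute(B): O(|B|). For our instance, the maximum of B.
int precompute(const vector<int>& B) {
    int best = INT_MIN;
    for (int x : B) best = max(best, x);
    return best;
}

// P'(a, B) in O(1) given the precomputed maximum:
// "some b in B with b >= a" holds exactly when max(B) >= a.
bool p_prime(int a, int pre) { return pre >= a; }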

Then, an alternative strategy becomes available.

Solution

For each i, use binary search to find the least j such that P'(a[i], {b[1], ..., b[j]}) is true.

The pseudocode is as follows:

fn search(i, lo, hi) {
    if hi - lo <= 1 {
        solution[i] = hi
    } else {
        j = (lo + hi) / 2
        pre = precompute({b[1], ..., b[j]})
        if P'(a[i], pre) {
            search(i, lo, j)
        } else {
            search(i, j, hi)
        }
    }
}

for i in 1..n {
    search(i, 0, m+1)
}

If you're thinking this is worse, you're absolutely right: doing all the precomputation required for the searches means we now have complexity O(n * m * log m) (each of the n values needs O(log m) evaluations of P', and each evaluation requires O(m) precomputation).

However, there are two important observations here.

First, when we check P'(a, {b[1], ..., b[j]}) and get false as a result, we don't need to include any of those elements of b in future steps of the search: P(a, b) is false for every one of them.

This means we only need to include the indices after lo, the lower end of the binary search, when evaluating our predicate:

fn search(i, lo, hi) {
    if hi - lo <= 1 {
        solution[i] = hi
    } else {
        j = (lo + hi) / 2
        pre = precompute({b[lo+1], ..., b[j]})    // ONLY this line changed
        if P'(a[i], pre) {
            search(i, lo, j)
        } else {
            search(i, j, hi)
        }
    }
}

for i in 1..n {
    search(i, 0, m+1)
}

Surprisingly, this changes the complexity. Each binary search no longer costs O(m * log m), but only O(m). This is because the size of the range we precompute halves at each step, so we end up precomputing sets of size m, m/2, m/4, m/8, ..., which adds up to less than 2m.

All in all, we are back to O(n * m) complexity.

The second observation is that each binary search does a lot of repeated work in precomputation. For example, all of them precompute {b[1], ..., b[m/2]} in their first step.

A quick and dirty fix would be to memoize the precompute function, but that would obscure some interesting ideas. Instead, let's move the loop over i into the binary search, and see if we can share some computation that way.

To make this possible, we pass a whole list of indices instead of just a single one, and separate the indices into two lists, one for each branch of the recursion.

fn search(is, lo, hi) {
    if sz(is) == 0 {
        return
    }

    true_is = []
    false_is = []

    for i in is {
        if hi - lo <= 1 {
            solution[i] = hi
        } else {
            j = (lo + hi) / 2
            pre = precompute({b[lo+1], ..., b[j]})
            if P'(a[i], pre) {
                add i to true_is
            } else {
                add i to false_is
            }
        }
    }

    if hi - lo > 1 {    // in the base case there is nothing left to search
        search(true_is, lo, j)
        search(false_is, j, hi)
    }
}

search(1..n, 0, m+1)

Looking at this code, we can recognize some obvious waste: the if condition, the variable j, and the precomputation do not depend on i at all, so they can be moved out of the loop.

fn search(is, lo, hi) {
    if sz(is) == 0 {
        return
    }

    if hi - lo <= 1 {
        for i in is {
            solution[i] = hi
        }
    } else {
        true_is = []
        false_is = []
        j = (lo + hi) / 2
        pre = precompute({b[lo+1], ..., b[j]})
        for i in is {
            if P'(a[i], pre) {
                add i to true_is
            } else {
                add i to false_is
            }
        }
        search(true_is, lo, j)
        search(false_is, j, hi)
    }
}

search(1..n, 0, m+1)

With this change something magical happens: we end up with a nice divide and conquer approach, and the complexity jumps all the way down to O((n + m) * log m).

The reason for this is simple:

  • The depth of the recursion is O(log m) (it is the same recursion as binary search).
  • At each level of depth, all recursive calls precompute disjoint intervals, and the sum of their lengths is bounded by m. Over all O(log m) levels, that is only O(m * log m) precomputation.
  • Each of the n indices is tested along exactly one path in the recursion, so we only do O(n * log m) calls to P'.
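
To make the final version concrete, here is a self-contained C++ sketch, again instantiated with the illustrative b[j] >= a[i] predicate and the max-based precomputation (those choices, and names like solve, are mine); for a different P, only the precomputation and the P' test change.

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

vector<int> a, b, solution;  // 0-based storage; answers are 1-based, m+1 means "none"

// Every index in `is` has its answer in the interval (lo, hi].
void search(const vector<int>& is, int lo, int hi) {
    if (is.empty()) return;
    if (hi - lo <= 1) {
        for (int i : is) solution[i] = hi;
        return;
    }
    int j = (lo + hi) / 2;
    // precompute({b[lo+1], ..., b[j]}): for our instance, the range maximum.
    int pre = INT_MIN;
    for (int k = lo + 1; k <= j; k++) pre = max(pre, b[k - 1]);
    vector<int> true_is, false_is;
    for (int i : is) {
        if (pre >= a[i])  // P'(a[i], {b[lo+1], ..., b[j]})
            true_is.push_back(i);
        else
            false_is.push_back(i);
    }
    search(true_is, lo, j);
    search(false_is, j, hi);
}

void solve() {
    solution.assign(a.size(), 0);
    vector<int> all(a.size());
    for (int i = 0; i < (int)a.size(); i++) all[i] = i;
    search(all, 0, (int)b.size() + 1);
}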

It is worth noting that just memoizing the precomputation also gives the same time complexity (with slightly worse space complexity), and the explanation easily maps to the above.

fn search(i, lo, hi) {
    if hi - lo <= 1 {
        solution[i] = hi
    } else {
        j = (lo + hi) / 2
        pre = precompute_range_memoized(lo+1, j)
        if P'(a[i], pre) {
            search(i, lo, j)
        } else {
            search(i, j, hi)
        }
    }
}

for i in 1..n {
    search(i, 0, m+1)
}
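
For completeness, here is one possible precompute_range_memoized for the same max-based instance, reusing the globals from the sketch above (the map-based cache is my own choice, not something the pseudocode pins down). Since j is determined by lo and hi, only the O(m) ranges of the binary search tree ever occur, and each is computed once.

#include <map>

// Cache keyed on the inclusive 1-based range [l, r]; each distinct range is computed once.
map<pair<int, int>, int> range_cache;

int precompute_range_memoized(int l, int r) {  // maximum of b[l..r]
    auto key = make_pair(l, r);
    auto it = range_cache.find(key);
    if (it != range_cache.end()) return it->second;
    int best = INT_MIN;
    for (int k = l; k <= r; k++) best = max(best, b[k - 1]);
    return range_cache[key] = best;
}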

Problem

https://codeforces.com/gym/104252/problem/B -- evaluating P' costs O(log m) as far as I can tell, but this is fast enough given the constraints


About me

My name is Sebastian. I'm a competitive programmer (ICPC World Finalist) and coach. I also enjoy learning about programming language theory and computer graphics.
