[EN] O(log^2(N)) Range Queries on Fenwick Tree Without Inverse Operation

Background: Fenwick Tree

The Fenwick tree is a data structure that is commonly used to perform prefix sum queries and point updates on some array A. Leaving the issue of updates aside, the general idea is that, for each index i, it stores the sum of A over the range (i−f(i), i] (yes, that's an open-closed interval), where f(i) returns the least significant set bit of i.
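Concretely, f is the usual lowest-set-bit trick, and the covered ranges follow directly from the definition (a small illustrative sketch):

```cpp
// f(i) = least significant set bit of i, via the two's-complement trick.
int f(int i) { return i & -i; }

// So index i covers the open-closed range (i - f(i), i], for example:
//   i =  6 (binary  110): f = 2, covers (4, 6]   (2 elements)
//   i =  8 (binary 1000): f = 8, covers (0, 8]   (8 elements)
//   i = 13 (binary 1101): f = 1, covers (12, 13] (1 element)
```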

To compute a prefix sum over [1, i1] we start with the interval (i2, i1] (where i2 = i1 − f(i1)), then we add (i3, i2] (where i3 = i2 − f(i2)), and so on until we cover the entire range. Since each i(k+1) has one fewer set bit than i(k), this can only happen O(log i1) times in the worst case. Therefore any prefix sum query runs in O(log n) worst-case time.

// f(i) = least significant set bit of i.
int f(int i) { return i & -i; }

int fenwick_tree[MAXN]; // 1-indexed
int prefix(int i) {
  int x = 0;
  // Walk down through the O(log i) covering intervals (i - f(i), i].
  for (; i > 0; i -= f(i)) x += fenwick_tree[i];
  return x;
}

It is also easy to compute a range sum over [l, r] by taking the prefix sum over [1, r] and subtracting the one over [1, l−1].

int range(int l, int r) {
  return prefix(r) - prefix(l-1);
}

Generalizing

As presented above, instead of sums we could compute prefix queries for any associative operation, and range queries for any associative and commutative operation that has an inverse. The commutativity requirement can be removed easily, at the expense of more memory and a longer implementation. The inverse requirement, however, is inherent only to the idea of computing a range as the difference of two overlapping prefixes; it is not necessary if we achieve range queries in some other way.
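For example, max is associative but has no inverse: prefix queries still work verbatim, while the two-prefix subtraction trick is unavailable. A minimal sketch (the names are mine; it assumes nonnegative values so that the zero-initialized array acts as the identity, and updates are the "increase-only" kind that max naturally supports):

```cpp
#include <algorithm>

const int MAXN = 100;
int fenwick_max[MAXN + 1]; // zero-initialized globals; 1-indexed

int f(int i) { return i & -i; }

// "Increase-only" point update: push the new value into every node
// whose interval contains position i.
void update_max(int i, int v) {
  for (; i <= MAXN; i += f(i)) fenwick_max[i] = std::max(fenwick_max[i], v);
}

// Prefix query, identical in shape to the prefix-sum version.
int prefix_max(int i) {
  int x = 0;
  for (; i > 0; i -= f(i)) x = std::max(x, fenwick_max[i]);
  return x;
}
```

Note there is no `range_max` here: with the classic query alone, max only supports prefixes.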

Range queries without inverse

Given this data structure, there is a simple greedy algorithm for computing range queries over [l,r]:

  • if (r−f(r),r] is contained in [l,r], accumulate that range and then compute [l,r−f(r)]
  • otherwise, accumulate A[r] and then compute [l,r−1]

It can be shown that this algorithm runs in O(log²(n)) in the worst case.

I will not write a proof, but the intuition is that f(i − f(i)) is at least twice f(i) (unless i − f(i) is zero), so the blocks we accumulate keep at least doubling in size. That means that within O(log n) steps we would cover any interval. We may overshoot instead, but by that time we will have covered at least half of the interval already, so this can only happen O(log n) times.
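To see the doubling in action, here is a trace for the hypothetical range [3, 13], plus a tiny step-counting helper (illustrative only; it touches no data):

```cpp
// Blocks taken by the greedy for [l, r] = [3, 13]:
//   r = 13: f = 1,  (12, 13] fits      -> take it, r = 12
//   r = 12: f = 4,  (8, 12]  fits      -> take it, r = 8
//   r = 8 : f = 8,  (0, 8] overshoots  -> take A[8], r = 7
//   r = 7 : f = 1,  (6, 7]   fits      -> r = 6
//   r = 6 : f = 2,  (4, 6]   fits      -> r = 4
//   r = 4 : f = 4,  (0, 4] overshoots  -> take A[4], r = 3
//   r = 3 : f = 1,  (2, 3]   fits      -> r = 2 < l, done (7 steps)
int f(int i) { return i & -i; }

// Count the greedy's steps for [l, r] without any data.
int steps(int l, int r) {
  int s = 0;
  while (r - l + 1 > 0) {
    if (f(r) <= r - l + 1) r -= f(r); else r -= 1;
    s++;
  }
  return s;
}
```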

The code looks something like this:

int range(int l, int r) {
  int x = 0;
  while (r - l + 1 > 0) {      // while the interval [l, r] is nonempty
    if (f(r) <= r - l + 1) {   // block (r - f(r), r] fits inside [l, r]
      x += fenwick_tree[r]; r -= f(r);
    } else {                   // block would overshoot past l: take A[r] alone
      x += A[r]; r -= 1;
    }
  }
  return x;
}

I used the sum operation for clarity, but it also works for any associative operation with an identity element.

int range(int l, int r) {
  int x = IDENT;
  while (r-l+1 > 0) {
    if (f(r) <= r-l+1) {
      x = OP(fenwick_tree[r], x); r -= f(r);
    } else {
      x = OP(A[r], x); r -= 1;
    }
  }
  return x;
}

For instance, we could use it with the min operation:

#define IDENT INT_MAX
#define OP min
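Putting the pieces together for range-minimum queries gives something like the following end-to-end sketch (the build step is my addition, since the post leaves construction aside; the sample array and `setup_example` are purely illustrative):

```cpp
#include <algorithm>
#include <climits>

#define IDENT INT_MAX
#define OP std::min

const int N = 8;
int A[N + 1], fenwick_tree[N + 1]; // 1-indexed

int f(int i) { return i & -i; }

// O(n) build: fold each value into its node, then push each finished
// node into its parent at i + f(i).
void build() {
  for (int i = 1; i <= N; i++) fenwick_tree[i] = IDENT;
  for (int i = 1; i <= N; i++) {
    fenwick_tree[i] = OP(fenwick_tree[i], A[i]);
    int j = i + f(i);
    if (j <= N) fenwick_tree[j] = OP(fenwick_tree[j], fenwick_tree[i]);
  }
}

// The greedy range query from above, with OP = min.
int range(int l, int r) {
  int x = IDENT;
  while (r - l + 1 > 0) {
    if (f(r) <= r - l + 1) { x = OP(fenwick_tree[r], x); r -= f(r); }
    else                   { x = OP(A[r], x); r -= 1; }
  }
  return x;
}

// Fill A with sample values and build the tree.
void setup_example() {
  int v[] = {0, 5, 3, 8, 1, 9, 2, 7, 4};
  for (int i = 1; i <= N; i++) A[i] = v[i];
  build();
}
```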

Conclusion

The Fenwick tree can perform arbitrary range queries for any associative operation, but if we want to do any kind of update at all, we will need a commutative operation. Also, if we want point-assignment updates -- the most common kind for RMQ, among others -- we would need an inverse operation after all.

Its complexity is worse than that of e.g. segment trees, and in practice I tested it to be even slower than the obvious sqrt-decomposition approach.

The only situation where I would use this is if we were forced to work in place on the input array, as the Fenwick tree uses exactly N words of memory and is not that hard to build in place over an existing array.
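A sketch of that in-place build for the sum case (my addition): each position pushes its partial sum into its parent at i + f(i), so after one O(n) pass the array *is* the Fenwick tree, with no extra memory allocated.

```cpp
int f(int i) { return i & -i; }

// Turn a[1..n] into its own Fenwick tree, in place, in O(n).
void build_in_place(int a[], int n) {
  for (int i = 1; i <= n; i++) {
    int j = i + f(i);       // parent node whose interval contains i
    if (j <= n) a[j] += a[i];
  }
}
```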

Thus I conclude that this is not a useful data structure at all.


About me

My name is Sebastian. I'm a competitive programmer (ICPC World Finalist) and coach. I also enjoy learning about programming language theory and computer graphics.
