# A Fun Complexity Analysis Problem

A while ago a friend (although I do not know who) posed to me a time complexity analysis problem that I found very interesting. Back then I wrote up a LaTeX document of the problem, as well as a solution, and sent it around to a bunch of my friends who liked Computer Science and Math. I thought it was a rather fun problem but no one I knew tried to solve it. Perhaps the internet would be interested in this sort of problem. As of right now, I have no idea where this exercise came from. I think there a couple fun spins you can put on this problem once you understand it too.

## Problem Statement

Suppose you have a set of **unknown** size $n$ consisting of
**unknown** non-negative integers less that $2^k$ for a **known** integer $k$.
Your only acceptable query operation is to query a given $O(1)$ oracle
$\mathcal{O}$ which will, upon numeric input $\alpha$ return one of three
values:

Can you think of an efficient algorithm to determine $n$? What is its
**worse-case** time complexity? Hint: Design an algorithm that provably runs in
$O(nk)$. Can you improve this bound?

## Answer Key (Try to solve the problem first!)

We can frame this problem as trying to find $n$ elements in a sorted list of size $2^k$. We can use the oracle $\mathcal{O}$ to find each of the $n$ elements included in the set by applying binary search $n$ times with the following criteria:

- $\mathcal{O}(\alpha) = 1$: found an element – exit and add $1$ to our size counter
- $\mathcal{O}(\alpha) > 1$: go left – decrease our guess $\alpha$
- $\mathcal{O}(\alpha) = 0$: go right – increase our guess $\alpha$

After each element that’s counted, we can then check $\mathcal{O}(2^k) = 0$ to ensure that the set has been exhausted and we have found the correct size $n$.

We can calculate the time complexity of this algorithm based on our knowledge of binary search. Recall that the time complexity of finding a single element in an sorted list of size $m$ is $\log(m)$ using binary search. Given the problem statement and our new algorithm, we have that the time complexity of finding the set size $n$ is:

\[\begin{aligned} &O(\log (2^k) + \log (2^k - 1) + \log (2^k - 2) + \cdots + \log (2^k - n)) \\ &< O(n \log (2^k)) \\ &= O(nk) \end{aligned}\]Note that we omit the $n$ $O(1)$ queries that are required to check whether the set has been exhausted. This should be “negligible” in the eyes of Big O notation.

I do not have an answer to “Can you improve this bound?” question at this moment. I will not be working on this problem going forward, but if you have an improved algorithm & bound I’ll gladly post it here! I think there other properties that could be exploited for a more efficient algorithm if you’re feeling up to the challenge of pondering it more.

*Want to chat? The best ways to reach me are through my Mastodon and through my email fm1391<at>nyu.edu. You can also check the “Contact” section on my homepage for other methods of getting in touch.*