Studio 5: The Master Method, and Binary Search

Introduction
Part A: Master Method Practice
Part B: Broken Binary Search
Part C: Handling Repetition in Binary Search

Introduction

The first part of this studio asks you to practice applying the Master Method to recurrences. The second part is about the binary search algorithm for finding a value in a sorted array.

Part A: Master Method Practice

Read the description of the Master Method in this Wikipedia entry.

The version of the Master Method presented here is more general than the one in your textbook and the lecture. The more general version extends the "balanced" case to allow you to solve more recurrences, in particular many recurrences for which the non-recursive term $f(n)$ contains powers of logarithms.

Consider applying the Master Theorem to each of the recurrences below.

If the Master Theorem applies, state the case that applies and the solution to the recurrence according to the theorem.
If the Master Theorem does not apply, state why. (You don't have to solve the recurrence in this case.)

Check with your TA after each five recurrences to see if you're getting the right answers.

$T(n) = 3 T(n/2) + c n^2$
$T(n) = 4 T(n/2) + c n^2$
$T(n) = T(n/2) + 2^n$
$T(n) = 16 T(n/4) + n$
$T(n) = 2 T(n/2) + n \log n$
$T(n) = 2 T(n/2) + n \log^2 n$
$T(n) = 2 T(n/2) + \frac{n}{\log n}$
$T(n) = 16 T(n/4) + n!$
$T(n) = 3 T(n/2) + cn^{\log_2 3}$
$T(n) = 3 T(n/3) + n$
$T(n) = 3 T(n/3) + \sqrt{n} $
$T(n) = 3 T(n/4) + n \log n$
$T(n) = 3 T(n/3) + n \log \log n$
$T(n) = 64 T(n/8) - n^2 \log n$
$T(n) = 7 T(n/3) + n^2$

Part B: Broken Binary Search

We briefly covered binary search in lecture. It takes as input a sorted array $A$ along with its size $n$, plus a query element $x$. If $x$ occurs in $A$, the algorithm returns the index of a location at which $x$ occurs; otherwise, it returns a special value notFound.

We'll refer to this algorithm below as "BSearch". Given (e.g.)
\[ A = [ 2, 5, 7, 12, 19, 36 ], \] we would expect BSearch to behave as follows:

BSearch(A, 6, 7) returns 2
BSearch(A, 6, 19) returns 4
BSearch(A, 6, 13) returns notFound
BSearch(A, 6, 0) returns notFound
BSearch(A, 6, 50) returns notFound

We'll now look at some pseudocode for BSearch. This is an extremely common algorithm, but also one that is very easy to get wrong!

For this section, we'll assume that the array $A$ contains no duplicate values. The basic idea of binary search is as follows:

If $A$ is a single element, then either this element is $x$, or $x$ does not occur in the array.
Otherwise, divide the array $A$ roughly in half, and look at its middle element.
If the middle element is smaller than $x$, then $x$ must appear in the upper half of $A$ if it appears at all.
If the middle element is bigger than $x$, then $x$ must appear in the lower half of $A$ if it appears at all.
Look for $x$ in whichever half of the array is indicated by (2-4).

Try to execute a couple of the examples above following this outline. Each member of the group should try the examples separately and then check with the rest of the group to see if everyone did the same thing.

Question B1: what important details does the above sketch of the algorithm omit?

Working from the sketch, J. Random Hacker has generated the following pseudocode:

  BSearch(A, n, x)
  L <- 0              // lower bound
  H <- n-1            // upper bound
  while (L < H)
    mid <- (L + H)/2  // midpoint
    if (A[mid] <= x)
	  L <- mid        // recur on upper half
	else
	  H <- mid - 1    // recur on lower half
	
  if (A[L] = x)       // A is a single element
    return L
  else
    return notFound

(Here, the division (L+H)/2 is an integer division, i.e. it takes the floor of the real-valued ratio.) When asked to explain why this implementation works, Mr. Hacker states that at all times, the code satisfies the following invariant property:

$x$ exists in $A$ iff $x$ lies in the subarray $A[L..H]$.

Each iteration of the loop narrows down the range in which $x$ occurs to either the upper or lower half of the previous range $L..H$, until $x$ is either found or not.

Question B2: assuming $n > 0$, prove that the invariant holds after any number of iterations of the while loop. (Hint: use an inductive argument on the number of iterations, starting with 0.)

So clearly, if the code satisfies the invariant, it is correct... right?

Question B3: find a small input (still with $n > 0$) for which the above code does not return the correct result. What behavior does your example cause?

Question B4: why was our invariant insufficient to guarantee a correct algorithm?

Question B5: formulate an additional correctness property besides the invariant that, if true, would guarantee that the bad behavior you saw would not happen.

Mr. Hacker, suitably shamed, has provided the following modified code:

  BSearch(A, n, x)
  L <- 0              // lower bound
  H <- n-1            // upper bound
  
  while (L < H)
    mid <- (L + H)/2  // midpoint
    if (A[mid] < x)
	  L <- mid + 1    // recur on upper half
	else
	  H <- mid        // recur on lower half
	
  if (A[L] = x)       // A is a single element
    return L
  else
    return notFound

Question B6: does the modified code satisfy your additional correctness property? Prove it!

Question B7: why does the invariant, together with your additional property, imply that the above code correctly implements binary search for every $n > 0$, when the invariant alone did not?

Part C: Handling Repetition in Binary Search

We're now confident that the revised binary search works correctly for sorted arrays with no repeated elements. But what if the array does contain repeats?

Question C1: review your correctness proof. Does it depend on $A$ containing no repeated elements? If so, can you change the proof so that it also applies to arrays with repeats?

Question C2: if $A$ contains a repeated element $x$, and we call Bsearch(A, n, x), which copy of $x$ is returned by the above code? (Try a few examples and see!)

Question C3: Prove that your belief about which copy of $x$ is returned is always correct.

Now try to change the code so that, when $A$ contains duplicates, it does the opposite of what the above code does. For example, if the code returns the rightmost copy of $x$ in $A$, make it return the leftmost copy instead, or vice versa.

You might have to make multiple changes to get a correct algorithm. First, find a one-line code change that ensures that new property you want. Then, try to change the rest of the code, if needed, so that the two correctness properties hold as well.

Question C4: Write down your modified code and your correctness proof.

When you think you've got it, try to convince your TA that your modified code is correct and does what you claim.