seek=13                 | bot | mid | top
bot=-1                  |  -1 |     |
top=15                  |     |     |  15
while (15 - -1 > 1)     |     |     |
  mid=(15 + -1)/2       |     |   7 |
  array[7]=8 < 13       |     |     |
  bot=7                 |   7 |     |
while (15 - 7 > 1)      |     |     |
  mid=(15 + 7)/2        |     |  11 |
  array[11]=14 >= 13    |     |     |
  top=11                |     |     |  11
while (11 - 7 > 1)      |     |     |
  mid=(11 + 7)/2        |     |   9 |
  array[9]=10 < 13      |     |     |
  bot=9                 |   9 |     |
while (11 - 9 > 1)      |     |     |
  mid=(11 + 9)/2        |     |  10 |
  array[10]=12 < 13     |     |     |
  bot=10                |  10 |     |
! (11 - 10 > 1)         |     |     |
return top=11           |     |     |
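This trace follows a version of the following shape (a sketch, reconstructed
from the trace and from the logical analysis of AL later in this chapter;
the method name is ours, and the definitive code is in the template file
Binary.java):

// Sketch of the version traced above (what the later sections call AL):
// bot sits one position below the segment still to be searched,
// and top one position above it.
static int binarySearchAL (int[] array, int seek) {
    int bot = -1;
    int top = array.length;
    while (top - bot > 1) {
        int mid = (top + bot)/2;
        if (array[mid] < seek) bot = mid;
        else /* array[mid] >= seek */ top = mid;
    }
    return top;
}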
3 Experiments
Use check to verify that AL-DL are correct by doing things like
check binary-al
check Binary.java choice=3
of which the first version uses the "model answer" built into the
TESTER and the second uses the template file Binary.java in
your filespace, in which AL is selected by setting choice=1,
BL=2, CL=3 and DL=4.
Paste in the correctness reports of the TESTER.
Verify that each version is able to find some occurrence
of any value that is actually present, using tests like
run Binary.java choice=1 array=1,3,4,6,6,6,8,8,8,10,12,14,14,14,15 seek=6
Be careful to check the extreme values, 1 and 15.
When a search is done for a value that occurs more than once, can you
characterise which occurrence each of the eight versions finds?
Now consider a search for a value that does not occur in the array.
What position does each method return? Don't forget the extreme
cases, i.e. values that are smaller (or larger) than all of the values
that do occur in the array.
What happens if the array is of zero length?
run Binary.java choice=1 size=0
Do you get an ArrayIndexOutOfBoundsException? What position
is returned? Is this consistent with the specification of the
behaviour that you have just formulated, where an absent value was
sought in a non-empty array?
Change the initialisation bot=-1; in AL to 0 or
-2, and check the behaviour for small values. What goes
wrong?
Similarly, change the initialisation top=size to size-1.
Explain how you would change the code to search the segment
array[3] to array[13] inclusive, so that it
never accesses array[2] or array[14].
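In AL, for instance, only the initialisations would need to change; a sketch
(relying on the fact, examined in the later analysis, that mid always lies
strictly between bot and top, so that neither array[2] nor array[14] is
ever read):

    int bot = 2;      // one position below the segment array[3..13]
    int top = 14;     // one position above it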
Change the return value from bot to top or vice
versa in all four versions.
There is a slight difference in the amount of time that they take.
Find out the coefficient of n log(n) and paste it in as a comment
in the source file. Which is faster?
Does the timing graph depend on the range of random
increments in the array values? (You will need to make range
very small or density large to see any difference.)
4 The pre-condition
These algorithms only work when the array satisfies
a certain precondition.
What is this condition? Write it in logical notation.
What does the program do if it is not satisfied?
Write the JAVA code to test whether an array satisfies this property.
How long would it take to execute as a function of the size of the
array?
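For reference, a linear-time check of the property might look like the
following sketch (the property in question being that the array is in
ascending order; the method name is ours):

static boolean satisfiesPrecondition (int[] array) {
    // compare each adjacent pair once, so the time is linear in the length
    for (int i = 0; i + 1 < array.length; i++)
        if (array[i] > array[i + 1]) return false;
    return true;
}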
When you use a dictionary, do you (as a human) make such a
check? Describe carefully what the effect would be on the usefulness of
the dictionary if a few of the words failed to satisfy this condition.
This verification takes time that depends linearly on the
length of the array (and sorting it,
i.e. enforcing rather than verifying it, would
take time at least O(n log n)), which is much greater than
the time that the search itself takes. To verify or enforce the
precondition would therefore entirely defeat the point of using binary
search.
In the shift exercise, the TESTER insisted that you should
check the precondition before proceeding with the shift, and throw an
Exception if it fails.
Discuss the consequences of trying to proceed with shift, or with
binary search, in a situation where the precondition is not
satisfied. How, on the other hand, do the costs of checking
the precondition compare?
5 Time complexity
On paper, find out how many times you go round the loop if
the array is of size 3, 7, 15, 31, 63, ..., or more generally
of size 2^k − 1.
How many bits long is the number n, written in binary, if
2^(k-1) ≤ n ≤ 2^k − 1?
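(For instance, the trace in the table above, on an array of size
15 = 2^4 − 1, goes round the loop exactly four times.)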
You may have learned about logarithms to base 10 or base e
in Calculus at school. In Computer Science we always use
logarithms with base 2, and usually round them up to the nearest
whole number.
Now use the TESTER's check command to find out how long
binary search takes on arrays of size 3, 7, 15, 31, 63, ... and so on
up to the biggest that the JAVA run-time system will allow.
Shut down all unnecessary programs, especially Netscape.
It will be possible to run the program for arrays that are larger than
the available RAM, because your workstation will start using
virtual memory, on disk.
This will completely ruin your timings.
You can tell that this is happening by listening to the disk
and watching the lights on the box.
The time taken to execute the code depends logarithmically on
the length, and is therefore substantially faster than linear
search.
Look inside the file Binary.plot to find out how to plot a
graph of time against the logarithm of the size, so 2, 4, 8, 16,
32, 64, ... are equally spaced.
If you are very careful with the experiments, you can find out
the size of the primary memory cache within your CPU, and
how long it takes to fetch data from the secondary cache within the
CPU.
6 Mirror-image algorithms
Within your copy of the file Binary.java, make a copy
of the cases 1-4 (AL-DL), calling them cases 11-14,
and change < to <=. So, for example, case 2 (BL)
becomes
case 12: {
    int bot = 0;
    int top = size;
    while (top > bot) {
        int mid = (top + bot - 1)/2;
        if (array[mid] <= seek) bot = mid + 1;
        else /* array[mid] > seek */ top = mid;
    }
    return top;
}
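Similarly, case 1 (AL) would become something like the following sketch:

case 11: {
    int bot = -1;
    int top = size;
    while (top - bot > 1) {
        int mid = (top + bot)/2;
        if (array[mid] <= seek) bot = mid;    // < changed to <=
        else /* array[mid] > seek */ top = mid;
    }
    return top;
}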
Does this still return the position of the leftmost occurrence
when seek occurs several times?
How does the position returned by the algorithm now relate to the
position where the value should be inserted if it is absent?
You may need to change the return values in cases A, C and D
to get this right.
What are appropriate names (instead of AL-DL) for these four versions
of the algorithm?
What behaviour (in particular for missing values) would be appropriate
if this algorithm were to be used as the search subroutine for
insertion sort?
7 Early exit
If you tried to program binary search for yourself without looking at
(the four versions of) the code at the beginning of this chapter,
you probably wrote something like
int CX (int[] array, int seek) {
    int bot = 0;
    int top = array.length - 1;    // the template's "size" is array.length here
    while (top >= bot) {
        int mid = (top + bot)/2;
        if (array[mid] == seek) return mid;
        if (array[mid] < seek) bot = mid + 1;
        else /* array[mid] > seek */ top = mid - 1;
    }
    return bot;
}
which certainly appears in many of the textbooks. We shall say that
this version of the algorithm makes an early exit.
Include CX in your copy of Binary.java, calling it case 23.
The other three versions can be modified in a similar way, and we call
them AX, BX and DX or cases 21, 22 and 24.
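For instance, AX might look like the following sketch (simply the early-exit
test added to AL, with everything else left unchanged):

case 21: {
    int bot = -1;
    int top = size;
    while (top - bot > 1) {
        int mid = (top + bot)/2;
        if (array[mid] == seek) return mid;    // the early exit
        if (array[mid] < seek) bot = mid;
        else /* array[mid] > seek */ top = mid;
    }
    return top;
}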
Now, when you use CX to search for a value that occurs more than once,
does it return the leftmost or rightmost occurrence?
On the other hand, if the value is not present, does CX still return
the position before which to insert it?
Which line do you have to change to obtain the position after
which to insert a missing value?
As you see, this "early exit" test means that the algorithm
satisfies a weaker specification. But the point of adding this
test was to make it run faster.
Does CX actually perform the search any quicker?
Assuming that n = 2^r, that no value is repeated in the array, that we search
for each value with equal likelihood, and that we never search for absent
values, in how many cases is the search successful after exactly k iterations?
What is the average number of comparisons made in the search?
Hint:
1/2 + 2/2^2 + 3/2^3 + 4/2^4 + 5/2^5 + 6/2^6 + … = 2
What saving does this represent, on average, compared to the r
comparisons that are made in every case with no early escape?
In our code (comparing integers), a three-way test is actually twice
as expensive as a two-way test, so is there any advantage?
Now consider that, in practice, unsuccessful searches are performed,
as well as successful ones. How does this affect the conclusion?
What difference does this make to the time?
One might suppose that C (the version with the early exit and three-way
test) would be faster than S (the corresponding version with only the
two-way test), because C escapes from the loop as soon as it finds the value.
On the other hand, the three-way test in C takes longer
than the corresponding two-way test in S.
In the JAVA implementation, S turns out to be faster.
Suppose that we were instead searching a String[] array,
using the JAVA library method
int comparison = string1.compareTo(string2);
which returns a negative number, zero or a positive number according as
string1 comes before, at the same place as, or after string2 in the dictionary.
This method, which we would call only once per iteration in either C or S,
takes considerably longer than the rest of the code,
so avoiding one or two calls to it could quite easily outweigh
the cost of the three-way test amongst negative, zero and positive.
The COMP instruction in SF4 machine code (from the CS1 course)
also makes a 3-way test,
in the sense that it sets N and Z flags
inside the CPU in such a way as to distinguish the three cases.
As with compareTo, we would use it only once per iteration
in either C or S.
Donald Knuth
claims that C is faster unless the array is very large,
though I don't believe this.
Note: this is still a comparison between the two algorithms
C and S, not between JAVA and machine code.
Assuming that the value occurs exactly once, the early exit saves
just one iteration on average. (This follows from a study of how
many nodes of a binary tree are leaves, at height 1, at height 2,
etc., as in the analysis of merge sort.)
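A sketch of the calculation behind this claim, under the assumptions of the
earlier question (n = 2^r values, each equally likely to be sought): roughly
2^(k-1) of the values are found after exactly k iterations, so the average
number of iterations is approximately
\[
  \frac{1}{2^{r}} \sum_{k=1}^{r} k\,2^{k-1}
  \;=\; \sum_{j=1}^{r} \frac{r-j+1}{2^{j}}
  \;\approx\; r \sum_{j \ge 1} \frac{1}{2^{j}}
  \;-\; \sum_{j \ge 1} \frac{j-1}{2^{j}}
  \;=\; r - 1,
\]
one fewer than the r comparisons made when there is no early escape;
the last sum is evaluated using the hint given above.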
8 Changing the loop test and the calculation of mid
99% of while loops that you will ever meet can be proved to
terminate by observing that the loop variable is a whole number that
increases or decreases by 1 or more on each iteration, between
pre-determined bounds. As we shall now see, termination of binary
search is not quite this obvious, and, on the other hand, some curious
things happen if you exit the loop too early.
Change the loop condition (while (top > bot) etc.).
What happens if you make the condition stronger? For example, change AL to
while (top - bot > 2)
What happens if you make it weaker?
while (top > bot)
In this case the program is more inclined to carry on going round the loop.
The one remaining thing to change is the calculation
mid = (top + bot + 1)/2;
An easy mistake to make (you will do this often) is to miss out the
brackets. Try this, and find out whether it makes the algorithm
give wrong results.
On the other hand, does it necessarily produce any result at all?
A more reasonable version of this "mid-point" calculation is
(top+bot)/2. Do the same experiments.
Keeping your alternative calculation for mid for which the
algorithm fails to terminate, change the loop condition to fix the
problem. What is wrong now? Can you fix it with extra code after
the end of the loop? Make sure that this still works with arrays
of length zero.
9 Termination
The root of the problem is that integer division by 2 does not necessarily
make a number smaller, or turn one non-zero number into another.
Annoyingly, JAVA gives (-1)/2==0,
and this is significant in one case,
but for this section let us assume that division always rounds down,
to less positive or more negative whole numbers.
Draw out a table whose rows and columns are each labelled from −1 to +6.
Calling the number of the column top and the row bot,
calculate (top+bot)/2 in each square of the table.
Make six copies of this table, and two of another containing the values of
(top+bot+1)/2.
Label the tables AL ... DR according to the assignment to
mid inside each loop.
From each square of each table, draw vertical red arrows indicating the
effect of an assignment to bot if one branch of the conditional
is taken. To ensure termination, the arrows should go at least one square
downwards, but some of them go upwards or are loops.
Similarly, draw green horizontal arrows to indicate the effect of assignments
to top.
The normal initial situation is in the top right triangle of the table.
From here, the arrows point down and left towards the diagonal.
There is a mirror-world (with top<bot) in the bottom left of the table,
where the arrows generally point up and right, but still towards
the diagonal.
However, near the diagonal there are loops within single squares,
and loops linking pairs of squares.
This region is called the attractor set.
By this we mean the smallest region with the property that:
- wherever you start, you eventually end up inside the attractor, and
- once you're inside the attractor, you can't get out!
Here we have a tiny, tiny, tiny example of chaos.
If you get bored or make a mistake drawing the two-dimensional tables,
notice that they have a diagonal translation symmetry:
the behaviour really only depends on the value of top-bot,
not on top and bot separately.
(This is where the stupid standard implementation of integer division
makes a mess of things.)
Drawn out in this one-dimensional fashion, the attractor set for
top-bot has just two or three elements, near to zero.
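If you want to check the one-dimensional table mechanically, a sketch like
the following (assuming the AL assignments and mathematical floor division,
as in the convention adopted above) prints the two possible successors of
each value of d = top-bot:

// Sketch: the one-dimensional map on d = top - bot for version AL.
// bot = mid changes d to top - floor((top+bot)/2) = d - floor(d/2);
// top = mid changes d to floor((top+bot)/2) - bot = floor(d/2).
// Math.floorDiv rounds towards minus infinity, unlike JAVA's / operator.
public class AttractorSketch {
    public static void main (String[] args) {
        for (int d = -3; d <= 6; d++) {
            int afterBot = d - Math.floorDiv(d, 2);
            int afterTop = Math.floorDiv(d, 2);
            System.out.println("d = " + d + "  ->  "
                + afterBot + " (bot=mid)  or  " + afterTop + " (top=mid)");
        }
    }
}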
Although there are some coincidences,
it is by no means the case that all eight versions of binary search
have the same attractor set -
that's why we needed three different loop conditions,
and, for the D versions, an untidy extra test after the loop.
Compare the top of the attractor set with the values that were
chosen for the loop condition.
Now it is obvious why the program didn't terminate if we made the
loop condition weaker.
But notice something else: in some cases all of the arrows that enter
the attractor set arrive at the same point, in others they don't.
In the former case, we know, after the loop, exactly what the
value of top-bot is.
In the latter we don't, and, as a result, may have to do some further tests.
10 Proving termination
The basic idea behind a proof of termination is that the value of
top-bot is strictly reduced on each iteration.
The modifications that you made to the programs,
and the analysis of attractor sets, both show that this argument
needs to be considered more carefully.
Doing so is, however, much easier than the work involved in calculating
the attractor sets.
In cases A and D, the loop condition says that bot + 1 < top,
from which it is easy to deduce that bot < mid < top,
but the first < becomes a <= for BL, CL and CR,
and the second does so in the cases BR, CL and CR.
Using these little bits of arithmetic,
we can check that top-bot really is reduced on each iteration.
11 Proving correctness
If the loop terminates normally, with no early escape,
which of top, mid or bot points to x?
When the loop exits, its condition having failed, we have some bound
on the value of top-bot,
although careful examination of the attractor sets shows that we may not
necessarily know exactly what the final relationship is between
top and bot.
The condition inside the loop does, however, tell us something very
interesting.
After each assignment to bot, we know one of
A[bot] <= x,   A[bot] < x,   A[bot-1] <= x   or   A[bot-1] < x,
depending on which version of the code we're considering.
Similarly, after each assignment to top, we know one of
x < A[top],   x <= A[top],   x < A[top+1]   or   x <= A[top+1].
Moreover, the initialisations of bot and top were also chosen
to ensure that these properties trivially hold before we enter the loop.
Well, they actually involve positions in the array that are out of bounds,
so they only hold if we adopt the convention that A[-1] = −∞ and A[size] = +∞.
Now, putting these inequalities together, you should be able to prove
that each of the eight versions of binary search must behave
in the way that you found out by experiment in the first section.
Again, begin by drawing the pictures for the loop invariant
for the four versions of the binary search.
The comparisons that you have to do inside the loop are different
for the four cases.
In what way is (c) now slightly simpler than the other versions?
12 Logical analysis of version AL
Note: be careful to distinguish between the old and new values of variables
to which assignments are made (top and bot).
(a) The assignment bot=mid is executed after the test
(top - bot > 1) has succeeded and the initialisation
int mid = (top + bot)/2 has been made.
For any integers b and t, we claim that
t − b > 1 ⇒ (t + b)/2 > b.
Indeed, as these are integers, t ≥ b+2, so
(t + b)/2 ≥ (b + 2 + b)/2 = b + 1 as required.
Hence the variable bot is increased by at least 1.
Beware that, since we are dealing with integer division,
it is not admissible to "multiply throughout by 2" to eliminate
the division.
Mathematical notation - which, after all, you learned at school -
is essential for such reasoning, as English words are imprecise.
Maths is also much quicker to write!
Note that mid may often be closer to bot than to
top, for example if bot=8, top=11,
so mid==9.
(b) This calculation would no longer be valid
if the loop condition were changed to
while (top - bot > 0).
Eventually we (may) reach the state top==bot+1
and continue executing the loop,
in which case the value of mid is (t+b)/2=(b+1+b)/2=b,
leaving bot unaltered.
(c) Suppose, for example, seek=7, bot==3, top==4
and array[3]=6. Then we re-assign the same value, bot=3,
and go round the loop again in exactly the same state as before,
and again, and again, ...
On the other hand, with seek=5, we would instead assign
top=3 and leave the loop.
So this version sometimes fails to terminate.
(d) Similarly, the assignment bot=mid makes the new value
of bot at most top-1,
because, as in (a), (t+b)/2 ≤ (t+t−2)/2 = t−1;
by (a) itself, the assignment top=mid likewise leaves the new value of
top at least bot+1, so either way top - bot >= 1 after the loop body.
(e) After the assignment bot=mid has been made,
array[bot] < seek.
(f) Similarly, after the assignment top=mid has been done,
seek <= array[top].
(g) The two logical statements in (e) and (f) are not, as they stand,
valid just after the initialisations
bot=-1 and top=array.length,
since these are "out of bounds".
We may, however, restore meaning to the statement either by
adopting the convention that array[−1]=−∞
and array[array.length]=+∞,
or by using the more complicated statement
(bot = −1 ∨ array[bot] < seek) ∧ (top = array.length ∨ seek ≤ array[top])
(though even this makes illegitimate use of array[-1] and so depends
on a convention about ∨).
Of course, this convention is only appropriate to the discussion
of an array that's sorted in ascending order!
(h) Suppose that the logical statement in (e) happened to be valid
just inside the beginning of the body of the loop,
but that the else branch
(including assignment top=mid) is executed.
Then (e) is still valid, because bot, seek and the
contents of the array have not been altered.
Such statements about what has not happened are just as
important as saying what has happened,
cf. that the rest of the array has stayed the same when
part of it has been shifted.
(i) Therefore, on exit from the loop we may conclude
either that array[bot] < seek <= array[top],
or the more complicated statement in part (g).
This is because this statement
- has been verified on entry to the loop, and
- is preserved by each iteration,
- so is therefore still true on exit -
- assuming that the loop does eventually exit!
(j) We exit the loop exactly when the loop condition fails,
i.e. !(top - bot > 1).
(k) Part (d) said that (top - bot >= 1) after the loop body
has been executed, irrespective of whether it was true at
the top of the loop; in fact it is also true after
the initialisation, so this statement is part of the loop invariant.
Putting this together with part (j), we have (top == bot+1)
on exit from the loop.
(l) Substituting (k) into the loop invariant, we have
array[top-1] < seek <= array[top] on exit from the loop.
(m) We have already said that the array must be sorted as a precondition
for binary search to work.
Then
∀i. 0 ≤ i < top−1 ⇒ array[i] ≤ array[top−1] < seek,
so it is not possible for array[i]==seek where i<=top-2.
(n) Hence, if the value seek occurs in the array,
top is the position of its leftmost occurrence.
(o) Suppose that array[top] != seek.
Then, by the same reasoning as in (m),
∀i. top < i ⇒ seek < array[i].
Hence the number seek does not occur anywhere in the array.
(p) More precisely, the value seek lies strictly between those
in the array in positions top-1 and top,
so top is the position before which to insert.
(q) If we replace the loop condition by while (top - bot > 2),
the best that we can say is that top==bot+1 or top==bot+2,
so array[top-2] < seek <= array[top],
but we know nothing about array[top-1].
13 The intermediate value theorem
Binary search provides a constructive proof of a standard theorem in
Analysis.
("Analysis" is what mathematicians call Calculus done rigorously.)
Clearly this would not be part of a test in a computer science course.
"If you're on one side of a river in the morning and on the other side
in the afternoon, you must have eaten your lunch on a boat!"
Let f:[0,1]→R be a real-valued continuous function on the
real unit interval.
Suppose that f(0) ≤ s ≤ f(1).
Then ∃x. 0 ≤ x ≤ 1 ∧ f(x) = s.
The proof usually found in the textbooks considers the point
x = sup { y ∈ [0,1] | f(y) ≤ s }
and uses continuity to prove that f(x)=s,
without actually offering any way to calculate x.
Here is a constructive way to solve the problem
(so long as the function f is such that you can decide,
for each y ∈ [0,1], whether f(y) < s or f(y) ≥ s).
You can apply this method to solving equations like n·log2(n) = 10^6
if you wish.
We define three sequences b_n, t_n and m_n of real numbers
by a recursive procedure.
First, put b_0 = 0 and t_0 = 1.
Then let m_n = (b_n + t_n)/2, and
- if f(m_n) < s, put b_{n+1} = m_n and t_{n+1} = t_n;
- if f(m_n) ≥ s, put b_{n+1} = b_n and t_{n+1} = m_n.
So b_n is an increasing sequence, and t_n a decreasing sequence,
and moreover t_n − b_n ≤ 2^(−n), so both sequences converge to a
common limit. Call it x. This satisfies
∀n. x − 2^(−n) ≤ b_n ≤ x ≤ t_n ≤ x + 2^(−n).
We haven't used continuity of f yet, just as in binary search we didn't rely
on the array being sorted until the very end (question 7(m) and (o)).
By definition, we say that f is continuous at x if
∀ε > 0. ∃δ > 0. ∀y. |y−x| < δ ⇒ |f(y)−f(x)| < ε.
Suppose f(x) > s, so ε = f(x) − s > 0,
let δ > 0 be as in the definition of continuity,
and choose n large enough that 2^(−n) < δ.
But then y = b_n satisfies |y−x| ≤ 2^(−n) < δ, so
|f(y)−f(x)| < ε by continuity.
However, by construction f(y) = f(b_n) ≤ s < f(x),
so f(x) − f(y) ≥ ε, which is a contradiction.
The case f(x) < s also leads to a contradiction by considering t_n,
so we are left with f(x) = s as required.
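To try the method on the suggested equation n·log2(n) = 10^6, a sketch like
the following will do; the bracketing interval [2, 10^6] is a choice of ours,
made so that f(bot) < s ≤ f(top) holds at the start:

// Sketch: bisection applied to n * log2(n) = 10^6.
public class Bisection {
    static double f (double n) {
        return n * Math.log(n) / Math.log(2);    // n * log2(n)
    }
    public static void main (String[] args) {
        double s = 1e6;
        double bot = 2, top = 1e6;               // f(bot) < s <= f(top)
        for (int i = 0; i < 50; i++) {           // 50 halvings are ample
            double mid = (bot + top) / 2;
            if (f(mid) < s) bot = mid; else top = mid;
        }
        System.out.println("n log2(n) = 10^6 at n ~ " + top);
    }
}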