# All About Binary Search

##### May 4, 2020

A **binary search** is an algorithm that finds the position of a target value in a *sorted* list. It has several uses in practice:

- Finding entries in sorted arrays, including sorted collections from language libraries like Java, .NET, C++ STL;
- Hunting for errors during the debugging process, which is faster than single-stepping large parts of the code (even Git has a git-bisect command);
- Retrieving 3D game displays according to position and camera.

### Process

- Compare target to middle number, since the last is sorted – this tells us whether the target will be in the left or right half of the list.
- We now ignore the other half of the list, seeing as we have divided the problem in half.
- Repeat the same process, with the middle of the new half of the list we are using, until finding the target value, or not finding it in the list.

### Benefits

- Reduces time complexity of linear search from
`O(n)`

to`O(log n)`

in sorted list – this is because the search interval decreases by a power of two each time (halving the lists). - Includes cost of
`insert()`

,`delete()`

, and`lookup()`

- Space complexity is O(1), meaning it requires constant time to perform operations, as we just need to store three values (upper, middle, and lower bounds)
- More efficient for searching for specific target from large input
- Simple to program for lists

**Fun fact:** You could search all the names in the world (written in order) and find a specific name in a maximum of 35 iterations

### Disadvantages

- It only works on sorted lists that are kept sorted
- More complicated to implement than a linear search – it requires a three-way update of low and high index, and potentially an additional check if the target wasn’t found
- Overkill for smaller lists
- The recursive implementation requires more stack space
- Loss of efficiency if the list does not support random-access

Additionally, using interpolating search to predict where to start searching when the elements are distributed evenly gets us to our destination more quickly than a binary search tree, i.e. `O(log(log(n)))`

instead of `O(log(n))`

; and there are further improvements over binary search for sorted arrays that are uniformly distributed (sorted but unindexed on-disk datasets). However, when list size increases exponentially, interpolation search time complexity is `O(log(n))`

, the worst case, as it has to go to different locations according to the value of what is being searched.

### Implementation

```
def binary_search(target, input):
# We use variables low and high as bounds around our target
low = -1 # set low as -1 to start bound to the left of 0th index
high = len(input)
# We know the target must not be in the input when there isn't
# at least one element between low and high indices
while low + 1 < high:
# calculate middle index by taking average of bounds
distance = high - low
# integer division used so we don't get a 'half index'
# flors result
middle = distance // 2
# move new guess index by adding middle to low index
guess_index = low + middle
guess_value = input[guess_index]
if guess_value == target:
return True
if guess_value > target:
# target is to the left, so move high index to the left
high = guess_index
else:
# target is to the right, so move low index to the right
low = guess_index
return False
```

### Python-Specific Modules and Features

The above implementation takes a target and a set of elements as an input. What if our target was a string value and we wanted to search by index instead? In that case, you can use an `index`

parameter:

```
def binary_search(target, input, index = identity):
```

By using the `index`

parameter, we lose the ability to search by value. We could assign index a default value of `None`

and then check whether it was supplied, but a simpler solution uses an inline function:

```
def binary_search(target, input, index = lambda x: x)
```

Here we use `lambda`

to create an anonymous function (one not bound to an identifier). Instead of including a “return” statement like a regular function, it returns an expression.

**lambda**: shortcut for declaring small anonymous functions; can take many arguments but return only one expression

In addition, there are useful functions and modules we can use to implement binary search in Python such as:

**recursive()**: an inner function that can access both elements and value parameters even though they’re defined in the enclosing scope

This is potentially useful when manually implementing a recursive binary search algorithm.

**bisect()**: a build-in module that helps you find an index of an element or add a new element in an already sorted list, via`insort()`

,`insort_left()`

, or`insort_right()`

This helps avoid having to sort the list after each insertion.

### Example using bisect():

```
>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
i = bisect(breakpoints, score)
return grades[i]
>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
['F', 'A', 'C', 'C', 'B', 'A', 'A']
```

#### About the author

Sophia Raji

Sophia R. is a junior in computer science at Columbia University. She takes particular interest in full-stack web development and Bitcoin programming. When she is not working on side projects, she teaches coding to middle school and high school students and writes a satire website.