Sorting is a common operation that needs to be done, but there are many ways to do it. We want to use whichever algorithm can sort a list as quickly as possible. By the end of this article, you should be able to choose the best algorithm for your project.
About Time Complexity
Time complexity measures how an algorithm performs under large inputs.
O(1) is the best-case scenario. It means that no matter how big the list is, the operation takes the same amount of time.
O(log n) means that the extra time for each new element shrinks as the list grows: doubling the size of the list only adds a constant amount of work.
O(n) means that a constant amount of time is added for each element.
The latter two are both pretty good time complexities to have. Unfortunately, none of our sorting algorithms are this quick. It's impossible for a comparison-based sorting algorithm to work faster than O(n log n), and most of the algorithms we will talk about today are either O(n log n) or O(n^2). We'll be using time complexity to compare our algorithms.
Insertion Sort
Insertion sort works by splitting the list into a "sorted part" and an "unsorted part". Initially, every element in the list is in the "unsorted part". The algorithm needs to move all the elements into the sorted part. To do this, it picks an element and shifts it until the element is in its proper place.
Moving a single element is, on average, an O(n) operation: the sorted part averages half the list, and the element shifts through about half of that, for roughly n/4 steps. Big-O notation ignores the constant factor of 1/4. There are n elements to insert, which means the total complexity of insertion sort is O(n^2). That's pretty bad.
There is one reason to use insertion sort: when there's a high chance that the list is already sorted. If it's already sorted, then none of the elements have to move, and the list is sorted in O(n) time. If an already-sorted list is likely (as with some user input), this is a good approach.
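The shifting described above can be sketched in a few lines of Python (this is my own minimal version, not the code from the article's repository):

```python
def insertion_sort(items):
    """Sort a list in place by growing a sorted prefix one element at a time."""
    for i in range(1, len(items)):
        current = items[i]      # next element from the unsorted part
        j = i - 1
        # Shift larger elements of the sorted part one slot to the right
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current  # drop the element into its proper place
    return items
```

Note that on an already-sorted input the inner `while` loop never runs, which is exactly why this version finishes in O(n) time in that case.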
Merge Sort
Merge sort splits the list in half, repeatedly, until every part of the list is just one element. Then, we merge halves together until we eventually get the sorted list back.
Merging two halves is an O(n) operation, and it only has to be done O(log n) times (once per level of splitting), so the complexity of merge sort is always O(n log n). Unlike insertion sort, the complexity does not change depending on the list. It always takes the same amount of time.
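The split-then-merge process can be sketched like this (again my own illustration, not the repository's code):

```python
def merge_sort(items):
    """Recursively split the list in half, then merge the sorted halves."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge: repeatedly take the smaller front element of the two halves
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one of these is empty;
    merged.extend(right[j:])  # the other holds the leftovers
    return merged
```

The merge loop touches each element once, which is the O(n) cost per level mentioned above.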
Quick Sort
Quick sort is very interesting. The first thing you do is select an element of the list to be a "pivot". Then, you split the list into three lists: one containing elements less than the pivot, one containing elements equal to the pivot, and one containing elements greater than the pivot. Then you run quick sort on the "less" and "greater" lists, until everything has been sorted.
The complexity of quick sort is complicated. Usually it's O(n log n), but it can take longer if the pivot chosen is close to one extreme of the list. The worst-case scenario is O(n^2).
There are many ways to choose a pivot (some of which may or may not take longer than the sort itself). If you always use the first element as the pivot, an already-sorted list triggers the worst case. Quick sort is still very fast in practice, but you might not want to use that particular pivot choice for user input.
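The three-way split described above can be written very directly in Python. This sketch picks the middle element as the pivot; that is just one of the many pivot choices mentioned, not the article's prescribed one:

```python
def quick_sort(items):
    """Three-way quick sort: partition around a pivot, recurse on both sides."""
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]              # middle element as pivot
    less    = [x for x in items if x < pivot]   # everything below the pivot
    equal   = [x for x in items if x == pivot]  # the pivot (and duplicates)
    greater = [x for x in items if x > pivot]   # everything above the pivot
    return quick_sort(less) + equal + quick_sort(greater)
```

Picking the middle element avoids the worst case on already-sorted input, since the middle of a sorted list is close to its median.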
Counting Sort
Counting sort is amazing. You can use either a HashMap or a List. First, count how many times each number appears in the list. Then, go through the numbers in order and write each one out that many times.
Note: counting sort is typically implemented with a slightly different algorithm, which builds prefix sums of the counts. That variant works better if you're sorting things that aren't numbers, and it's also good when you have a lot of unused values in your range.
Counting sort is the closest thing to an O(n) sorting algorithm we'll see. The complexity is O(n + r), where n is the number of elements in the list and r is the difference between the largest and smallest elements. If r is very big, then it can take a while to sort. Counting sort works very well when you have a lot of repeated values and the range isn't too large.
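Here is a sketch of the count-then-write-back version described above, using a plain dict for the counts (my own illustration; integer keys assumed):

```python
def counting_sort(items):
    """Count occurrences of each value, then emit values in ascending order."""
    if not items:
        return []
    counts = {}  # value -> how many times it appears
    for value in items:
        counts[value] = counts.get(value, 0) + 1
    result = []
    # Walk the whole range from smallest to largest value: this loop is
    # the O(r) part, even for values that never appear in the input.
    for value in range(min(items), max(items) + 1):
        result.extend([value] * counts.get(value, 0))
    return result
```

The loop over `range(min, max + 1)` makes the O(n + r) cost visible: a list like `[1, 1000000]` forces a million iterations even though it only has two elements.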
These are some of the most popular sorting methods. Of course, there are many more. Hopefully, one of these is fast enough for your application. There's a GitHub repository that benchmarks all of the algorithms in this article, with implementations in Python for your reference.
About the author
Mike White
Mike White is a second-year Computer Science student at the Rochester Institute of Technology. His interests are technology, philosophy, culture, music, and effective altruism. Mike has a blog about technology and philosophy. If he isn’t doing any of that, then he’s probably either playing a Sherlock Holmes video game or watching YouTube.