Outcome (2): Sorting algorithms
Lower bound on Sorting (by comparisons) & Sorting in linear time
• Bubble sort
• Selection sort
• Insertion sort
• Merge sort
• Quick sort
• Heap sort
• Counting sort (not discussed yet)
• Radix sort (not discussed yet)
• Bucket sort (not discussed yet)
Have you noticed that the sorting algorithms we have learned so far have the following characteristics?

1. They make no assumptions on the values of the numbers, and the sorting is based on comparisons between elements.
2. The best algorithm we have learned runs in O(n log n) time in the worst case, and inputs exist that require Ω(n log n) time.
Have you ever wondered if we can get a better algorithm that runs in o(n log n) time based on comparisons between elements?

In fact, it can be shown that this is NOT possible if the algorithm is based on element comparisons.

Q: Does it mean that we cannot have a sorting algorithm which runs in o(n log n) time in the worst case?

No. If the algorithm is not based on comparisons between numbers, it may be possible to have faster algorithms.
Counting sort, Radix sort, and Bucket sort are not comparison-based algorithms.
Counting sort
Assumption: the given numbers are integers in the range 0 to k.

Idea: if I know that m numbers are less than x, then I know where I should place x in the sorted sequence (position m+1).

Q) How do we count the number of elements less than a particular element x?

Q) How do we handle duplicate numbers? For example, if you have two numbers of value x, one should be placed at position m+1 and the other at m+2!
Use two arrays: C[0..k] (working storage) and B[1..n] (output).

Step 1 (O(n) time): Scan A and store the number of occurrences of i in C[i].

for (i = 1 to n) {
    C[A[i]] = C[A[i]] + 1   // assume C has been initialized to 0
}

e.g. k = 6, A = [5, 4, 3, 5, 6, 1, 3, 2, 5, 6, 4, 2]
     C = [0, 1, 2, 2, 2, 3, 2]

Step 2 (O(k) time): Determine the number of elements ≤ i, for all i ≥ 1, based on C.

for (i = 1 to k) {
    C[i] = C[i] + C[i-1]   // note: C[i] now also counts the elements equal to i
}

C = [0, 1, 3, 5, 7, 10, 12]

Step 3 (O(n) time): Place each A[i] in the appropriate position using the information in C.

for (i = n downto 1) {
    B[C[A[i]]] = A[i]
    C[A[i]] = C[A[i]] - 1
}

B = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6]

Total time complexity = O(n+k); if k = O(n),
the overall time complexity = O(n)
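
The same three steps as a minimal runnable sketch in Python (0-indexed, unlike the 1-indexed pseudocode above; the function name is ours):

def counting_sort(A, k):
    # Sort a list of integers in the range 0..k in O(n + k) time.
    n = len(A)
    C = [0] * (k + 1)              # working storage, initialized to 0
    B = [0] * n                    # output array
    for x in A:                    # Step 1: count occurrences
        C[x] += 1
    for i in range(1, k + 1):      # Step 2: prefix sums, C[i] = # of elements <= i
        C[i] += C[i - 1]
    for x in reversed(A):          # Step 3: place elements, scanning right to left
        C[x] -= 1                  # 0-indexed, so decrement before placing
        B[C[x]] = x
    return B

print(counting_sort([5, 4, 3, 5, 6, 1, 3, 2, 5, 6, 4, 2], 6))
# prints [1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6]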
Radix sort
Idea: Sort a set of integers by one digit in each pass. If the
integers have d digits, d passes are sufficient to sort them.
Example

input    pass 1    pass 2    pass 3
329      720       720       329
457      455       329       436
657      436       436       455
839      457       839       457
436      657       455       657
720      329       457       720
455      839       657       839

Note: it is essential that the relative order of two elements with the same value in the digit being considered is not changed.
At each pass, we only look at one digit, starting from the least significant digit (the rightmost digit).

For each pass, any sorting algorithm can be used, as long as it maintains the relative order of two elements that agree on the digit being considered (we call such algorithms “stable”).
A sorting algorithm is stable if numbers with the same value appear in
the output array in the same order as they do in the input array.
Q: Is Counting sort stable?

Recall the three loops of Counting sort:

for (i = 1 to n) {
    C[A[i]] = C[A[i]] + 1
}
for (i = 1 to k) {
    C[i] = C[i] + C[i-1]
}
for (i = n downto 1) {
    B[C[A[i]]] = A[i]
    C[A[i]] = C[A[i]] - 1
}

e.g. k = 6, A = [5, 4, 3, 5, 6, 1, 3, 2, 5, 6, 4, 2]
     C = [0, 1, 2, 2, 2, 3, 2]
     C = [0, 1, 3, 5, 7, 10, 12]
     B = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6]

Yes, Counting sort is a stable sort.

Q: If we change the last loop in Counting sort to

for (i = 1 to n) {
    B[C[A[i]]] = A[i]
    C[A[i]] = C[A[i]] - 1
}

is it still correct? If yes, is it still stable?
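
One quick way to answer empirically (our own check, not from the slides) is to tag equal keys and see whether their original order survives; the (key, tag) pair format and helper name below are ours:

def counting_sort_pairs(pairs, k, backward=True):
    # Counting sort of (key, tag) pairs on key, 0 <= key <= k.
    # backward=True is the textbook final loop (i = n downto 1).
    n = len(pairs)
    C = [0] * (k + 1)
    B = [None] * n
    for key, _ in pairs:
        C[key] += 1
    for i in range(1, k + 1):
        C[i] += C[i - 1]
    for key, tag in (reversed(pairs) if backward else pairs):
        C[key] -= 1
        B[C[key]] = (key, tag)
    return B

pairs = [(5, 'a'), (3, 'b'), (5, 'c'), (3, 'd')]
print(counting_sort_pairs(pairs, 5))                  # [(3,'b'), (3,'d'), (5,'a'), (5,'c')]
print(counting_sort_pairs(pairs, 5, backward=False))  # [(3,'d'), (3,'b'), (5,'c'), (5,'a')]

The backward scan keeps ties in their input order (stable); the forward variant still outputs a correctly sorted array, but the relative order of equal keys is reversed.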
RadixSort(A, d) {   // each integer has d digits
    for i = 1 to d {   // let digit 1 be the least significant (rightmost) digit
        use Counting sort to sort A on digit i   // can be substituted by any stable sort
    }
}
Time complexity:
Assume that each digit can have k different values.
Counting sort takes O(n+k) time
So, the overall time complexity is O(d(n+k))
If d is a constant and k = O(n), radix sort runs in linear time
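
A runnable sketch of the whole scheme, assuming base-10 digits extracted with integer division (the helper names are ours):

def radix_sort(A, d):
    # Sort non-negative integers of at most d decimal digits in O(d(n + k)) time.
    for i in range(d):                 # digit 0 = least significant
        A = counting_sort_on_digit(A, i)
    return A

def counting_sort_on_digit(A, i):
    # Stable counting sort of A keyed on decimal digit i (so k = 10).
    n, k = len(A), 10
    C = [0] * k
    B = [0] * n
    digit = lambda x: (x // 10**i) % 10
    for x in A:
        C[digit(x)] += 1
    for v in range(1, k):
        C[v] += C[v - 1]
    for x in reversed(A):              # right-to-left scan keeps the pass stable
        C[digit(x)] -= 1
        B[C[digit(x)]] = x
    return B

print(radix_sort([329, 457, 657, 839, 436, 720, 455], 3))
# prints [329, 436, 455, 457, 657, 720, 839]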
Bucket sort

Assumption: the numbers are uniformly distributed over the interval [0, 1), so no single bucket is likely to receive many numbers.

Idea:
1) Divide [0, 1) into m equal-sized subintervals (called buckets).
2) Distribute the n numbers into these buckets according to their values.
3) Sort the numbers in each bucket separately, say by insertion sort, then output the numbers in each bucket one by one.
Example (input: 0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68; bucket B[0] is [0.0 – 0.1), etc.)

After distributing:                   After sorting each bucket individually:
[0.0 – 0.1):                          [0.0 – 0.1):
[0.1 – 0.2): 0.17 => 0.12             [0.1 – 0.2): 0.12 => 0.17
[0.2 – 0.3): 0.26 => 0.21 => 0.23     [0.2 – 0.3): 0.21 => 0.23 => 0.26
[0.3 – 0.4): 0.39                     [0.3 – 0.4): 0.39
[0.4 – 0.5):                          [0.4 – 0.5):
[0.5 – 0.6):                          [0.5 – 0.6):
[0.6 – 0.7): 0.68                     [0.6 – 0.7): 0.68
[0.7 – 0.8): 0.78 => 0.72             [0.7 – 0.8): 0.72 => 0.78
[0.8 – 0.9):                          [0.8 – 0.9):
[0.9 – 1.0): 0.94                     [0.9 – 1.0): 0.94
BucketSort(A) {
    for i = 1 to n {
        insert A[i] into bucket B[⌊m·A[i]⌋]
    }
    for i = 0 to m-1 {   // for each bucket
        sort the elements in bucket B[i] using insertion sort
    }
    concatenate B[0], B[1], ..., B[m-1] in order
}
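
A minimal runnable sketch of the same procedure (our own translation, assuming inputs in [0, 1) and m = n buckets):

def bucket_sort(A):
    # Expected O(n) time if the inputs are uniformly distributed over [0, 1).
    n = len(A)
    m = n                              # one bucket per element, so m = O(n)
    B = [[] for _ in range(m)]
    for x in A:
        B[int(m * x)].append(x)        # bucket index = floor(m * x)
    out = []
    for bucket in B:
        insertion_sort(bucket)         # each bucket is expected to be tiny
        out.extend(bucket)             # concatenate the buckets in order
    return out

def insertion_sort(A):                 # in-place insertion sort of one bucket
    for i in range(1, len(A)):
        j = i
        while j >= 1 and A[j - 1] > A[j]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
# prints [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]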
Time Complexity (only a rough idea):

T(n) = O(n) + ∑_{j=0}^{m-1} O(n_j^2),   where n_j is the number of elements in B[j]

If m = O(n), then each n_j is probably a constant
=> T(n) = O(n)
For a formal proof, please take a look at the MIT book
Any comparison sort algorithm requires Ω(n log n) comparisons in the worst case.

We call this a lower bound result, and we will prove it in this lecture.
One technique for proving a lower bound result is to use a decision tree.

We can use a decision tree, which is a full binary tree, to represent the comparisons between elements performed by a particular sorting algorithm on an input of a given size. How?
Internal node (labelled A[i]:A[j]): compare elements A[i] and A[j] (referring to their original positions). The left branch (≤) represents the case A[i] ≤ A[j]; the right branch (>) represents the case A[i] > A[j]. The root represents the first comparison to be done by the algorithm.

Leaf node (labelled <A[π(1)], A[π(2)], ..., A[π(n)]>): represents a permutation π such that, in this order, the elements are in increasing order, e.g. <A[2], A[1], A[3]>.

An execution of the algorithm corresponds to a path from the root to a leaf.
Example

InsertionSort(A) {
    for (i = 2 to n) {   // n-1 passes
        j = i-1          // insert A[i] into A[1], A[2], ..., A[i-1]
        while ((j >= 1) and (A[j] > A[j+1])) {
            swap(A[j], A[j+1])
            j--
        }
    }
}

The decision tree for n = 3 (left branch: ≤, right branch: >):

A[1]:A[2]
├─ ≤  A[2]:A[3]
│     ├─ ≤  <A[1], A[2], A[3]>
│     └─ >  A[1]:A[3]
│           ├─ ≤  <A[1], A[3], A[2]>
│           └─ >  <A[3], A[1], A[2]>
└─ >  A[1]:A[3]
      ├─ ≤  <A[2], A[1], A[3]>
      └─ >  A[2]:A[3]
            ├─ ≤  <A[2], A[3], A[1]>
            └─ >  <A[3], A[2], A[1]>

If A = [6, 8, 5], the execution path (shown in red on the original slide) is
A[1]:A[2] (6 ≤ 8), then A[2]:A[3] (8 > 5), then A[1]:A[3] (6 > 5),
ending at the leaf <A[3], A[1], A[2]> = <5, 6, 8>.
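
To make the execution path concrete, here is a small instrumented version (our own addition) that logs every comparison insertion sort performs on A = [6, 8, 5]:

def insertion_sort_traced(A):
    # Insertion sort that logs each element comparison it performs.
    A = list(A)
    for i in range(1, len(A)):
        j = i
        while j >= 1:
            print(f"compare {A[j-1]} : {A[j]}")
            if A[j-1] > A[j]:
                A[j-1], A[j] = A[j], A[j-1]
                j -= 1
            else:
                break
    return A

print(insertion_sort_traced([6, 8, 5]))
# compare 6 : 8
# compare 8 : 5
# compare 6 : 5
# [5, 6, 8]

The three logged comparisons are exactly the root-to-leaf path traced above.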
All comparison sort algorithms can be represented using this kind of decision tree.

What's the implication?

The longest path represents the worst case for a particular algorithm, and the length of this path (i.e., the height of the tree) is the worst-case number of comparisons needed.

So, we want to give a lower bound on the height of the tree for any possible algorithm.
Any algorithm must be able to handle every possible input sequence, so the number of leaves in the corresponding decision tree must be ≥ the number of possible permutations of n numbers, i.e., n!.

Let h be the height of the tree. Then 2^h ≥ n!
=> h ≥ log(n!)
=> h = Ω(n log n)   (e.g. since n! ≥ (n/2)^(n/2), we get log(n!) ≥ (n/2) log(n/2))

In other words, any sorting algorithm based only on comparisons between elements takes Ω(n log n) time in the worst case.
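
As a quick numerical sanity check (our own illustration, not from the slides), the minimum possible tree height log2(n!) can be compared with n log2 n:

import math

for n in (10, 100, 1000):
    height = math.log2(math.factorial(n))       # minimum height: log2(n!)
    print(n, round(height, 1), round(n * math.log2(n), 1))
# 10   21.8    33.2
# 100  524.8   664.4
# 1000 8529.4  9965.8

The two columns grow at the same rate, consistent with h = Ω(n log n).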
Exercise

Suppose we have 9 coins, one of which is counterfeit and lighter than the others. You are given a balance which can only tell which side is heavier, or that both sides are of equal weight.

Q1) In how many weighings can one identify the counterfeit coin?
Q2) Can you prove that your answer to Q1 is the best possible?