CH7 Sorting
Data Structures
Prof. Tai-Lang Jong
Office: Delta 928
Tel: 42577
email: tljong@mx.nthu.edu.tw
Spring 2022
Outline
• 7.1 Introduction
• 7.2 Insertion Sort
• 7.3 Quick Sort
• 7.4 How fast can we sort?
• 7.5 Merge sort
• 7.6 Heap sort
• 7.7 Radix sort
• 7.8 (List and table sorts)
• 7.9 Summary of internal sorting
2
Important Uses of Sorting
• Searching a List of records – a common operation
• Each record has 1 or more fields
• A key field is used to distinguish among records
• Search for a record with the specified key
• With sequential search, if all keys are distinct and the key of a[i] is being searched for, then i key comparisons are made
• The average number of comparisons for a successful search is (∑_{i=1}^{n} i)/n = (n + 1)/2
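The sequential-search helper assumed by the comparison counts above (and called later by Verify1) is not reproduced in these slides; a minimal sketch, assuming the 1-based array convention used throughout this chapter and a return value of 0 when the key is absent:

template <class E, class K>
int SeqSearch(E *a, const int n, const K& k)
{ // search the unordered list a[1:n] for a record with key k
  // return the position of the record if found; otherwise return 0
  int i;
  for (i = 1; i <= n && a[i] != k; i++);
  if (i > n) return 0;
  return i;
}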
Binary Search
template <class E, class K>
int BinarySearch(E *a, const int n, const K& k)
{ // search sorted a[1:n] for k
int left = 1, right = n;
while(left <= right){
int middle = (left + right)/2;
if(k < a[middle]) right = middle - 1;
else if(k > a[middle]) left = middle + 1;
else return middle; // found
}
return -1; // not found
}
7
Verifying 2 Lists w Sequential Search
void Verify1 (Element *l1, Element *l2, const int n, const int m)
{ // Compare two unordered lists l1 and l2 of size n and m, respectively.
  bool *marked = new bool[m + 1];
  fill(marked + 1, marked + m + 1, false);
  for (int i = 1; i <= n; i++)
  {
    int j = SeqSearch(l2, m, l1[i]);
    if (j == 0) cout << l1[i] << " not in l2" << endl; // satisfies (1)
    else
    {
      if (!Compare(l1[i], l2[j])) // satisfies (3)
        cout << "Discrepancy in " << l1[i] << endl;
      marked[j] = true; // mark l2[j] as checked
    }
  }
  for (int i = 1; i <= m; i++)
    if (!marked[i]) cout << l2[i] << " not in l1." << endl; // satisfies (2)
  delete [] marked;
}
8
Fast Verifying 2 Lists
void Verify2 (Element *l1, Element *l2, const int n, const int m)
{ // Same task as Verify1, but this time l1 and l2 are sorted first.
  Sort(l1, n); // sort into nondecreasing order of key
  Sort(l2, m);
  int i = 1, j = 1;
  while ((i <= n) && (j <= m)) {
    if (l1[i] < l2[j]) { cout << l1[i] << " not in l2" << endl; i++; }
    else if (l1[i] > l2[j]) { cout << l2[j] << " not in l1" << endl; j++; }
    else // equal keys
    {
      if (!Compare(l1[i], l2[j])) cout << "Discrepancy in " << l1[i] << endl;
      i++; j++;
    }
  }
  if (i <= n) OutputRest(l1, i, n, 1); // output records i through n of l1
  else if (j <= m) OutputRest(l2, j, m, 2); // output records j through m of l2
}
9
Classification of Sorting
• Complexity
  • Time complexity
  • Space complexity
• Stability
  • A sort is called stable iff it maintains the relative order of records with equal keys
  • Example of a non-stable sort on the key Score (Bob and David exchange their relative order):
      before: (100, Alice) (90, Bob) (100, David) (90, Emily)
      after:  (100, Alice) (100, David) (90, Emily) (90, Bob)
• Internal vs. external
  • An internal sort requires its inputs to be small enough so that the entire sort can be carried out in main memory
    • Examples: Selection Sort, Insertion Sort, Quick Sort, Heap Sort
  • An external sort has no such requirement
10
Internal Sort
• Bubble sort – simple but slow
• Priority queue sort
• Sequence
• Insertion sort
• Selection sort
• Heap
• Heap sort
• Quick sort – divide-and-conquer
• Merge sort –divide-and-conquer, set ADT
• Bucket sort
• Radix sort
11
Insertion Sort
• The basic step in insertion sort is to insert a new
record into a sorted sequence of i records in such a
way that the resulting sequence of size (i + 1) is also
ordered (sorted)
[Figure: the sorted sublist a[1:i] grows to a sorted sublist a[1:i+1] after the new record is inserted]
13
Insertion Sort Concept
• array[0] is used as temporary space
• array[1] is the initial sorted sublist
• Insertion pass
  • Place the element next to the sublist into the temporary space
  • Insert the element into the sublist
• Insertion passes are continued until the sublist contains all records
• Insertion Sort is a stable sort
• Example (positions 1 to 5):
    5 3 4 1 2   initial; a[1] is the sorted sublist
    3 5 4 1 2   after inserting a[2] = 3
    3 4 5 1 2   after inserting a[3] = 4
    1 3 4 5 2   after inserting a[4] = 1
    1 2 3 4 5   after inserting a[5] = 2
14
Insertion Sort Algorithm
template <class T>
void InsertionSort(T *a, const int n)
{ // sort a[1:n] into nondecreasing order
  for (int j = 2; j <= n; j++){
    a[0] = a[j];   // copy a[j] into the temporary space a[0]
    int i;
    for (i = j - 1; a[i] > a[0]; i--)  // comparison
    {
      a[i + 1] = a[i];  // data move
    }
    a[i + 1] = a[0];    // data move
  }
}
15
How Fast is Insertion Sort?
• Worst-case time complexity
  • When the input is in reversed order
  • Each insertion pass involves i comparisons, i = 1..n
  • 1 + 2 + … + n = O(n^2)
• Average time complexity
  • It has been shown that Insertion Sort is O(n^2) on average
16
Insertion into a Sorted List
template <class T>
void Insert(const T& e, T* a, int i)
{ // insert e into the ordered list a[1:i] so that the resulting list a[1:i+1] is also ordered
  a[0] = e;  // a[0] acts as a sentinel that terminates the while loop
  while (e < a[i])
  {
    a[i + 1] = a[i];
    i--;
  }
  a[i + 1] = e;
}
17
Insertion Sort Algorithm
template <class T>
void InsertionSort(T* a, const int n)
{ // Sort a[1:n] into nondecreasing order
  for (int j = 2; j <= n; j++)
  {
    T temp = a[j];
    Insert(temp, a, j - 1);
  }
}

Total time: O(∑_{i=1}^{n-1} (i + 1)) = O(n^2)
18
Relative Disorder
• Left out of order (LOO): record R_i is LOO iff R_i < max_{1≤j<i} {R_j}
• The insertion step has to be carried out only for those records that are LOO
• If k is the number of LOO records, the computing time is O((k+1)n) = O(kn)
• The O(kn) computing time makes InsertionSort very desirable for sorting sequences in which only a very few records are LOO (i.e., k << n)
• The simplicity of this scheme makes it about the
fastest sorting method for small n (say n ≤ 30)
19
Variations of Insertion Sort
• Binary Insertion Sort
• Use binary search rather than sequential search for
insertion passes to reduce the number of comparisons
• The number of record moves remains unchanged
23
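A minimal sketch of this variation, assuming the same 1-based array convention as the programs above (std::upper_bound performs the binary search; std::copy_backward performs the record moves, whose count is unchanged):

#include <algorithm> // std::upper_bound, std::copy_backward

template <class T>
void BinaryInsertionSort(T *a, const int n)
{ // sort a[1:n] into nondecreasing order
  for (int j = 2; j <= n; j++) {
    T temp = a[j];
    // binary search locates the insertion point in O(log j) comparisons
    T *pos = std::upper_bound(a + 1, a + j, temp);
    // shift a[pos..j-1] right by one position; the number of moves is unchanged
    std::copy_backward(pos, a + j, a + j + 1);
    *pos = temp;
  }
}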
Quick Sort
• Developed by C. A. R. Hoare
• Very good average behavior, actually the best
average behavior among the sorting methods:
insertion sort, selection sort, heap sort, merge sort
• Is a randomized sorting algorithm based on the
divide-and-conquer paradigm
• Divide: pick a random element x (called the pivot) and partition S into
  • L: elements less than x
  • E: elements equal to x
  • G: elements greater than x
• Recur: sort L and G
• Conquer: join L, E and G
• Example: 7 4 9 6 2 → 2 4 6 7 9 (with pivot 6: L = {4, 2}, E = {6}, G = {7, 9})
24
Quick Sort Concept
• Divide-and-conquer
• Division pass
  • Pick the first element as the pivot
  • Make elements whose key ≤ pivot the left sublist
  • Make elements whose key > pivot the right sublist
• Example: 6 3 7 1 8 4 5 2 → after one division pass → 4 3 2 1 5 | 6 | 8 7 (the pivot 6 is in its final position)
  • Recursing on the two sublists gives 1 3 2 | 4 | 5 and 7 | 8, and finally 1 2 3 4 5 6 7 8
• Division-pass trace for a = 6 2 3 5 11 10 4 1 9 7 8 (pivot = 6):
    6 2 3 5 11 10 4 1 9 7 8
    6 2 3 5 1 10 4 11 9 7 8   (swap 11 and 1)
    6 2 3 5 1 4 10 11 9 7 8   (swap 10 and 4)
    4 2 3 5 1 6 10 11 9 7 8   (swap the pivot with 4; 6 is now in its final position)
29
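These slides describe the division pass but do not list the QuickSort function itself; a minimal sketch in the style of the other programs in this chapter (1-based array, first element as the pivot; the exact code is an assumption, not a reproduction of the textbook program):

#include <utility> // std::swap

template <class T>
void QuickSort(T *a, const int left, const int right)
{ // sort a[left:right] into nondecreasing order; a[left] is the pivot
  if (left < right) {
    T pivot = a[left];
    int i = left, j = right + 1;
    do { // division pass: scan from both ends and exchange out-of-place records
      do i++; while (i <= right && a[i] < pivot); // find a key >= pivot from the left
      do j--; while (a[j] > pivot);               // find a key <= pivot from the right
      if (i < j) std::swap(a[i], a[j]);
    } while (i < j);
    std::swap(a[left], a[j]);   // the pivot lands in its final position j
    QuickSort(a, left, j - 1);  // sort the left sublist
    QuickSort(a, j + 1, right); // sort the right sublist
  }
}

Applied to the list above (6 3 7 1 8 4 5 2), the first call partitions it into 4 3 2 1 5 | 6 | 8 7, matching the trace on the slide.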
Space Needed in Recursive Quick Sort
• Unlike Insertion Sort, where the only additional
space needed was for one record, Quick Sort needs
stack space to implement the recursion.
• If the lists split evenly each time, the maximum
recursion depth would be logn, requiring a stack
space of O(logn).
• The worst case occurs when the list is split into a
left sublist of size n-1 and a right sublist of size 0 at
each level of recursion. In this case the depth of
recursion becomes n, requiring stack space of O(n).
31
How Fast is Quick Sort ?
• Worst case time complexity
• Each division pass involves n comparisons and ends up with sublists of sizes 0 and n-1 (input already sorted)
• T(n) = O(n) + T(n-1) = O(n^2)
• Average case time complexity
• It has been shown that Quick Sort is O(n log n) on average:
    T(n) ≤ cn + 2T(n/2), for some constant c
         ≤ cn + 2(cn/2 + 2T(n/4))
         = 2cn + 4T(n/4)
         ⋮
         ≤ cn log2(n) + nT(1) = O(n log n)
32
Variations of Quick Sort
• Median-of-three strategy
• Ordered lists are worst-case inputs for Quick Sort
  • The pivot is then always the smallest or largest key within a sublist
  • Ordered lists are not rare in real life
• Choosing the pivot as the median of the first, middle, and last keys addresses this issue:
    pivot = median{ K_l, K_(l+r)/2, K_r }
• Random pivot strategy
33
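A minimal sketch of the median-of-three strategy (a hypothetical helper that moves the median of the first, middle, and last keys into the pivot position so that the usual division pass can be reused):

#include <utility> // std::swap

// Hypothetical helper: place the median of a[left], a[mid], a[right] at a[left].
template <class T>
void MedianOfThreePivot(T *a, const int left, const int right)
{
  int mid = (left + right) / 2;
  // order the three keys so that a[mid] holds the median of the three
  if (a[mid]   < a[left]) std::swap(a[left], a[mid]);
  if (a[right] < a[left]) std::swap(a[left], a[right]);
  if (a[right] < a[mid])  std::swap(a[mid],  a[right]);
  std::swap(a[left], a[mid]); // move the median into the pivot position
}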
Non-In-Place Partition
• We partition (split) an input sequence S as follows:
• We remove, in turn, each element y from S and
• We insert y into L, E or G, depending on the result of the
comparison with the pivot x
• Each insertion and removal is at the beginning or at
the end of a sequence, and hence takes O(1) time
• Thus, the partition step of quick-sort takes O(n) time
[Figure: the sequence S is partitioned around the pivot x into the three sequences L, E, and G]
34
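A minimal sketch of this partition step (using std::list so that removal from S and insertion into L, E, or G are O(1) operations at the ends; the names are illustrative):

#include <list>

// Partition S into L (< pivot), E (== pivot), G (> pivot).
// Each element is removed from S and appended to one of the three lists in O(1).
template <class T>
void Partition(std::list<T>& S, const T& pivot,
               std::list<T>& L, std::list<T>& E, std::list<T>& G)
{
  while (!S.empty()) {
    T y = S.front();
    S.pop_front();                  // remove from the front of S: O(1)
    if (y < pivot)      L.push_back(y);
    else if (pivot < y) G.push_back(y);
    else                E.push_back(y);
  }
}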
7.4 How Fast Can We Sort?
• A sorting algorithm can be represented as a binary
decision tree
• Non-leaf node represents a comparison between two keys
• Leaf nodes are the sorting results
[Figure: decision tree for sorting three keys a, b, c — each internal node compares two keys (first a : b, then b : c or a : c), and the six leaves give the possible sorted orders [a, b, c], [a, c, b], [c, a, b], [b, a, c], [b, c, a], [c, b, a]]
36
How Fast Can We Sort?
• Sorting n records:
• Number of leaf nodes is at least n!
• Tree height is at least log2(n!) + 1 = Ω(n log n)
  → sorting with worst-case time complexity below n log n is impossible
• Average root-to-leaf path length is Ω(n log n)
  → sorting with average time complexity below n log n is impossible
[Figure: the same decision tree for three keys a, b, c as on the previous slide]
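The Ω(n log n) bound on log2(n!) follows from keeping only the larger half of the factors of n!; a short derivation (in LaTeX notation):

\log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i \;\ge\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n \log n)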
C++ STL sort Function
• Quick sort.
Switch to heap sort when number of subdivisions
exceeds some constant times log2n.
Switch to insertion sort when segment size becomes
small.
38
7.5 Merge Sort
• Interpret the unsorted list as n sorted sublists, each of size 1
• Example: 5 2 6 1 4 3 is interpreted as the n = 6 sublists (5) (2) (6) (1) (4) (3)
40
Merge Sort Algorithm (Nonrecursive)
template <class T>
void MergeSort(T *a, const int n)
{ // sort a[1:n] into non-decreasing order
  T *tempList = new T[n+1];
  // s is the size of the sublists currently being merged
  for (int s = 1; s < n; s *= 2)
  {
    MergePass(a, tempList, n, s); // merge from a into tempList
    s *= 2;
    MergePass(tempList, a, n, s); // interchange the roles of a and tempList
  }
  delete [] tempList;
}
(to be continued) 41
Merge Sort Algorithm (Nonrecursive)
template <class T>
void MergePass(T *x, T *y, const int n, const int s)
{ // adjacent pairs of sublists of size s are merged from x to y; n = # records in x
  int i; // i is the first position in the first of the sublists being merged
  for (i = 1; i <= n - 2*s + 1; i += 2*s) Merge(x, y, i, i + s - 1, i + 2*s - 1);
  if (i + s - 1 < n) Merge(x, y, i, i + s - 1, n); // merge the final two shorter sublists
  else copy(x + i, x + n + 1, y + i);              // or copy the single leftover sublist
}

Merge-pass trace for the input 26 5 77 1 61 11 59 15 48 19:
  after the pass with s = 1: (5 26) (1 77) (11 61) (15 59) (19 48)
  after the pass with s = 2: (1 5 26 77) (11 15 59 61) (19 48)
  after the pass with s = 4: (1 5 11 15 26 59 61 77) (19 48)
  after the pass with s = 8: (1 5 11 15 19 26 48 59 61 77)
Nonrecursive Merge Sort
Input list has 13 keys.
[8] [3] [13] [6] [2] [14] [5] [9] [10] [1] [7] [12] [4]
[3, 8] [6, 13] [2, 14] [5, 9] [1, 10] [7, 12] [4]
47
Merge Two Sorted Lists
• A = (5, 6)
B = (3, 8, 9, 10)
C = (1, 2)
• A = (5, 6)
B = (8, 9, 10)
C = (1, 2, 3)
• A = (6)
B = (8, 9, 10)
C = (1, 2, 3, 5)
48
Merge Two Sorted Lists
• A = ()
B = (8, 9, 10)
C = (1, 2, 3, 5, 6)
• When one of A and B becomes empty, append the
other list to C.
• A = ()
B = ()
C = (1, 2, 3, 5, 6, 8, 9, 10)
• O(1) time needed to move an element into C.
• Total time is O(n + m), where n and m are, respectively,
the number of elements initially in A and B.
49
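The Merge function called by MergePass (and traced above) is not listed in these slides; a minimal sketch with textbook-style parameters l, m, n, merging the sorted ranges initList[l:m] and initList[m+1:n] into mergedList[l:n]:

#include <algorithm> // std::copy

template <class T>
void Merge(T *initList, T *mergedList, const int l, const int m, const int n)
{ // merge the sorted ranges initList[l:m] and initList[m+1:n] into mergedList[l:n]
  int i1 = l, i2 = m + 1, iResult = l;
  for (; i1 <= m && i2 <= n; iResult++) {
    if (initList[i1] <= initList[i2]) mergedList[iResult] = initList[i1++];
    else                              mergedList[iResult] = initList[i2++];
  }
  // copy whichever range is not yet exhausted; total work is O(n - l + 1)
  std::copy(initList + i1, initList + m + 1, mergedList + iResult);
  std::copy(initList + i2, initList + n + 1, mergedList + iResult + (m - i1 + 1));
}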
Recursive Merge Sort
• Divide & Conquer
• Divide :
• Partition the n > 1 elements into two smaller instances.
• The first ⌈n/2⌉ elements define one of the smaller instances; the remaining elements define the second smaller instance.
• Recur:
• Each of the two smaller instances is sorted recursively.
• Conquer:
• The sorted smaller instances are combined using a
process called merge.
• Complexity is O(n log n).
50
Recursive Merge Sort
template <class T>
int rMergeSort(T* a, int* link, const int left, const int right)
{// a[left:right] is to be sorted; link[i] is initialized to 0 for all i.
 // rMergeSort returns the index of the first element of the sorted chain.
  if (left >= right) return left; // base case
  int mid = (left + right) / 2;
  return ListMerge(a, link,
          rMergeSort(a, link, left, mid),        // sort the left half
          rMergeSort(a, link, mid + 1, right));  // sort the right half
}
51
Recursive Merge Sort
Input list is (26, 5, 77, 1, 61, 11, 59, 15, 48, 19). 10 keys.
Sublist partitioning for recursive merge sort:
  (26 5 77 1 61 | 11 59 15 48 19)            split in half; each half is sorted recursively
  (26 5 77 | 1 61)  and  (11 59 15 | 48 19)  further splits
  (5 26 77) (1 61)  and  (11 15 59) (19 48)  sorted sublists after the recursive calls
  (1 5 26 61 77)  and  (11 15 19 48 59)      after merging within each half
  (1 5 11 15 19 26 48 59 61 77)              after the final merge
Recursive Merge Sort
• To eliminate the record copying that takes place when
Merge() is used to merge sorted sublists, associate an
integer pointer with each record.
• For this purpose, employ an integer array link[1:n] such
that link[i] gives the record that follows record i in the
sorted sublist.
• In case link[i] = 0, there is no next record.
• Record copying is replaced by link changes and the run
time of our sort function becomes independent of the
size s of a record.
• Iterative merge sort: O(s·n·log n) time, O(s·n) extra space
• Recursive merge sort with links: O(n log n) time, O(n) extra space
54
Recursive Merge Sort
template <class T>
int ListMerge(T* a, int* link, const int start1, const int start2)
{// merge the two sorted chains beginning at start1 and start2
 // link[0] serves as a temporary header; return the start of the merged chain
  int iResult = 0; // last record of the result chain
  int i1, i2;
  for (i1 = start1, i2 = start2; i1 && i2; )
    if (a[i1] <= a[i2]) {
      link[iResult] = i1;
      iResult = i1; i1 = link[i1]; }
    else {
      link[iResult] = i2;
      iResult = i2; i2 = link[i2]; }
  // attach the remaining records to the result chain
  if (i1 == 0) link[iResult] = i2;
  else link[iResult] = i1;
  return link[0];
}
55
Recursive Merge Sort
• Merge: as the recursion unwinds, pairs of sorted sublists are merged, e.g. for the 13-key example
    [8] [3] [13] [6] [2] [14] [9] [10] [7] [12]
  are merged into
    [3, 8] [6, 13] [2, 14] [5] [9, 10] [1] [7, 12] [4]
How Fast is Merge Sort ?
• Both worst and average cases
• log(n) merge passes are performed
• Each merge pass is O(n)
• Time complexity is O(nlog(n))
57
Variations of Merge Sort
• Natural Merge Sort
• Interpret the initial list as multiple sorted sublists (runs), each of which can contain more than one record
• Example: 5 2 6 1 4 3 is interpreted as the runs (5) (2 6) (1 4) (3) rather than as six sublists of size 1
58
C++ STL stable_sort Function
• Merge sort is stable (relative order of elements
with equal keys is not changed).
• Quick sort is not stable.
• STL’s stable_sort is a merge sort that switches to
insertion sort when segment size is small
59
7.6 Heap Sort Concept
• Interpret the input list as a complete binary tree
  • Example: 6 3 8 1 5 4 2 7 (positions 1 to 8) is interpreted as the tree with a[1] = 6 at the root and a[2i], a[2i+1] as the children of a[i]
• Heapify the tree to form a max heap: the array becomes 8 7 6 3 5 4 2 1
• Popping pass
  • Pop the top (maximum) record
  • The heap size shrinks by one, so the array position next to the heap becomes unused
  • Place the popped record in that position
• Popping passes are continued until the heap becomes empty, leaving the sorted list 1 2 3 4 5 6 7 8
• Heap Sort is a non-stable sort
Heap Sort Detail Steps
• Bottom-up heapification of 6 3 8 1 5 4 2 7 (Adjust is called for i = n/2, …, 1):
    6 3 8 1 5 4 2 7   →  6 3 8 7 5 4 2 1   (adjust the subtree rooted at position 4)
    6 3 8 7 5 4 2 1   →  6 7 8 3 5 4 2 1   (adjust the subtrees rooted at positions 3 and 2)
    6 7 8 3 5 4 2 1   →  8 7 6 3 5 4 2 1   (adjust the root; the max heap is complete)
• Popping passes (swap the root with the last heap element, then trickle the new root down):
    8 7 6 3 5 4 2 1    →  1 7 6 3 5 4 2 | 8   →  7 5 6 3 1 4 2 | 8
    7 5 6 3 1 4 2 | 8  →  2 5 6 3 1 4 | 7 8   →  6 5 4 3 1 2 | 7 8
    6 5 4 3 1 2 | 7 8  →  2 5 4 3 1 | 6 7 8   →  …  →  1 2 3 4 5 6 7 8
  ("|" separates the remaining heap from the records already placed in their final positions)
Heap Sort Algorithm
template <class T>
void HeapSort(T *a, const int n)
{// sort a[1:n] into non-decreasing order
  // a[n/2] is the parent of the last node, a[n]
  for (int i = n/2; i >= 1; i--)    // bottom-up heapification
    Adjust(a, i, n);                // make the tree rooted at i a max heap
  for (int i = n - 1; i >= 1; i--)  // popping passes
  {
    T t = a[i + 1]; a[i + 1] = a[1]; a[1] = t; // move the current maximum to the end
    Adjust(a, 1, i);                // restore the max heap on a[1:i]
  }
}
67
Heap Sort Algorithm
template <class T>
void Adjust(T *a, const int root, const int n)
{
  // the two subtrees of root are max heaps already;
  // same procedure as the trickling-down step
  T e = a[root];
  int j;
  // 2*root is root's left child
  for (j = 2*root; j <= n; j *= 2) {
    if (j < n && a[j] < a[j+1]) // j and j+1 are siblings
      j++;                      // make j the position of the larger child
    if (e >= a[j])
      break;
    a[j / 2] = a[j]; // move the jth record up the path
  }
  a[j / 2] = e;
}
68
How Fast is Heap Sort?
• Both worst and average cases
• Heapifying the tree
• n/2 adjust()’s are invoked, each is at most O(log(n))
• Converting the max heap to the list
• n pop()’s are invoked, each is at most O(log(n))
• Overall, the time complexity is O(nlogn)
69
Summary
[Plot: running time versus number of elements (0 to 5000) for Insertion Sort, Heap Sort, Merge Sort, and Quick Sort — Insertion Sort grows much faster than the other three; Quick Sort is the fastest]
71
Bucket Sort (Bin Sort)
• Let S be a sequence of n (key, element) items with keys in the range [0, N - 1]
• Bucket-sort uses the keys as indices into an auxiliary
array B of sequences (buckets or bins)
Phase 1: Empty sequence S by moving each item (k, o)
into its bucket B[k]
Phase 2: For i = 0, …, N - 1, move the items of bucket
B[i] to the end of sequence S
• Analysis:
• Phase 1 takes O(n) time
• Phase 2 takes O(n + N) time
Bucket-sort takes O(n + N) time
73
Algorithm bucketSort(S, N)
  Input: sequence S of (key, element) items with keys in the range [0, N - 1]
  Output: sequence S sorted by increasing keys

Example (N = 10):
  S: (7, d) (1, c) (3, a) (7, g) (3, b) (7, e)
  Phase 1 — distribute the items into buckets B[0..9]:
    B[1]: (1, c)    B[3]: (3, a) (3, b)    B[7]: (7, d) (7, g) (7, e)
  Phase 2 — concatenate the buckets back into S:
    S: (1, c) (3, a) (3, b) (7, d) (7, g) (7, e)
75
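A minimal sketch of this two-phase bucket sort (using std::vector buckets; the item type and names are illustrative):

#include <vector>
#include <utility>

// Bucket-sort a sequence of (key, element) pairs whose keys lie in [0, N-1].
// Phase 1 distributes the items into buckets; Phase 2 concatenates the buckets.
// Runs in O(n + N) time and is stable.
void BucketSort(std::vector<std::pair<int, char>>& S, const int N)
{
  std::vector<std::vector<std::pair<int, char>>> B(N); // buckets B[0..N-1]
  for (const auto& item : S)          // Phase 1: O(n)
    B[item.first].push_back(item);
  S.clear();
  for (int k = 0; k < N; k++)         // Phase 2: O(n + N)
    for (const auto& item : B[k])
      S.push_back(item);
}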
Bucket Sort Properties and Extensions
• Key-type Property
• The keys are used as indices into an array and cannot be arbitrary
objects
• No external comparator
• Stable Sort Property
• The relative order of any two items with the same key is preserved
after the execution of the algorithm
• Extensions
• Integer keys in the range [a, b]
• Put item (k, o) into bucket B[k - a]
• String keys from a set D of possible strings, where D has constant
size (e.g., names of the 50 U.S. states)
• Sort D and compute the rank r(k) of each string k of D in the sorted
sequence
• Put item (k, o) into bucket B[r(k)]
76
Lexicographic Order
• A d-tuple is a sequence of d keys (k1, k2, …, kd), where
key ki is said to be the i-th dimension of the tuple
• Example:
• The Cartesian coordinates of a point in space are a 3-tuple
• The lexicographic order of two d-tuples is recursively defined as follows:
    (x1, x2, …, xd) < (y1, y2, …, yd)
      ⇔  x1 < y1  ∨  [ x1 = y1  ∧  (x2, …, xd) < (y2, …, yd) ]
  i.e., the tuples are compared by the first dimension, then by the second dimension, etc.
77
Lexicographic-Sort
• Let Ci be the comparator that compares two tuples
by their i-th dimension
• Let stableSort(S, C) be a stable sorting algorithm
that uses comparator C
• Lexicographic-sort sorts a sequence of d-tuples in
lexicographic order by executing d times algorithm
stableSort, one per dimension
• Lexicographic-sort runs in O(dT(n)) time, where
T(n) is the running time of stableSort
78
Algorithm lexicographicSort(S)
  Input: sequence S of d-tuples
  Output: sequence S sorted in lexicographic order
  for i ← d downto 1
    stableSort(S, Ci)

Example:
  (7, 4, 6) (5, 1, 5) (2, 4, 6) (2, 1, 4) (3, 2, 4)     input
  (2, 1, 4) (3, 2, 4) (5, 1, 5) (7, 4, 6) (2, 4, 6)     after stableSort on the 3rd dimension
  (2, 1, 4) (5, 1, 5) (3, 2, 4) (7, 4, 6) (2, 4, 6)     after stableSort on the 2nd dimension
  (2, 1, 4) (2, 4, 6) (3, 2, 4) (5, 1, 5) (7, 4, 6)     after stableSort on the 1st dimension
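A minimal sketch of lexicographic-sort for 3-tuples, using std::stable_sort as the stable sorting algorithm (types and names are illustrative):

#include <algorithm>
#include <array>
#include <vector>

// Sort a sequence of d-tuples (here d = 3) lexicographically by running a
// stable sort once per dimension, from the least significant dimension (d)
// down to the most significant dimension (1).
void LexicographicSort(std::vector<std::array<int, 3>>& S)
{
  for (int i = 2; i >= 0; i--)                     // i = d-1 downto 0 (0-based)
    std::stable_sort(S.begin(), S.end(),
        [i](const std::array<int, 3>& x, const std::array<int, 3>& y)
        { return x[i] < y[i]; });                  // comparator Ci: compare dimension i
}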
Basics of Sorting on Several
Keys
• Terminology
• Keys: K1, K2, ..., Kr
• K1 is the most significant key
• Kr is the least significant key
• Comparison of multiple keys (lexicographic order)
  • The r-tuple (x1, x2, ..., xr) is less than or equal to the r-tuple (y1, y2, ..., yr) iff either one of the following two conditions is satisfied:
    (1) xi = yi for 1 ≤ i ≤ r (the tuples are equal)
    (2) xi = yi for 1 ≤ i < j, and xj < yj, for some 1 ≤ j ≤ r (strictly less)
81
Sorting on Several Keys
• Two popular sorting strategies
• MSD first sort
• Sort the cards into 4 piles using K1, one for each suit
• Sort each of the 4 piles using K2
• Cascade the sorted 4 piles with the order of (club,
diamond, heart, spade):
• 2♣,…,A♣,2♦…,A♦,2♥……A♥,2♠,….A♠
• LSD first sort
• Sort the cards using K2 into 13 piles
• Then cascade the 13 piles into a big pile with the order
of 2, 3, 4, ..., J, Q, K, A
• Finally, sort the big pile using a stable sorting algorithm
on the suit (K1) into 4 piles, then combine:
• 2♣,…,A♣,2♦…,A♦,2♥……A♥,2♠,….A♠
82
Comparing MSD & LSD
• LSD-first is simpler, as the piles and subpiles
obtained do not have to be sorted independently
(provided the sorting scheme used for sorting on
the keys Ki, 1 ≤ i < r, is stable)
• LSD-first sort is commonly chosen for computer
sorting
• MSD-first sort tends to incur more overhead
because of the need to independently sort multiple
groups:
• First sorting on suit – bin sort into 4 bins
• Then sorting on face – similar to insertion sort for each
bin
83
Radix Sort
• LSD or MSD sorting can be used to sort even when the
records have only one key K.
• Interpret the key as being composed of several subkeys; e.g., if 0 ≤ K ≤ 999, then K can be viewed as (K1, K2, K3) with 0 ≤ Ki ≤ 9
• The sort on each Ki can be carried out using a bin sort with 10
bins.
• In a Radix Sort, we decompose the sort key into several
subkeys using some radix r
• e.g., 365 is decomposed into 3, 6, and 5 with a radix = 10, so the
number of bins required is 10.
• Radix-sort is a specialization of lexicographic-sort that uses
bucket-sort as the stable sorting algorithm in each
dimension
• Radix-sort runs in O(d(n + N)) time
84
Radix Sort Example
• Sorting a sequence of 4-bit integers (binary
decomposition of key)
85
LSD-First Radix Sort Example
86
Prog. 7.5 LSD Radix Sort
template <class T>
int RadixSort(T *a, int *link, const int d, const int r, const int n)
{
  // Sort a[1:n] using a d-digit, radix-r radix sort.
  // digit(a[i], j, r) returns the j-th radix-r digit (counting from the left) of the key of a[i];
  // each digit is in the range [0, r).
  // Sorting within a digit is done using bin sort.
  // (the body of Program 7.5 is not reproduced in these slides)

Bin-sort trace on the least significant digit for the keys
  a[1:10] = 179 208 306 93 859 984 55 9 271 33  (r = 10):
  bin 1: 271    bin 3: 93, 33    bin 4: 984    bin 5: 55    bin 6: 306    bin 8: 208    bin 9: 179, 859, 9
  (f[i] and e[i] point to the two ends of the chain of records in bin i)
91
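Since the body of Program 7.5 is not shown, here is a minimal self-contained sketch of an LSD-first radix sort on non-negative integer keys (vector-based bins instead of the linked chains used by Program 7.5; names are illustrative):

#include <vector>

// LSD-first radix sort of non-negative integer keys with radix r and d digits.
// Each pass is a stable bin sort on one digit, least significant digit first.
// Runs in O(d(n + r)) time.
void LSDRadixSort(std::vector<int>& a, const int d, const int r)
{
  int divisor = 1; // selects the current digit: (key / divisor) % r
  for (int pass = 0; pass < d; pass++, divisor *= r) {
    std::vector<std::vector<int>> bin(r);
    for (int key : a)                       // distribute: stable within each bin
      bin[(key / divisor) % r].push_back(key);
    a.clear();
    for (int b = 0; b < r; b++)             // collect the bins in order 0..r-1
      for (int key : bin[b])
        a.push_back(key);
  }
}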
Comparison of Sort Methods
n Insert Heap Merge Quick
0 0.000 0.000 0.000 0.000
50 0.004 0.009 0.008 0.006
100 0.011 0.019 0.017 0.013
200 0.033 0.042 0.037 0.029
300 0.067 0.066 0.059 0.045
400 0.117 0.090 0.079 0.061
500 0.179 0.116 0.100 0.079
1000 0.662 0.245 0.213 0.169
2000 2.439 0.519 0.459 0.358
3000 5.390 0.809 0.721 0.560
4000 9.530 1.105 0.972 0.761
5000 15.935 1.410 1.271 0.970
94
Comparison of Sort Methods
[Plot: running time versus number of elements (0 to 5000) for Insertion Sort, Heap Sort, Merge Sort, and Quick Sort, plotting the data in the table above]
95
Summary of Sorting Algorithms
Algorithm        Time (worst case)   Time (average)        Notes
selection-sort   O(n^2)              O(n^2)                in-place; slow (good for small inputs)
insertion-sort   O(n^2)              O(n^2)                in-place; slow (good for small inputs)
quick-sort       O(n^2)              O(n log n) expected   in-place, randomized; fastest (good for large inputs)
heap-sort        O(n log n)          O(n log n)            in-place; fast (good for large inputs)
merge-sort       O(n log n)          O(n log n)            sequential data access; fast (good for huge inputs)
Summary
• Every sorting algorithm has its pros and cons
• No one-size-fits-all solution
• C++'s sort methods
• sort()
• Quick Sort that reverts to Heap Sort when the recursion depth
exceeds some threshold and to Insertion Sort when the
segment size becomes small
• stable_sort()
• Merge Sort that reverts to Insertion Sort when the segment
size becomes small
• partial_sort()
• Heap Sort with the ability to stop when only the first k elements need to be in sorted order
97
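A short usage sketch of the three STL functions above (the data values are illustrative):

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
  std::vector<int> v = {26, 5, 77, 1, 61, 11, 59, 15, 48, 19};

  std::vector<int> a = v;
  std::sort(a.begin(), a.end());                        // introsort: quick/heap/insertion

  std::vector<int> b = v;
  std::stable_sort(b.begin(), b.end());                 // merge sort + insertion sort; stable

  std::vector<int> c = v;
  std::partial_sort(c.begin(), c.begin() + 3, c.end()); // only the 3 smallest, in order

  for (int x : c) std::cout << x << ' ';                // prints 1 5 11 followed by the rest
  std::cout << std::endl;
  return 0;
}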