MA214 Lecture Slides 2
MA214 Lecture Slides 2
Week 2: The sorting problem, Insertion Sort, Loop invariants, Merge Sort, Big-O notation
MA214 – Algorithms and Data Structures
Julia Böttcher, London School of Economics and Political Science
Week 2: The sorting problem, Insertion Sort, Loop invariants, Merge Sort, Big-O notation
Recall
1
Handy Python notation
List: A = [5,4,7,10,2]
Length: len(A)
2
Insertion Sort
Insertion Sort: The Idea
3
Insertion Sort: In Python
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
# Example usage
A = [5, 4, 7]
InsertionSort(A)
4
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,4,7]
i =
current =
j =
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,4,7]
i =1
current =
j =
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,4,7]
i =1
current = 4
j =
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,4,7]
i =1
current = 4
j =0
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,4,7]
i =1
current = 4
j =0
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,5,7]
i =1
current = 4
j =0
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,5,7]
i =1
current = 4
j = -1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [5,5,7]
i =1
current = 4
j = -1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =1
current = 4
j = -1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =2
current = 4
j = -1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =2
current = 7
j = -1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =2
current = 7
j =1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =2
current = 7
j =1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =2
current = 7
j =1
5
Example
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
A = [4,5,7]
i =3
current = 7
j =1
5
Analysing this algorithm
Questions
6
Analysing this algorithm
Questions
Proving correctness
General approach:
6
The Correctness of Algorithms
Proof idea
7
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Base case: i = 1:
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Base case: i = 1:
! "
Proof: A[0] has only one element so the claim is trivially true.
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Inductive step: i → i + 1
Proof: Assume that at the beginning of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
We want to show that at the beginning of the (i + 1)-st iteration of the for loop
! "
A[0], . . . , A[i] is sorted.
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Inductive step: i → i + 1
Proof: Assume that at the beginning of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
We want to show that at the beginning of the (i + 1)-st iteration of the for loop
! "
A[0], . . . , A[i] is sorted.
Denote the elements A[0], . . . , A[i] at the beginning of the i-th iteration of the for
loop by A0 , . . . , Ai .
8
Proof by finite induction
The while loop moves elements Ai−1 , Ai−2 , . . . that are larger than Ai one
position to the right until it finds an element Aj such that Aj ! Ai at which point
it inserts Ai in position j + 1.
8
Proof by finite induction
The while loop moves elements Ai−1 , Ai−2 , . . . that are larger than Ai one
position to the right until it finds an element Aj such that Aj ! Ai at which point
it inserts Ai in position j + 1.
So, afterwards,
! "
A[0] , . . . , A[j] , A[j + 1] , A[j + 2] , . . . , A[i]
.
= [ A0 , . . . , Aj , Ai , Aj+1 , . . . , Ai−1 ]
# $% & # $% & # $% &
sorted by I.H. Aj !Ai <Aj+1 sorted by I.H.
! "
Hence at the beginning of the (i + 1)-th iteration of the for loop A[0], . . . , A[i] is
sorted.
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Termination: i = len(A):
8
Proof by finite induction
Induction hypothesis: At the start of the i-th iteration of the for loop
! "
A[0], . . . , A[i − 1] is sorted.
Termination: i = len(A):
Plugging i = len(A) into the induction hypothesis shows that the entire list
! "
A[0], . . . , A[len(A) − 1] is sorted at the end of the algorithm.
8
General technique
9
General technique
9
The Running Time of Algorithms
Analysing the running time
10
Analysing the running time
10
Analysing the running time (ctd.)
Here: Each line ! takes constant time c! > 0 per execution (“basic operation”).
for i in range(1,len(A)): c1
current = A[i] c2
j = i-1 c3
while j >= 0 and A[j] > current: c4
A[j+1] = A[j] c5
j -= 1 c6
A[j+1] = current c7
11
Analysing the running time (ctd.)
Here: Each line ! takes constant time c! > 0 per execution (“basic operation”).
for i in range(1,len(A)): c1
current = A[i] c2
j = i-1 c3
while j >= 0 and A[j] > current: c4
A[j+1] = A[j] c5
j -= 1 c6
A[j+1] = current c7
12
Worst case analysis (How large can ti get?)
(
k
Using this and that a= a(a+1)
2 we get:
a=1
n−1
' n−1
' n(n + 1) n2 n
ti = (i + 1) = −1= + −1
2 2 2
i=1 i=1
n−1
' n−1
' (n − 1)n n2 n
(ti − 1) = i= = − ,
2 2 2
i=1 i=1
12
Worst case analysis (How large can ti get?)
(
k
Using this and that a= a(a+1)
2 we get:
a=1
n−1
' n−1
' n(n + 1) n2 n
ti = (i + 1) = −1= + −1
2 2 2
i=1 i=1
n−1
' n−1
' (n − 1)n n2 n
(ti − 1) = i= = − ,
2 2 2
i=1 i=1
hence ) * ) 2 *
n2 n n n
T (n) = c1 · n + (c2 + c3 + c7 ) · (n − 1) + c4 · + − 1 + (c5 + c6 ) · − .
2 2 2 2
12
Worst case analysis (Conclusion)
13
Worst case analysis (Conclusion)
(for a = 4, b = 3, c = 0)
1,000
500
5 10 15 20 13
Worst case analysis (Conclusion)
(for a = 4, b = 3, c = 0)
1,000
500
This means: the worst-case running time
of InsertionSort “scales like” n2 .
5 10 15 20 13
Merge Sort
Recall: Divide-and-conquer
14
Recall: Divide-and-conquer
14
Applied to sorting
15
Applied to sorting
15
In Python
def MergeSort(A):
# base case, list of length 1 is sorted
if len(A) <= 1:
return A
16
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3 4
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3 4 5
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3 4 5 6
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3 4 5 6 7
17
Idea behind Merge
left: right:
3 4 6 8 1 2 5 7
result:
1 2 3 4 5 6 7 8
17
In Python
18
Correctness
Proof Idea
First use loop invariant technique to show that Merge is correct:
19
Correctness
Proof Idea
First use loop invariant technique to show that Merge is correct:
• At the start of each iteration of the while loop, result contains the
len(result) smallest elements of left and right in sorted order.
• Moreover (at the start of each iteration of the while loop), left[l] and
right[r] are the smallest elements of their lists that have not been copied
to result.
19
Correctness
Proof Idea
First use loop invariant technique to show that Merge is correct:
• At the start of each iteration of the while loop, result contains the
len(result) smallest elements of left and right in sorted order.
• Moreover (at the start of each iteration of the while loop), left[l] and
right[r] are the smallest elements of their lists that have not been copied
to result.
Then use finite induction over recursive calls to argue correctness of MergeSort.
def MergeSort(A):
if len(A) <= 1:
return A
m = len(A) // 2
left = MergeSort(A[:m])
right = MergeSort(A[m:])
20
Running Time
def MergeSort(A):
if len(A) <= 1: = c1
return A = c2
m = len(A) // 2 = c3
left = MergeSort(A[:m]) ! c4 · n/2 + T (n/2)
right = MergeSort(A[m:]) ! c4 · n/2 + T (n/2)
where TM erge (n) is the worst-case running time for merging two lists of total
length n.
20
Recurrence relation
TMerge (n) ! c6 · n + c7 .
Then there exist constants a > 0 and b such that T (n) satisfies the
recurrence relation:
21
Recurrence relation
TMerge (n) ! c6 · n + c7 .
Then there exist constants a > 0 and b such that T (n) satisfies the
recurrence relation:
T (n) ! 2 · T (n/2) + a · n + b
21
Recurrence relation
TMerge (n) ! c6 · n + c7 .
Then there exist constants a > 0 and b such that T (n) satisfies the
recurrence relation:
T (n) ! 2 · T (n/2) + a · n + b
+ ,
! 2 · 2 · T (n/4) + a · n/2 + b + a · n + b
+ + , ,
! 2 · 2 · 2 · T (n/8) + a · n/4 + b + a · n/2 + b + a · n + b ! . . .
! 2k · T (n/2k ) + k · a · n + (2k−1 + · · · + 2 + 1) · b .
21
Recurrence relation
TMerge (n) ! c6 · n + c7 .
Then there exist constants a > 0 and b such that T (n) satisfies the
recurrence relation:
21
Recurrence relation
TMerge (n) ! c6 · n + c7 .
Then there exist constants a > 0 and b such that T (n) satisfies the
recurrence relation:
800
600
400
(for a=4,b=-3,c=0,a’=10,b’=1)
200
2 4 6 8 10 12 14
6,000
4,000
10 20 30 40
4
(for a=4,b=-3,c=0,a’=10,b’=1)
• We argued correctness of Merge Sort and that its worst-case running time
“scales like” n · log2 (n).
• Merge Sort is significantly faster than InsertionSort even for moderately
sized inputs!
• Merge Sort (as described) unlike Insertion Sort requires additional storage
space.
• Insertion Sort is an in-place sorting algorithm
23
Big-O notation
25
O-notation
for example :
1042 + 100 =
0 (ut)
O(g(n)) = f (n) .
. 0
. s.t. 0 ! f (n) ! cg(n) for all n " n0 f(u)EO(f()
25
Ω-notation
26
Θ-notation
1,500 T (n)
a · n2 + b · n + c
1,000
a! ·n·log2 (n)+(2n−1)·b!
n0
500
n
5 10 15 20
28
Next week
29