[go: up one dir, main page]

0% found this document useful (0 votes)
3 views82 pages

12sorting

The lecture covers various sorting algorithms, including Bubble, Selection, Insertion, and Merge sort, emphasizing their time complexities. Bubble and Selection sort have a worst-case time complexity of O(n^2), while Insertion sort can achieve O(n) in the best case with nearly sorted data. Merge sort, a divide-and-conquer algorithm, offers improved efficiency with a time complexity of O(n log n).

Uploaded by

thirtythr33spam
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views82 pages

12sorting

The lecture covers various sorting algorithms, including Bubble, Selection, Insertion, and Merge sort, emphasizing their time complexities. Bubble and Selection sort have a worst-case time complexity of O(n^2), while Insertion sort can achieve O(n) in the best case with nearly sorted data. Merge sort, a divide-and-conquer algorithm, offers improved efficiency with a time complexity of O(n log n).

Uploaded by

thirtythr33spam
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 82

Data Structures

<Lecture 12: Sorting 2>

Sungmin Cha
New York University

10.16.2024
Outline
• Notice

• Review the previous lecture


– Bubble and selection sort

• Sorting algorithm
– Insert sort
– Merge sort

2
Outline
• Notice

• Review the previous lecture


– Bubble and selection sort

• Sorting algorithm
– Insert sort
– Merge sort

3
Notice
• The example of midterm exam
– I will share it in the next lecture

• Schedule

4
Outline
• Notice

• Review the previous lecture


– Bubble and selection sort

• Sorting algorithm
– Insert sort
– Merge sort

5
Sorting
• Why is sorting necessary? – for search!
– Computers typically handle data on the scale of millions or more
– Moreover, searching for data is constantly carried out
 Log in to a website, searching an item on Amazon, etc.

– Sorting data => binary search (O(log n))!

6
Bubble Sort
• Pseudo algorithm of bubble sort n-1 iterations

(n-1) + (n-2) + …. + 1 = n(n-1)/2

O(1)

• What is the time complexity (Big-O)?


– O(n(n-1)/2) = O(n^2)

7
Bubble Sort
• Time complexity of Bubble sort in the best case
– Example: the first iteration

1 12 15 23 48 74 128

74 < 128

– If we add a code to check whether any element swaps occur


during each iteration,
 It is possible to reduce the runtime when the array is already sorted

– Time complexity in the best case?


 n, n is the number of elements
8
Selection Sort
• Pseudo algorithm of selection sort
n-1 iterations Number of comparison
: (n-1) + (n-2) + … + 1 = n(n-1)/2

O(1)

• What is the time complexity (Big-O)?


– O(n(n-1)/2) = O(n^2)

9
Selection Sort
• Is there the best case in using selection sort?
Swap!

1 12 15 23 48 74 128

– No!
 It will find the largest element for n-1 times
 Namely, time complexity of selection sort is always n^2

10
Outline
• Notice

• Review the previous lecture


– Bubble and selection sort

• Sorting algorithm
– Insert sort
– Merge sort

11
Insertion Sort
• Process of insertion sort (ascending order)
– Starting from the first element

23 12 1 74 15 128 48

Sorted array

12
Insertion Sort
• Process of insertion sort (ascending order)
– Starting from the first element

23 12 1 74 15 128 48

Sorted array

Insert 12 to an appropriate location

13
Insertion Sort
• Process of insertion sort (ascending order)
– Starting from the first element

12 23 1 74 15 128 48

Sorted array

14
Insertion Sort
• Process of insertion sort (ascending order)
– Starting from the first element

12 23 1 74 15 128 48

Sorted array

Insert 1 to an appropriate location

15
Insertion Sort
• Process of insertion sort (ascending order)
– Starting from the first element

1 12 23 74 15 128 48

Sorted array

Insert 74 to an appropriate location

– And the repeat the same process.. 16


Insertion Sort
• Process of insertion sort (ascending order)
– The sorted array

1 12 15 23 48 74 128

Sorted array

17
Insertion Sort
• Pseudo algorithm of insertion sort

23 12 1 74 15 128 48

j=0 i=1

newItem = 12
18
Insertion Sort
• Pseudo algorithm of insertion sort

23 12 1 74 15 128 48

j=0 i=1
0 == j
newItem = 12
newItem < A[j] = 23 19
Insertion Sort
• Pseudo algorithm of insertion sort

23 23 1 74 15 128 48

j=0 A[j+1]

newItem = 12
20
Insertion Sort
• Pseudo algorithm of insertion sort

23 23 1 74 15 128 48

j=-1 i=1

newItem = 12
21
Insertion Sort
• Pseudo algorithm of insertion sort

12 23 1 74 15 128 48

j=0 i=1 And the repeat the same process..

newItem = 12
22
Insertion Sort
• Pseudo algorithm of insertion sort
n-1 iterations Maximum number
of comparison: i
(A[i] -> A[0])

• Time complexity of insertion sort (Big-O)?

23
Insertion Sort
• Pseudo algorithm of insertion sort
n-1 iterations Maximum number
of comparison: i
(A[i] -> A[0])

• Time complexity of insertion sort (Big-O)?


– Sum of maximum number of comparison in each i
– 1 + 2 + …. + (n-1) = n(n-1)/2 => O(n^2)
24
Insertion Sort
• Time complexity of insertion sort in the best case
j=0 i=1
1 12 15 23 48 74 128
1 < 12

– Time complexity of insertion sort in the base case?

25
Insertion Sort
• Time complexity of insertion sort in the best case
j=0 i=1
1 12 15 23 48 74 128
1 < 12

– Time complexity of insertion sort in the base case?


 n, n is the number of elements
26
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 23 15 48 128 74

27
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 23 15 48 128 74

Sorted array

Insert 15 to an appropriate location


(only one comparison, 23 > 15)

28
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 15 23 48 128 74

Sorted array

There is no insertion
because 23 < 45

29
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 15 23 48 128 74

Sorted array

There is no insertion
because 48 < 128

30
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 15 23 48 128 74

Sorted array

Insert 74 to an appropriate location


(only one comparison, 128 > 74)

– We can expect time complexity of near n

31
Insertion Sort
• Given the almost sorted array
– Insertion sort is generally considered to be fast on average
1 12 15 23 48 128 74

Sorted array

Insert 74 to an appropriate location


(only one comparison, 128 > 74)

– This can be useful for solving certain programming problems


 After a preprocessing step, the data is given as nearly sorted
 In this case, we can sort the data using the insertion sort
o We can expect time complexity of near O(n) 32
Summary
• Time complexity comparison
– n is the number of elements

Bubble Selection Insertion


Worse case n^2 n^2 n^2
Best case n n^2 n

33
Sorting
• Sorting algorithms
– There are many sorting algorithms but we will focus on learning
the below basic algorithms
 Basic sort (O(n^2))
o Bubble/Selection/Insertion sort
 Advanced sort (O(nlogn))
o Merge/Quick sort

34
Sorting
• Limitations of basic sorting algorithms
– Why is their time complexity O(n^2)?

35
Sorting
• Limitations of basic sorting algorithms
– Why is their time complexity O(n^2)?

1. Access to each element: n-1 iterations

36
Sorting
• Limitations of basic sorting algorithms
– Why is their time complexity O(n^2)?

1. Access to each element: n-1 iterations


2. Comparison: maximum n-1 iterations

In which part can we


reduce the time complexity?

37
Sorting
• Limitations of basic sorting algorithms
– Why is their time complexity O(n^2)?

O(log n) by divide
and conquer

1. Access to each element: n-1 iterations


2. Comparison: maximum n-1 iterations

In which part can we


reduce the time complexity?

38
Merge Sort
• Merge sort
– A divide-and-conquer algorithm devised by Jon von Neumann
 Breaking down the problem into two smaller subproblems
 Solving each of them separately
 And then combining the results to address the original problem
 This approach is typically implemented using recursive calls!

– Three steps of merge sort


 Divide: divided the input array into two equal-sized subarrays
o Continue dividing into two parts until # of elements becomes 1 or 0
 Conquer: sort the subarrays.
 Combine: merge the sorted subarrays into a single array

39
Merge Sort
• The overall process of merge sort (ascending order)
– Divide

21 10 12 25 128 48 32 3 A[]

21 10 12 25 128 48 32 3

21 10 12 25 128 48 32 3

21 10 12 15 128 48 32 3

40
Merge Sort
• The overall process of merge sort (ascending order)
– Divide

21 10 12 25 128 48 32 3 A[]

21 10 12 25 128 48 32 3

21 10 12 25 128 48 32 3

21 10 12 15 128 48 32 3

Continue dividing into two parts until the number of elements becomes 1 or 0

41
Merge Sort
• The overall process of merge sort Each part can be regarded
as being sorted already
– Conquer and combine
: making a sorted subarray in the process of merging
21 10 12 15 128 48 32 3

42
Merge Sort
• The overall process of merge sort (ascending order)
– Conquer and combine
: making a sorted subarray in the process of merging
21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

The actual sorting takes place during the merging phase

43
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

10 12 21 25 3 32 48 128

10 12 21 25 3 32 48 128 A[]

Two sorted subarrays in the previous step are the subarrays of A[]

44
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase
t, i, and j are the values for indexing

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 10 12 15 21 3 32 48 128
i j

p, q and r are the given values Make a temp array to store a sorted array

45
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 12 15 21 3 32 48 128
i j

A[i] = 10 > A[j] = 3

46
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 12 15 21 3 32 48 128
i j

A[i] = 10 > A[j] = 3

47
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Increase j, t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 12 15 21 3 32 48 128
i j

48
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 12 15 21 3 32 48 128
i j

A[i] = 10 < A[j] = 32

49
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 15 21 3 32 48 128
i j

A[i] = 10 < A[j] = 32

50
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Increase i, t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 15 21 3 32 48 128
i j

51
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 15 21 3 32 48 128
i j

A[i] = 12 < A[j] = 32

52
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

A[i] = 12 < A[j] = 32

53
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Increase i, t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

54
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

A[i] = 21 < A[j] = 32

55
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

A[i] = 21 < A[j] = 32

56
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Increase i, t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

57
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 3 32 48 128
i j

A[i] = 25 < A[j] = 32

58
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

A[i] = 25 < A[j] = 32

59
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Increase i, t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

60
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

If q < i
: it means we already checked
the entire values in the left subarray

61
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

How will the remaining portion of


the right subarray be handled?

62
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

Each subarray is already sorted.


Therefore, just move them to tmp[]

63
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

Reinit i and t
A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

64
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128
i j

Copy all values in tmp[] to A[]


using j and t

65
Merge Sort
• The overall process of merge sort (ascending order)
– Illustration of the merging phase

A[] tmp[]
p q r t
3 10 12 21 25 32 48 128 3 10 12 21 25 32 48 128
j
i

The sorted subarray (or final array)

66
Merge Sort
• The overall process of merge sort (ascending order)
– Conquer and combine
: making a sorted subarray in the process of merging
21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 22 3 32 48 128

3 10 12 21 25 32 48 128

The previous process happens in every merging phase!

67
Merge Sort
• Pseudo code of merge sort

– The details of the merge function are introduced in the previous


slides
68
Merge Sort
• Time complexity of merge sort (Big-O)
– When and where are the comparison and movement
operations performed in the merge sort?

– In the divide phase,


 No comparison or movement operations take place
– In the merging phase,
 Many comparison and movement operations occur
69
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 1) what is the number of merging steps?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

3 merging steps for n elements of the array => 3 = log 2 8


: 𝑘 = log 2 𝑛

70
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

Merging two subarrays of size 1


requires a maximum of 2 comparison

71
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

Since there are four pairs of subarrays,


a total of 8 comparisons are needed.

72
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

Merging two subarrays of size 2


requires a maximum of 4 comparison

73
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

Since there are two pairs of subarrays,


a total of 8 comparisons are needed.

74
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

Generalizing this, in a single merging step, a


maximum of n comparisons are performed

75
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons in the merging phase
 2) what is the number of comparisons in each merging step?

21 10 12 15 128 48 32 3

10 21 12 25 48 128 3 32

10 12 21 25 3 32 48 128

3 10 12 21 25 32 48 128

o The number of merging steps * the number of comparisons in each step


= 𝐥𝐨𝐠 𝟐 𝒏 ∗ 𝒏 = 𝒏𝒍𝒐𝒈𝟐 𝒏

76
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of movements in the merging phase
 3) what is the number of movements in each step?

A[] tmp[]

10 12 21 25 3 32 48 128 10 12 15 21 3 32 48 128

77
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of movements in the merging phase
 3) what is the number of movements in each step?

A[] tmp[]

10 12 21 25 3 32 48 128 3 10 12 21 25 32 48 128

Move n values in sorted order

78
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of movements in the merging phase
 3) what is the number of movements in each step?

A[] tmp[]

3 10 12 21 25 32 48 128 3 10 12 21 25 32 48 128

Then, move n values to A[]

o The number of merging steps * the number of movements in each step


= 𝐥𝐨𝐠 𝟐 𝒏 ∗ 𝟐𝒏 = 𝟐𝒏𝒍𝒐𝒈𝟐 𝒏

79
Merge Sort
• Time complexity of merge sort (Big-O)
– The number of comparisons + The number of movements
= 𝒏𝒍𝒐𝒈𝟐 𝒏 + 𝟐𝒏𝒍𝒐𝒈𝟐 𝒏 = 𝟑𝒏𝒍𝒐𝒈𝟐 𝒏

– In the Big-O notation,O 𝟑𝒏𝒍𝒐𝒈𝟐 𝒏 = O 𝒏𝒍𝒐𝒈(𝒏)

– Time complexity of merge sort in the best and average case


 Always, O 𝒏𝒍𝒐𝒈(𝒏)

80
Merge Sort
• Pros and Cons of merge sort
– Pros
 Stable sorting method
: less affected by data distribution. In other words, regardless of the input
data, the sorting time remains the same (O(nlogn))

– Cons
 It requires a temporary array (tmp[])
o Using additional memory
o Merge sort is not in-place sorting
: in-place sorting: a sort algorithm in which the sorted items occupy the
same storage as the original ones

81
Thank you!

E-mail: sungmin.cha@nyu.edu

82

You might also like