
UNIVERSITY OF SCIENCE

FACULTY OF INFORMATION TECHNOLOGY

CSC10004 : Data Structures and Algorithms

Lab 3 Project

GROUP 07
21127092 - Trần Hoàng Lâm
21127239 - Lê Xuân Đạt
21127157 - Dương Phước Sang
21127328 - Cao Tuấn Kiệt
Table of contents
1 Introduction 1

2 Algorithm presentation 1
2.1 Selection Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2.2 Insertion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.3 Bubble Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.4 Shaker Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.5 Shell Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.6 Heap Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.7 Merge Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.8 Quick Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.9 Counting Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.10 Radix Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.11 Flash Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 Experimental 10
3.1 Randomized data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Nearly sorted data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 Sorted data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.4 Reverse sorted data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4 Project organization and Programming notes 17


4.1 PROJECT ORGANIZATION . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2 PROGRAMMING NOTES . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

5 References 18
List of tables
1 Randomize Order Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Randomize Order Line Graph . . . . . . . . . . . . . . . . . . . . . . . . . 11
3 Randomize Order Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4 Nearly Sorted Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
5 Nearly Sorted Line Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
6 Nearly Sorted Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
7 Sorted Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
8 Sorted Line Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
9 Sorted Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
10 Reverse Sorted Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
11 Reverse Sorted Line Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
12 Reverse Sorted Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Data Structures and Algorithms

1 Introduction
Sorting is the process of arranging data into a meaningful order so that it can be
analyzed more effectively. A sorting algorithm is a set of instructions for putting
specific data in order. This report analyzes 11 sorting algorithms: Selection Sort,
Insertion Sort, Bubble Sort, Shaker Sort, Shell Sort, Heap Sort, Merge Sort, Quick
Sort, Counting Sort, Radix Sort, and Flash Sort. Data collected from experiments
with each algorithm for a given data order and data size are presented, together
with graphs for each data order. From these results, we identify the best algorithm
in each case.

2 Algorithm presentation
2.1 Selection Sort
• Idea:
The list is divided into two sublists, sorted and unsorted, by an imaginary wall.
Find the smallest element from the unsorted part and swap it with the element at
the beginning of the unsorted data. After each selection and swapping, the size of
the sorted region grows by 1 and the size of the unsorted region shrinks by 1.

• Step-by-step description:

∎ Step 1. Set the increment variable i = 1.

∎ Step 2. Find the smallest element in a[i..n].
∗ Swap the smallest element found above with a[i].
∗ Increase i by 1 and go to Step 3.
∎ Step 3. Check whether the end of the array is reached by comparing i with n.
∗ If i < n, go to Step 2 (the first i elements are in place).
∗ Otherwise, stop the algorithm.

• Complexity evaluations:
∎ The number of comparisons: (n − 1) + (n − 2) + ... + 1 = n(n−1)/2.
The inner loop executes the size of the unsorted part minus 1 times, and each
iteration performs one key comparison.


∎ The number of assignments: 3(n − 1).
The outer loop executes n − 1 times and invokes one swap (three assignments)
at each iteration.
∎ Together, the number of key operations that a selection sort of n elements
requires is: n(n−1)/2 + 3(n − 1) = n²/2 + 5n/2 − 3.

∎ Thus, selection sort is O(n²) in all cases.
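The procedure above can be sketched in C++ as follows (a minimal illustration with 0-based indexing; the function name and the use of std::vector are our own, not the API of the project's SortingAlgo.cpp):

```cpp
#include <vector>
#include <algorithm>

// Selection sort: grow the sorted region from the left by repeatedly
// selecting the minimum of the unsorted region a[i..n-1] and swapping
// it into position i (Steps 1-3 above, with 0-based indices).
void selectionSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int i = 0; i < n - 1; ++i) {
        int minIdx = i;
        for (int j = i + 1; j < n; ++j)   // Step 2: find the smallest element
            if (a[j] < a[minIdx]) minIdx = j;
        std::swap(a[i], a[minIdx]);       // one swap (three assignments) per pass
    }
}
```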

2.2 Insertion Sort


• Idea:
The list is divided into two sublists, sorted and unsorted, by an imaginary wall.
Take the first element of the unsorted region and place it into its correct position
in the sorted region. After each placement, the size of the sorted region grows by
1 and the size of the unsorted region shrinks by 1.

• Step-by-step description:

∎ Step 1. Set the increment variable i = 2.

∎ Step 2. Find the correct position pos in a[1..i−1] to insert a[i], i.e., where
a[pos − 1] ≤ a[i] ≤ a[pos].
∗ Set x = a[i].
∗ Shift the elements a[pos..i−1] one position to the right.
∗ Set a[pos] = x.
∗ Increase i by 1 and go to Step 3.
∎ Step 3. Check whether the end of the array is reached by comparing i with n.
∗ If i ≤ n, go to Step 2.
∗ Otherwise, stop the algorithm.

• Complexity evaluations:
∎ The number of comparisons: (n − 1) + (n − 2) + ... + 1 = n(n−1)/2.
The inner loop executes the size of the sorted part times, and each iteration
performs one key comparison.
∎ The number of assignments: n(n−1)/2 + 2(n − 1).
The inner loop moves data items at most as many times as it compares them.
The outer loop moves data items twice per iteration.


∎ Together, the number of key operations that an insertion sort of n elements
requires in the worst case is: n(n−1)/2 + n(n−1)/2 + 2(n − 1) = n² + n − 2.
∎ Worst case performance: O(n²)
∎ Best case performance: O(n)
∎ Average efficiency: O(n²)
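A compact C++ sketch of the steps above (0-based indices; the function name is our own, not taken from the project's source):

```cpp
#include <vector>

// Insertion sort: take the first element of the unsorted region,
// shift larger sorted elements one slot to the right, and drop it
// into its correct position (Steps 1-3 above).
void insertionSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int i = 1; i < n; ++i) {
        int x = a[i];                        // Step 2: x = a[i]
        int pos = i;
        while (pos > 0 && a[pos - 1] > x) {  // find the correct position
            a[pos] = a[pos - 1];             // shift a[pos..i-1] right
            --pos;
        }
        a[pos] = x;                          // insert x at its place
    }
}
```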

2.3 Bubble Sort


• Idea:
The list is divided into two sublists, sorted and unsorted, by an imaginary wall.
Compare adjacent elements in the unsorted region and exchange them if they are
out of order. Ordering successive pairs of elements causes the largest element to
bubble to the end of the array.

• Step-by-step description:

∎ Step 1. Set the increment variable i = 1.

∎ Step 2. Swap any pair of adjacent elements in a[1..n−i+1] that are in the
wrong order.
∗ Set the increment variable j = 1.
∗ If a[j] > a[j+1], swap a[j] with a[j+1].
∗ Increase j by 1 and repeat the comparison until the end of the unsorted
region is reached.
∗ After finishing the above loop, increase i by 1 and go to Step 3.
∎ Step 3. Check whether all elements are sorted by comparing i with n.
∗ If i < n, go to Step 2.
∗ Otherwise, stop the algorithm.

• Complexity evaluations:
∎ The number of comparisons: (n − 1) + (n − 2) + ... + 1 = n(n−1)/2.
The inner loop executes the size of the unsorted part minus 1 times, and each
iteration performs one key comparison.
∎ The number of assignments: each swap performs three assignments, so at
most 3n(n−1)/2 in the worst case.


∎ Together, the number of key operations that a bubble sort of n elements
requires in the worst case is: n(n−1)/2 + 3n(n−1)/2 = 2n(n − 1) = 2n² − 2n.
∎ Worst case performance: O(n²)
∎ Best case performance: O(n)
∎ Average efficiency: O(n²)

• Variant: Shaker sort.
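The three steps above translate directly into nested loops (a minimal sketch; names are ours):

```cpp
#include <vector>
#include <algorithm>

// Bubble sort: repeatedly compare adjacent pairs of the unsorted
// region a[0..n-i-1] and swap those in the wrong order; after each
// pass the largest remaining element has bubbled to the end.
void bubbleSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int i = 0; i < n - 1; ++i)              // Step 3: outer counter
        for (int j = 0; j + 1 < n - i; ++j)      // Step 2: adjacent pairs
            if (a[j] > a[j + 1])
                std::swap(a[j], a[j + 1]);
}
```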

2.4 Shaker Sort


• Idea:
The idea of Shaker Sort is similar to Bubble Sort, but after each pass toward the
end of the array, Shaker Sort traverses back toward the start of the array.

• Step-by-step description:

1. Start at the first element and compare two adjacent elements.

2. If the first of the two is greater than the second, swap them. If not, move
to the next element.
3. Repeat Step 2 until the end of the array is reached.
4. After this pass, if no swaps happened, the array is already sorted. If not,
move to Step 5, but only traverse up to the element just before the most
recently sorted element.
5. Traverse backward, starting from the element just before the most recently
sorted element.
6. If the first of the two is greater than the second (from left to right), swap
them. If not, move to the previous element.
7. Repeat Step 6 until the start of the array is reached.
8. After this pass, if no swaps happened, the array is already sorted. Otherwise,
return to Step 1, but only traverse up to the element just before the most
recently sorted element.

• Complexity evaluations:

∎ Worst case performance O(n²)


∎ Best case performance O(n)


∎ Average efficiency O(n²)
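The forward/backward passes above can be sketched like this (a minimal illustration; identifiers are ours):

```cpp
#include <vector>
#include <algorithm>

// Shaker (cocktail) sort: alternate a forward bubble pass with a
// backward one, shrinking the unsorted region from both ends; stop
// as soon as a full round makes no swap.
void shakerSort(std::vector<int>& a) {
    int left = 0, right = static_cast<int>(a.size()) - 1;
    bool swapped = true;
    while (swapped && left < right) {
        swapped = false;
        for (int j = left; j < right; ++j)    // forward pass (Steps 1-3)
            if (a[j] > a[j + 1]) { std::swap(a[j], a[j + 1]); swapped = true; }
        --right;                              // largest element is in place
        for (int j = right; j > left; --j)    // backward pass (Steps 5-7)
            if (a[j - 1] > a[j]) { std::swap(a[j - 1], a[j]); swapped = true; }
        ++left;                               // smallest element is in place
    }
}
```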

2.5 Shell Sort


• Idea:
An improvement of Insertion Sort. The main idea is to divide the original sequence
into subsequences whose elements are h positions apart. Applying insertion sort
to each subsequence quickly moves elements toward their correct positions (within
the subsequence). The distance h is then reduced to form new subsequences
(creating opportunities to compare an element with elements that previously were
not in the same subsequence), and sorting continues. The algorithm stops when
h = 1, which ensures that all elements of the original sequence are compared to
determine the final order.

• Step-by-step description:

1. Initialize the value h.


2. Divide the list into smaller sublists corresponding to h.
3. Sort these sublists using insertion sort.
4. Repeat until the list is sorted.

• Complexity evaluations:

∎ Worst case performance: at most O(n²), depending on the gap sequence
∎ Best case performance: O(n log n)
∎ Average efficiency: O(n log n)
∎ The space complexity of the shell sort algorithm is O(1).
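A short C++ sketch of the steps above, using the simple h → h/2 gap sequence (one common choice; the report does not fix a particular sequence, and the names are ours):

```cpp
#include <vector>

// Shell sort: gapped insertion sort for decreasing gaps h.
// When h reaches 1 this is plain insertion sort on a nearly
// sorted array, which finishes quickly.
void shellSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int h = n / 2; h >= 1; h /= 2)        // Steps 1 and 4: shrink the gap
        for (int i = h; i < n; ++i) {          // Steps 2-3: sort each h-subsequence
            int x = a[i], j = i;
            while (j >= h && a[j - h] > x) {   // shift h-apart elements right
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = x;
        }
}
```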

2.6 Heap Sort


• Idea:
Improve selection sort by retaining from each scan more information than just
the recognition of the single least item. Construct a max heap from the given


array and repeatedly move the largest element in the heap to the end of the array.
Elements are moved from the heap in descending order and placed into sequentially
decreasing positions in the array.

• Step-by-step description:

– Phase 1. Heap construction: construct a heap from the array.

– Phase 2. Maximum deletion: apply maximum-key deletion n − 1 times to the
remaining heap.
∗ Swap the first element and the last element of the heap.
∗ Decrease heapSize by 1: heapSize = n − 1.
∗ While heapSize > 1:
· Rebuild the heap at the first position of a[0..heapSize−1].
· Swap the first element and the last element of the heap.
· Decrease heapSize by 1: heapSize = heapSize − 1.

• Complexity evaluations:

– Worst case performance: O(n log₂ n)

– Best case performance: O(n log₂ n)
– Average case performance: O(n log₂ n)
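The two phases above can be sketched as follows (a minimal max-heap illustration; function names are ours):

```cpp
#include <vector>
#include <algorithm>

// Sift the root of the subtree at index i down until the max-heap
// property holds within a[0..heapSize-1].
void heapify(std::vector<int>& a, int heapSize, int i) {
    while (true) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < heapSize && a[l] > a[largest]) largest = l;
        if (r < heapSize && a[r] > a[largest]) largest = r;
        if (largest == i) break;
        std::swap(a[i], a[largest]);
        i = largest;
    }
}

void heapSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int i = n / 2 - 1; i >= 0; --i)      // Phase 1: heap construction
        heapify(a, n, i);
    for (int heapSize = n; heapSize > 1; ) {  // Phase 2: maximum deletion
        std::swap(a[0], a[heapSize - 1]);     // move the max to the end
        --heapSize;
        heapify(a, heapSize, 0);              // rebuild the heap at the root
    }
}
```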

2.7 Merge Sort


• Idea:
Recursively divide the array into halves, sort each half, and then merge the sorted
halves into one sorted array.

• Step-by-step description:

– Divide the unsorted list into n sublists, each containing one element (a list of
one element is considered sorted).
– Repeatedly merge sublists to produce new sorted sublists until there is only
one sublist remaining. This will be the sorted list.

• Complexity evaluations:


– Worst case performance: O(n log₂ n)

– Best case performance: O(n log₂ n)
– Average case performance: O(n log₂ n)

• Variants/improvements:

– In-place merge sort: this variant uses no extra memory while functioning just
like the version above.
– 3-way merge sort: instead of dividing the array into 2 parts like normal merge
sort, this variant divides the array into 3 parts, so the complexity is
O(n log₃ n).
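A recursive sketch of the divide-and-merge idea described above (a buffer-based merge, not the in-place variant; names are ours):

```cpp
#include <vector>

// Merge the two sorted halves a[lo..mid] and a[mid+1..hi]
// through a temporary buffer.
void merge(std::vector<int>& a, int lo, int mid, int hi) {
    std::vector<int> buf;
    buf.reserve(hi - lo + 1);
    int i = lo, j = mid + 1;
    while (i <= mid && j <= hi)
        buf.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i <= mid) buf.push_back(a[i++]);
    while (j <= hi)  buf.push_back(a[j++]);
    for (int k = 0; k < static_cast<int>(buf.size()); ++k)
        a[lo + k] = buf[k];
}

void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;               // one element: already sorted
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);              // sort each half...
    mergeSort(a, mid + 1, hi);
    merge(a, lo, mid, hi);              // ...then merge the sorted halves
}
```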

2.8 Quick Sort


• Idea:
Quick sort is based on the Divide and Conquer idea. It picks an element as the
pivot and partitions the array around it: the elements on the left side of the pivot
are smaller, and the elements on the right side are greater. The step-by-step
description below chooses the middle element as the pivot.

• Step-by-step description:

– Step 1. Pick the pivot p = a[k], where k = [(first+last)/2].

– Step 2. Identify pairs of elements that are not in their correct positions and
swap them.
∗ Set the increment variables i = first and j = last.
∗ While a[i] < p, increase i by 1. While a[j] > p, decrease j by 1.
∗ If i ≤ j, swap a[i] with a[j], then increase i by 1 and decrease j by 1.
∗ Go to Step 3.
– Step 3. Check whether the two smaller subarrays overlap.
∗ If i < j, go to Step 2.
∗ Otherwise, recursively go to Step 1 with a[first..j] and a[i..last].

• Complexity evaluations:

– Worst case performance: O(n²).


– Best case performance: O(n log₂ n).

– Average case performance: O(n log₂ n).

• Variants/improvements:

– 3-way quick sort: based on the Dutch National Flag algorithm, this variant
divides the array into 3 parts: elements smaller than the pivot, elements equal
to the pivot, and elements larger than the pivot.
– Quick sort with bubble sort and quick sort with insertion sort: the other
sorting algorithm is used when the number of elements in a partition meets a
specific condition.
– Enhanced quick sort: the pivot is moved to the end of the array to make the
sorting more efficient.
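The middle-pivot partition described above looks roughly like this in C++ (a standard two-pointer sketch with 0-based indices; an assumption on our part, not the project's exact code):

```cpp
#include <vector>
#include <algorithm>

// Quick sort with the middle element as pivot (Steps 1-3 above):
// scan i rightward past elements < p and j leftward past elements > p,
// swap misplaced pairs, then recurse on both remaining parts.
void quickSort(std::vector<int>& a, int first, int last) {
    if (first >= last) return;
    int p = a[(first + last) / 2];    // Step 1: pick the pivot
    int i = first, j = last;
    while (i <= j) {                  // Step 2: fix misplaced pairs
        while (a[i] < p) ++i;
        while (a[j] > p) --j;
        if (i <= j) std::swap(a[i++], a[j--]);
    }
    quickSort(a, first, j);           // Step 3: recurse on a[first..j]
    quickSort(a, i, last);            //         and a[i..last]
}
```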

2.9 Counting Sort


• Idea:
Counting sort works by iterating through the input, counting the number of times
each item occurs, and using those counts to compute an item’s index in the final,
sorted array. The implementation of counting sort here will only work with natural
numbers.

• Step-by-step description:

– Compute the frequency of each value: create an array f to store the frequencies
of the elements; its size must cover all possible values in the range [0, u],
where u is the maximum value. f[i] returns the frequency of key i.
– Distribution counting: accumulate the sums of frequencies, f[i] = f[i] + f[i−1];
f[i] then gives the last position of key i in the sorted array (if key i does
not occur, it inherits the position of the nearest smaller key).
– Distribute values to their final positions in an auxiliary array b (the
distribution values give the proper positions for the last occurrences of each
key in the sorted array).
– Copy array b[] back to array a[]: a[1..n] = b[1..n].

• Complexity evaluations:


– Find the maximum value u –> O(n)

– Perform frequency counting –> O(n)
– Accumulate the sums of frequencies –> O(u)
– Distribute values to their final positions –> O(n)
– Copy array b[] back to array a[] –> O(n)
=> Time complexity: O(n + u).
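The four steps above can be sketched like this (0-based indices; a backward pass keeps the distribution stable; names are ours):

```cpp
#include <vector>
#include <algorithm>

// Counting sort for natural numbers: count frequencies, turn them
// into distribution counts, distribute into b, then copy back.
void countingSort(std::vector<int>& a) {
    if (a.empty()) return;
    const int u = *std::max_element(a.begin(), a.end());   // maximum value
    std::vector<int> f(u + 1, 0), b(a.size());
    for (int x : a) ++f[x];                                // frequencies
    for (int i = 1; i <= u; ++i) f[i] += f[i - 1];         // distribution counts
    for (int k = static_cast<int>(a.size()) - 1; k >= 0; --k)
        b[--f[a[k]]] = a[k];                               // stable distribution
    a = b;                                                 // copy b back to a
}
```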

2.10 Radix Sort


• Idea:
Radix Sort is a distribution sorting algorithm; thus, it does not compare the
items of an array. The algorithm forms groups of data items and combines them
to sort a collection of data.

• Step-by-step description:

1. Define 10 queues, each representing a bucket for a digit from 0 to 9.

2. Consider the least significant digit of each number in the list to be sorted.
3. Insert each number into its respective queue based on the digit under
consideration.
4. Group all the numbers from queue 0 to queue 9 in the order they were
inserted into their queues.
5. Repeat from Step 3 based on the next least significant digit.
6. Repeat from Step 2 until all the numbers are grouped based on the most
significant digit.

• Complexity evaluations:

– Worst case performance: O(d · n), where d is the number of digits in the
maximum value

– Best case performance: O(d · n)
– Average case performance: O(d · n)

• Variants/improvements: the above algorithm uses a 2D array, which costs a lot
of memory when the number of elements in the given array is extremely large.
To fix this, a linked list may be used instead.
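An LSD sketch of the steps above, with vectors standing in for the ten bucket queues (names are ours, not the project's):

```cpp
#include <vector>
#include <algorithm>

// LSD radix sort: distribute numbers into ten digit buckets, regroup
// them in bucket order, and repeat for each more significant digit.
void radixSort(std::vector<int>& a) {
    if (a.empty()) return;
    const int maxVal = *std::max_element(a.begin(), a.end());
    for (long long exp = 1; maxVal / exp > 0; exp *= 10) {
        std::vector<std::vector<int>> buckets(10);     // digits 0..9
        for (int x : a)
            buckets[(x / exp) % 10].push_back(x);      // distribute by digit
        a.clear();
        for (const auto& q : buckets)                  // regroup queues 0..9
            a.insert(a.end(), q.begin(), q.end());
    }
}
```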


2.11 Flash Sort


• Idea:
Flash sort is a distribution sorting algorithm, like counting sort and radix sort.
Flash sort permutes elements into their appropriate buckets in place, which helps
it outperform its counterparts. After the permutation, the original array is
nearly sorted.

• Step-by-step description:

– Step 1. Find the minimum value and the index of the maximum value in the
array. If the value at the max index equals the min value, the array contains
only one distinct value and is therefore already sorted.
– Step 2. Declare m, the number of buckets. There are many ways to choose m,
but in this algorithm it equals int(0.45 · n), where n is the length of the
given array. Then create an array of size m.
– Step 3. Make a pass over the given array and count the number of elements
in each bucket.
– Step 4. Change the count in each bucket into a prefix sum, where
bucket[i] = bucket[i] + bucket[i−1].
– Step 5. Apply the classification formula to rearrange the given array so that
every element is in its correct bucket.
– Step 6. After the steps above, the array is nearly sorted, so insertion sort
does the rest of the work and returns the sorted array.

• Complexity evaluations:

∎ Worst case performance O(n²)


∎ Best case performance O(n)
∎ Average efficiency O(n)
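One way the six steps might look in C++, following Neubert's classic cycle-leader formulation (a sketch under our own naming and the m = 0.45n choice above; the project's implementation may differ):

```cpp
#include <vector>
#include <algorithm>

// Flash sort: classify into m buckets, permute in place via
// cycle-leader swaps, then finish with insertion sort (Step 6).
void flashSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    if (n <= 1) return;
    int minVal = a[0], maxIdx = 0;
    for (int i = 1; i < n; ++i) {                     // Step 1: min and max
        if (a[i] < minVal) minVal = a[i];
        if (a[i] > a[maxIdx]) maxIdx = i;
    }
    if (a[maxIdx] == minVal) return;                  // one distinct value only
    const int m = std::max(2, static_cast<int>(0.45 * n));  // Step 2
    std::vector<int> L(m, 0);
    const double c = (m - 1.0) / (a[maxIdx] - minVal);
    auto klass = [&](int x) {                         // classification formula
        return std::min(m - 1, static_cast<int>(c * (x - minVal)));
    };
    for (int i = 0; i < n; ++i) ++L[klass(a[i])];     // Step 3: count
    for (int k = 1; k < m; ++k) L[k] += L[k - 1];     // Step 4: prefix sums
    std::swap(a[0], a[maxIdx]);                       // seed the permutation
    int nmove = 0, j = 0, k = m - 1;                  // Step 5: permute in place
    while (nmove < n - 1) {
        while (j > L[k] - 1) k = klass(a[++j]);       // find a misplaced leader
        int flash = a[j];
        while (j != L[k]) {                           // follow the cycle
            k = klass(flash);
            std::swap(flash, a[--L[k]]);
            ++nmove;
        }
    }
    for (int i = 1; i < n; ++i) {                     // Step 6: insertion sort
        int x = a[i], p = i;
        while (p > 0 && a[p - 1] > x) { a[p] = a[p - 1]; --p; }
        a[p] = x;
    }
}
```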

3 Experimental
3.1 Randomized data
• TABLE :


Picture 1: Randomize Order Table

• LINE GRAPH :

Picture 2: Randomize Order Line Graph


• BAR CHART :

Picture 3: Randomize Order Chart

3.2 Nearly sorted data


• TABLE :

Picture 4: Nearly Sorted Table

• LINE GRAPH :


Picture 5: Nearly Sorted Line Graph

• BAR CHART :

Picture 6: Nearly Sorted Chart

3.3 Sorted data


• TABLE :


Picture 7: Sorted Table

• LINE GRAPH :

Picture 8: Sorted Line Graph


• BAR CHART :

Picture 9: Sorted Chart

3.4 Reverse sorted data


• TABLE :

Picture 10: Reverse Sorted Table

• LINE GRAPH :


Picture 11: Reverse Sorted Line Graph

• BAR CHART :

Picture 12: Reverse Sorted Chart


4 Project organization and Programming notes


4.1 PROJECT ORGANIZATION
There are a total of 5 .cpp files and 4 .h header files in the submission, each
with a distinct function:

• ProjectLab3.cpp: the main .cpp file, used to run the program and store the input
arguments.

• CommandLine.cpp and CommandLine.h: contain functions related to validating
the input command line, determining the type of command, and running the
command.

• DataGenerator.cpp and DataGenerator.h: contain functions to initialize an array


in a total of four different modes (random, sorted, nearly sorted and reversed).

• File.cpp and File.h: provide functions to work with files (create a file, write
data to a file, read data from a file).

• SortingAlgo.cpp and SortingAlgo.h: the 11 sorting algorithms are stored in
these files, with 11 functions that count comparisons and 11 overloaded
functions that do not.

To build an .exe file with g++:

• Open Command Prompt and locate the folder containing all those .cpp and .h
files.

• Type "g++ ProjectLab3.cpp CommandLine.cpp DataGenerator.cpp File.cpp
SortingAlgo.cpp" to begin building the .exe file.

• An a.exe file should be created; then, in Command Prompt, run that file
together with an appropriate command line.

4.2 PROGRAMMING NOTES


Some special libraries were used to make the program work:


• The <time.h> library (used in DataGenerator.cpp): this library was used to
create random factors while creating an array (e.g., a random array or nearly
sorted array).

• The <chrono> library (used in CommandLine.cpp): with this library, a timer
can be declared to measure the running time of a sorting algorithm.

• The <cstdlib> library (used in CommandLine.cpp): the atoi() function in this
library was used to validate a number input on the command line.

Some other notes:

• At the beginning of the program, a random array with the size of 10000 will always
be created and written down to input.txt.

• When the time counter starts, a very small amount of time is spent in the
switch case of the RunAlgorithm function (CommandLine.cpp). Overall, that
duration is insignificant and does not affect the final result.

• While checking the input size to distinguish between command 3 and command 1
(see the checkIsCommand function in CommandLine.cpp), if the input is longer
than 6 characters, or the converted number is bigger than 1000000 or smaller
than 1, it is treated as a file name for command 1.

5 References
• Flash sort source code: https://github.com/HaiDuc0147/sortingAlgorithm/blob/main

• Shell sort source code: https://www.geeksforgeeks.org/shellsort/

• Sorting algorithms' efficiency and their variants: http://home.westman.wave.ca/rhenry/

• Quick sort implementation and source code: Pr. Nguyen Thanh Phuong (2020),
"Lecture notes of CS163 – Data Structures", University of Science - Vietnam
National University HCM.

• All other sorting algorithms' source code: Lecturer Nguyen Ngoc Thao's lecture
notes: https://drive.google.com/drive/folders/16IIrM22-ubNenD4pjprCDGoa7Bn

• Selection sort: https://www.programiz.com/dsa/selection-sort
