ALG Part1 ASSIGNMENT
ALG Part1 ASSIGNMENT
Merge is a sorting method that uses the conquer and divide method. Merge sort is
similar to rapid sort algorithm in that it sorts the elements using a divide and conquer
strategy. It’s one of the most widely used and effective sorting algorithms. In addition, by
implementing merge sort, it splits the given list into the 2 equal halves, then calls itself for
each half and merges the 2 sorted parts. (Merge Sort Algorithm, n.d) Moreover, the sub-lists
will be divided into halves again and again until the list can no longer be divided. The first
stage involves splitting the list into separate parts known as sub-lists. The ‘merge’ stage
begins after that. The sub-lists are matched up here and sorted in the order specified.
(Holamyself, 2019)
By implementing merge sort, for bigger lists, it is faster because, unlike bubble and
insertion sort, it does not traverse through the entire list many times. It has a regular running
time and performs different portions in a stage at similar times. (Holamyself, 2019)
An example will be given to represent the steps to sort an array using the Merge Sort
algorithm.
Figure 2.1.2 Merge Sort Process
Take an unsorted array as an example to better grasp the merge sort algorithm;s
operating principle. Merge sort is easier to grasp by using examples. The elements of the
example array are 12, 31, 25, 8, 32, 17, 40, 42. First, divide the provided array into 2 equal
halves according to the merging sort. The merge sort needs to divide the list into equal
portions until it can not be divided any further. The first step of dividing is to divide the 8
elements of the array into 2 arrays of size 4. Next, the array needs to be divided again into
halves, divide them into new arrays of size 2 because they are of size 4. After dividing into
arrays of size 2, divide these arrays once more to obtain the atomic value that can not be
divided any further. (Working on Merge Sort Algorithms, n.d)
After dividing until not able to be divided again, combine the elements as the same
ways they were broken. When merging arrays, compare the elements of each array first, then
merge them in sorted order into a new array. First, compare between 12 and 31, which are
both in sorted order. Then compare between 25 and 8, and place 8 first in the list of 2 values,
followed by 25. After that, compare between 32 and 17 and rank them so that 17 comes first.
After that, compare between 42 and 40 and arrange them in a logical order. Compare the
arrays containing 2 data values in the following iteration of combining and combine them in
sorted order into a set of found values. After final merging, the array order will be 8, 12, 17,
25, 31, 32, 40, 42. (Working on Merge Sort Algorithms, n.d)
class Merge {
void merge(int x[], intleft, int mid, int right)
{
int x, y, z;
int a1 = mid - right + 1;
int a2 = left - mid;
x = 0;
y = 0;
z = left;
Heap Sort is one of the popular comparison-based sorting methods invented by J.W.J
Williams in 1964. (Kataria, 2018) It is an in-place algorithm that only requires a constant
amount of auxiliary memory at any time beyond the input as it sorts the array by swapping
the position of the elements within the original array. (Black & Martinez, 2019) The list of
elements to be sorted will be transformed into a Binary Heap data structure which refers to a
complete binary tree with heap properties. In a binary tree, each element of the array will be
represented as a node of the tree with the value stored in the node. Every node in the tree will
only have two descendants except leaves and be completely filled from the left up to a point.
The value of the parent node will be either greater or lesser than its child nodes. A Binary
Heap can be categorized into 2 types, which are Max Heap and Min Heap. A Max Heap
indicates the value of the root node must be greater than or equal to its children while the
value of the root node in a Min Heap must be lesser or equal to its children. The diagram
below shows an example of Max Heap and Min Heap. ("HeapSort - GeeksforGeeks", 2021)
In the Heap Sort algorithm, Max Heap will be used to sort the array in descending
order while Min Heap will be used to arrange the elements in the array ascendingly. ("Data
Structures Tutorials - Heap Sort Algorithm", n.d.) To sort an array by using Heap Sort
algorithms, the first phase is to create either a Max Heap or Min Heap depending on the
required order of the sorted array. The elements will be moved to its proper position
accordingly until it meets the heap property and this process is called Heapify. Once a heap
has been created, it comes to the second phase in which the root of the heap will be shifted to
the end node and the root element will be eliminated from the heap by moving it to the end of
the array. After removal, the heap will be heaped and the process in the second phase will be
repeated until all the elements are eliminated as well as the array is completely sorted. ("Heap
Sort Algorithm - InterviewBit", n.d.)
2.2.2 Steps to perform Heap Sort algorithm
An example will be given to represent the steps to sort an array using the Heap Sort
algorithm.
Assume the array to be rearranged consists of the elements: 3, 21, 18, 7, 9, 15, and the array
should be sorted in descending order. A binary tree will be created from the array elements.
In array-based representation, the element of index 0 will be the root element while the left
and right child of any element can be calculated by index 2*i+1 and 2*i +2. The array is
assumed to start at index 0. ("HeapSort - GeeksforGeeks", 2021) Based on the assumption,
the diagram below shows the array with its binary tree in visualization.
In phase 2, the root element of the Max Heap will be eliminated by swapping it with the end
node of the heap. The removed element will be shifted to the end of the array. In array-based
representation, the element will be extracted from the heap by reducing the maximum array
index of the heap by 1. After that, the heap will be heaped. The whole step will be conducted
repeatedly until all the elements have been covered and sorted into the array while the size of
the heap is greater than 1 in array-based representation. ("Heap Sort Algorithm -
InterviewBit", n.d.) The figure below represents the eliminating and sorting process.
Eliminating and Sorting Process
The table below shows the implementation of the Heap Sort algorithm written in Java.
// Driver code
public static void main(String args[]) {
int arr[] = { 3, 21, 18, 7, 9, 15 };
HeapSort hs = new HeapSort();
hs.sort(arr);
Assume that we have n numbers in our unsorted input list. We have to apply the mergesort
function to the left half of the input array (size n/2), as well as the right half of the input array
(size n/2). The merge process does not contain any nested loops, so it is executed with linear
complexity. If the array size is doubled, the merge time doubles. Therefore, a running time of
O(n) will be required to merge the subarrays. So based on this we get a recursive formula, but
we still need to compute the number of steps used by merge. We compare two numbers and
shift one number to the result array, so there are two steps for each number. Hence, the
number of steps taken is 2n steps in total. ("Introduction to Recursion and Merge Sort", 2021)
After that, we have to solve the recurrence formula. We can see that plugging in the formula
recursively k times gives us the formula as below.
As merge sort algorithm will always divide the array into two halves and takes linear time to
merge two halves, so the time complexity of best case scenario is O(n*log n). However, it can
be improved if the number of comparisons can be reduced while doing merge and sort. In
order to achieve the minimum number of comparisons, the input array has to be ascending or
descending order. The example array will be [0,1,2,3,4,5,6,7]. As a result, merge sort is faster
than random data since the number of comparisons is reduced.
The time complexity of worst case scenario is the same as the best case scenario which is
O(n*log n). This is because the merge sort divides the array in two halves at each stage,
giving it the log(n) component, while the other N component comes from the comparisons
made at each stage, whether the worst case or average scenario. As a result, combining it
becomes nearly O(n*log n). Whether it's the best-case scenario or the worst-case scenario, the
log(n) element is always present. And the rest n factor depends on the comparisons made,
which derive from both situations' comparisons. Therefore, the worst case of merge sort will
be the one where merge sort will have to do a maximum number of comparisons. The
example of the input array will be [4,0,6,2,5,1,7,3], where we can see that it has the
maximum number of comparisons. ("Introduction to Recursion and Merge Sort", 2021)
Time complexity of heap sort can be separately calculated by its two processes, which
are build max heap function and extract max function. Below table shows the time
complexity analysis for these two operations.
= 2n * c (c is constant)
The table 3.2.1.1 shows the time complexity of the buildMaxHeap function. It can see
that the outer loop will only execute n/2 times because it is not required to perform heapify
for leaf nodes. This is not the final time complexity because the heapify function will execute
recursively in a bubble down way. Therefore, a count of comparison for each level is shown
in the figure 3.2.1. Every heapify operation will perform two comparisons, one for choosing
the largest value among children and one for comparison with the parent’s value. The total
number of comparisons is counted in table 3.2.1.2 by summing all the comparisons at each
level. At last, the result for the example given is 2n * c, which c is a constant. Therefore, the
highest order of number of operations is n and it can be concluded that the time complexity of
the buildMaxHeap function is O(n).
3.2.1.2 Extract Max
//Extract Max
public void extraxtMax(int[] arr, int n, int i) { -> 1
for (int i = n - 1; i >= 0; i--) { -> n + 1
int temp = arr[0]; -> n
arr[0] = arr[i]; -> n
arr[i] = temp; -> n
heapify(arr, i, arr[0]); -> n
}
}
In conclusion, the time complexity of heap sort will always depend on the size of the
input array and the call of the heapify function in the algorithm. In the heap sort algorithm
the heapify can happen at two stages, the first stage is build max heap function and the
second stage is extract max function. The time complexities of build max heap and extract
max function are respectively O(n) and O(nlogn). Since O(nlogn) dominates the time
complexity, therefore, it only considers the heapify operation in extract max function when
calculating the overall time complexity.
3.2.2 Best Case
In order to achieve the best case, the call of heapify operations in extract max function
should be reduced to minimum. It can achieve the minimum call of n times when all the input
values are the same, which n is the input of array size. For example the input unsorted array is
[8, 8, 8, 8, 8, 8]. In this scenario, the heapify function will not be executed in a bubble down
way or in recursively. The heapify function will always execute exactly one time only when
the maximum value is extracted. Figure 3.2 shows the process of extracting the maximum
number stage in heap sort when the input values are identical.
Figure 3.2.2 - Eliminating and Sorting Process with Identical Input Array Values
Operation Count (n is the input array size)
Swap n times
Eliminate n times
From the figure 3.2, the process of extracting the maximum for identical input array
values can be seen. The heapify operation for each round of extracting will only execute
once. This is because the values of the child index are always the same with the parent value.
Therefore, it does not require heapify calling to maintain the max heap property and will not
recursively call the heapify again for its child. The total number of heapify function calls for
extract max function is n times instead of log n, which n is the number of input.
After the analysis above, it can be seen that the time complexity of heap sort can
actually reduce to linear time which is O(n) in the best case. The problem is the situation in
which all the input values are identical is unrealistic and impractical in the real world. That is
because nobody wants to sort an array with all the identical values. Therefore, this situation is
being ignored and the time complexity in the best case will still be maintained in O(nlogn)
where the input array’s values are not all identical.
Same with calculation of best case, since extract max function dominates the overall
time complexity, therefore, the worst case analysis only focuses on calculating the worst
scenario in the extract max function. The worst case happens when all the input is distinct
and it is required to perform a heapify function in a bubble down way for every call of extract
max process. To illustrate this, a perfect example is an input unsorted array with values [1, 2,
3, 4, 5, 6, 7, 8, 9]. After buildMaxHeap, it will become [9, 8, 7, 6, 5, 4, 3, 2, 1]. Everytime,
the maximum value is swapped with the minimum value at the last, it is required to compare
at every level until the bottom which will cause total time complexity of nlogn as discussed in
the section 3.1.1.2.
3.3 Suitability of Algorithms
· To explain which algorithms are most suitable in what scenarios, consider various
attributes
6. Kataria, A. (2018). Heap Sort (Introduction, Algorithm and Program using C).
Includehelp.com. Retrieved 17 October 2021, from
https://www.includehelp.com/data-structure-tutorial/heap-sort-introducntion-algorith
m-
7. Black, P., & Martinez, C. (2019). in-place sort. Xlinux.nist.gov. Retrieved 17 October
2021, from https://xlinux.nist.gov/dads/HTML/inplacesort.html.