5-sort
5-sort
Sort is a term used to describe the process of organizing data in a particular order
allowing for information to be found easier. For example, names and contact
information may be sorted in alphabetical order to allow the person looking for a
name to see if it is available.
Types of sorts
Over the history of the computer there have been many different algorithms
developed to help sort data. Below is a listing of the different types of sorts.
Bubble sort is a simple sorting algorithm. The algorithm starts at the beginning
of the data set. It compares the first two elements, and if the first is greater than the
second, then it swaps them. It continues doing this for each pair of adjacent elements to
the end of the data set. It then starts again with the first two elements, repeating until
no swaps have occurred on the last pass.
This algorithm's average and worst case performance is O(n2), so it is rarely
used to sort large, unordered, data sets. Bubble sort can be used to sort a small number
of items. Bubble sort may also be efficiently used on a list that is already sorted except
for a very small number of elements.
For example, if only one element is not in order, bubble sort will only take 2n
time. If two elements are not in order, bubble sort will only take at most 3n time.
Bubble sort average case and worst case are both O(n²).
The bubble sort is slow, but it's conceptually the simplest of the sorting algorithms.
a. Compare two players.
b. If the one on the left is taller, swap them.
c. Move one position right.
Efficiency of the Bubble sort
The bubble sort is the least efficient, but the simplest, sort.A bubble sort
can be best understood if the array to be sorted as a vertical column whose
smallest elements are at the top and whose largest elements are at the bottom. The
array is scanned from the bottom up, and two adjacent elements are interchanged
if they are found to be out of order with respect to each other.
First, items data[n-1] and data[n-2] are compared and swapped if they are
out of order. Next, data[n-2] and data[n-3] are compared, and their order is
changed if necessary and so on up to data[1] and data[0]. In this way, the smallest
element is bubbled up to the top of the array.
However, this is only the first pass through the array. The array is scanned
again comparing consecutive items and interchanging them when needed, but this
time, the last comparison is done for data[2] and data[1] because the smallest element
is already in its proper position, namely, position 0.
The second pass bubbles the second smallest element of the array up to the
second position, position 1. The procedure continues until the last pass when only one
comparison, data[n-1] with data[n-2], and possibly one interchange are performed.
Merge sort
This is a much more efficient sorting technique than the bubble
Sort and the insertion Sort at least in terms of speed. Although the bubble
and insertion sorts take O(N2) time, the mergesort is O(N*logN).
Merging Two Sorted Arrays
The heart of the mergesort algorithm is the merging of two already sorted arrays.
Merging two sorted arrays A and B creates a third array, C, that contains all the elements of
A and B, also arranged in sorted order. We’ll examine the merging process first; later we’ll
see how it’s used in sorting.
Imagine two sorted arrays. They don’t need to be the same size. Let’s say array A
has 4 elements and array B has 6 elements. They will be merged into an array C that starts
with 10 empty cells. In the below figure, the circled numbers indicate the order in which
elements are transferred from A and B to C.
The below table shows the comparisons necessary to determine which element will
be copied. The steps in the table correspond to the steps in the figure. Following each
comparison, the smaller element is copied to A.
Merging two arrays
Notice that because B is empty following step 8, no more comparisons are necessary; all the
remaining elements are simply copied from A into C.
The algorithm finds the minimum value, swaps it with the value in the first
position, and repeats these steps for the remainder of the list. It does no more than
n swaps, and thus is useful where swapping is very expensive.
The algorithm works as follows:
1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the second
position and advancing each time)
Effectively, the list is divided into two parts: the sublist of items already
sorted, which is built up from left to right and is found at the beginning, and the
sublist of items remaining to be sorted, occupying the remainder of the array.
To sort unordered list of elements, we remove its entries one at a time and then insert each of them into a
sorted part (initially empty):
void insertionSort(int[] ar)
{
for (int i=1; i ‹ ar.length; i++)
{
int index = ar[i]; int j = i;
while (j > 0 && ar[j-1] > index)
{
ar[j] = ar[j-1];
j--;
}
ar[j] = index;
}}
Example. We color a sorted part in green, and an unsorted part in black. Here is an insertion sort step by step.
We take an element from unsorted part and compare it with elements in sorted part, moving form right to left.
29, 20, 73, 34, 64
29, 20, 73, 34, 64
20, 29, 73, 34, 64
20, 29, 73, 34, 64
20, 29, 34, 73, 64
20, 29, 34, 64, 73
Efficiency of the Selection Sort
There are two indices i and j and at the very beginning of the partition
algorithm i points to the first element in the array and j points to the last one.
Then algorithm moves i forward, until an element with value greater
or equal to the pivot is found. Index j is moved backward, until an element
with value lesser or equal to the pivot is found. If i ≤ j then they are swapped
and i steps to the next position (i + 1), j steps to the previous one (j - 1).
Algorithm stops, when i becomes greater than j. After partition, all values
before i-th element are less or equal than the pivot and all values after j-th
element are greater or equal to the pivot.
Example. Sort {1, 12, 5, 26, 7, 14, 3, 7, 2} using quicksort.
we show here only the first recursion step, in order not to make example too long. But, in fact,
{1, 2, 5, 7, 3} and {14, 7, 26, 12} are sorted then recursively
Advantages:
One of the fastest algorithms on average
Does not need additional memory (the
sorting takes place in the array - this is called
in-place processing )
Analysis
– Average case: O(n * log2n)
– Worst case: O(n2)
• When the array is already sorted and the
smallest item is chosen as the pivot
– Quicksort is usually extremely fast in practice
– Even if the worst case occurs, quicksort’s
performance is acceptable for moderately large
arrays
The Heap Data Structure
• Def: A heap is a nearly complete binary tree with the following two
properties:
– Structural property: all levels are full, except possibly the last one, which
is filled from left to right
– Order (heap) property: for any node x
Parent(x) ≥ x
24
Heap Types
• Max-heaps (largest element at root), have the max-heap
property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
25
Adding/Deleting Nodes
• New nodes are always inserted at the bottom
level (left to right)
• Nodes are removed from the bottom level (right
to left)
26
Operations on Heaps
• Maintain/Restore the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
• Priority queues
27
Maintaining the Heap Property
• Suppose a node is smaller than a
child
– Left and Right subtrees of i are max-heaps
• To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children
28
Example
MAX-HEAPIFY(A, 2, 10)
A[2] A[4]
A[2] violates the heap property A[4] violates the heap property
A[4] A[9]
30
MAX-HEAPIFY Running Time
• Intuitively:
- h
-
- 2h
- O(h)
31
Building a Heap
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(n/2+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and n/2
Alg: BUILD-MAX-HEAP(A) 1
1. n = length[A] 4
3. do MAX-HEAPIFY(A, i, n) 8
2 9 10
16 9 10
14 8 7
A: 4 1 3 2 16 9 10 14 8 7
32
Example: A 4 1 3 2 16 9 10 14 8 7
4 4 4
2 3 2 3 2 3
1 3 1 3 1 3
4 5 6 7 4 5 6 7 4 5 6 7
8
2 9 10
16 9 10 8 2 9 10
16 9 10 8 14 9 10
16 9 10
14 8 7 14 8 7 2 8 7
i=2 i=1
1 1 1
4 4 16
2 3 2 3 2 3
1 10 16 10 14 10
4 5 6 7 4 5 6 7 4 5 6 7
8
14 9 10
16 9 3 8
14 9 10
7 9 3 8
8 9 10
7 9 3
2 8 7 2 8 1 2 4 1
33
Running Time of BUILD MAX HEAP
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← n/2 downto 1
O(n)
3. do MAX-HEAPIFY(A, i, n) O(lgn)
34
Example: A=[7, 4, 3, 1, 2]
MAX-HEAPIFY(A, 1, 1)
35