
Sorting

Sorting is the process of organizing data in a particular order so that information
can be found more easily. For example, names and contact information may be
sorted in alphabetical order so that a person looking for a name can quickly see
whether it is present.

Types of sorts
Over the history of computing, many different algorithms have been developed
to sort data. Below is a listing of the different types of sorts:

Bubble sort - Heap sort - Insertion sort
Merge sort - Quick sort - Selection sort
Shell sort - Tree sort
Bubble Sort

Bubble sort is a simple sorting algorithm. The algorithm starts at the beginning
of the data set. It compares the first two elements, and if the first is greater than the
second, it swaps them. It continues doing this for each pair of adjacent elements to
the end of the data set. It then starts again with the first two elements, repeating until
no swaps have occurred on the last pass.
This algorithm's average-case and worst-case performance is O(n²), so it is rarely
used to sort large, unordered data sets. Bubble sort can be used to sort a small number
of items. It may also be used efficiently on a list that is already sorted except
for a very small number of elements.
For example, if only one element is out of order, bubble sort takes only about 2n
time; if two elements are out of order, it takes at most about 3n time.
The bubble sort is slow, but it is conceptually the simplest of the sorting algorithms.
Picture the elements as players standing in a line, to be arranged by height:
a. Compare two adjacent players.
b. If the one on the left is taller, swap them.
c. Move one position right.
Efficiency of the Bubble sort
The bubble sort is the least efficient, but the simplest, sort. A bubble sort
can best be understood if the array to be sorted is viewed as a vertical column whose
smallest elements are at the top and whose largest elements are at the bottom. The
array is scanned from the bottom up, and two adjacent elements are interchanged
if they are found to be out of order with respect to each other.
First, items data[n-1] and data[n-2] are compared and swapped if they are
out of order. Next, data[n-2] and data[n-3] are compared, and their order is
changed if necessary, and so on up to data[1] and data[0]. In this way, the smallest
element is bubbled up to the top of the array.
However, this is only the first pass through the array. The array is scanned
again, comparing consecutive items and interchanging them when needed, but this
time the last comparison is done for data[2] and data[1], because the smallest element
is already in its proper position, namely, position 0.
The second pass bubbles the second smallest element of the array up to the
second position, position 1. The procedure continues until the last pass, when only one
comparison, data[n-1] with data[n-2], and possibly one interchange are performed.
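
As a minimal sketch (not a listing from the original text), the bottom-up pass
described above can be written in Java as follows; the swapped flag provides the
early exit that makes bubble sort cheap on nearly sorted input:

void bubbleSort(int[] data)
{
    int n = data.length;
    for (int i = 0; i < n - 1; i++)           // pass i fixes position i
    {
        boolean swapped = false;
        for (int j = n - 1; j > i; j--)       // scan from the bottom up
        {
            if (data[j] < data[j-1])          // out of order: interchange
            {
                int tmp = data[j];
                data[j] = data[j-1];
                data[j-1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) break;                  // no swaps: array is sorted
    }
}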
Merge sort
This is a much more efficient sorting technique than the bubble
sort and the insertion sort, at least in terms of speed. Although the bubble
and insertion sorts take O(N²) time, the mergesort is O(N log N).
Merging Two Sorted Arrays
The heart of the mergesort algorithm is the merging of two already sorted arrays.
Merging two sorted arrays A and B creates a third array, C, that contains all the elements of
A and B, also arranged in sorted order. We’ll examine the merging process first; later we’ll
see how it’s used in sorting.
Imagine two sorted arrays. They don't need to be the same size. Let's say array A
has 4 elements and array B has 6 elements. They will be merged into an array C that starts
with 10 empty cells. In the figure below, the circled numbers indicate the order in which
elements are transferred from A and B to C.
The table below shows the comparisons necessary to determine which element will
be copied. The steps in the table correspond to the steps in the figure. Following each
comparison, the smaller element is copied to C.
(Table: Merging two arrays, showing the comparison made at each step.)

Notice that because B is empty following step 8, no more comparisons are necessary; all the
remaining elements are simply copied from A into C.
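
A minimal Java sketch of this merging step (the names a, b, and c mirror the
arrays A, B, and C above):

int[] merge(int[] a, int[] b)
{
    int[] c = new int[a.length + b.length];
    int i = 0, j = 0, k = 0;
    while (i < a.length && j < b.length)      // both arrays still have elements
    {
        if (a[i] <= b[j]) c[k++] = a[i++];    // copy the smaller element to c
        else              c[k++] = b[j++];
    }
    while (i < a.length) c[k++] = a[i++];     // b exhausted: copy the rest of a
    while (j < b.length) c[k++] = b[j++];     // a exhausted: copy the rest of b
    return c;
}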

Merging n elements requires at most n - 1 comparisons, so the complexity of
mergesort satisfies the recurrence T(n) = 2*T(n/2) + n - 1.


(Figures: merging larger and larger arrays; handling an array whose size is not a power of 2.)
The mergesort is a divide-and-conquer algorithm:
1. Divides the array near its midpoint
2. Sorts the two half-arrays by recursive calls
3. Merges the two sorted half-arrays to obtain a fully sorted array
Mergesort has a time complexity of O(n log₂ n) in all cases.
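
Putting the steps together, here is a minimal recursive sketch that reuses the
merge routine above (it copies the two halves for clarity; an in-place variant
with a single work array is the usual optimization):

int[] mergeSort(int[] a)
{
    if (a.length <= 1) return a;                       // base case: already sorted
    int mid = a.length / 2;                            // divide near the midpoint
    int[] left  = java.util.Arrays.copyOfRange(a, 0, mid);
    int[] right = java.util.Arrays.copyOfRange(a, mid, a.length);
    return merge(mergeSort(left), mergeSort(right));   // sort halves, then merge
}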
Selection Sort

The algorithm finds the minimum value, swaps it with the value in the first
position, and repeats these steps for the remainder of the list. It does no more than
n swaps, and thus is useful where swapping is very expensive.
The algorithm works as follows:
1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the second
position and advancing each time)

Effectively, the list is divided into two parts: the sublist of items already
sorted, which is built up from left to right and is found at the beginning, and the
sublist of items remaining to be sorted, occupying the remainder of the array.
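
A minimal Java sketch of these three steps (written in the same style as the
insertion sort listing that follows):

void selectionSort(int[] ar)
{
    for (int i = 0; i < ar.length - 1; i++)
    {
        int min = i;                          // index of the smallest value so far
        for (int j = i + 1; j < ar.length; j++)
            if (ar[j] < ar[min]) min = j;     // find the minimum of the remainder
        int tmp = ar[i];                      // swap it into the first unsorted position
        ar[i] = ar[min];
        ar[min] = tmp;
    }
}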
Insertion Sort

The insertion sort takes a different approach: to sort an unordered list of elements,
we remove its entries one at a time and insert each of them into a sorted part
(initially empty):
void insertionSort(int[] ar)
{
    for (int i = 1; i < ar.length; i++)
    {
        int value = ar[i];                    // next element to insert
        int j = i;
        while (j > 0 && ar[j-1] > value)      // shift larger elements right
        {
            ar[j] = ar[j-1];
            j--;
        }
        ar[j] = value;                        // insert into its sorted position
    }
}
Example. The sorted part is the prefix of the list. Here is an insertion sort step by step:
we take an element from the unsorted part and compare it with the elements in the sorted
part, moving from right to left.
29, 20, 73, 34, 64   (sorted part: 29)
29, 20, 73, 34, 64   (20 is compared with 29)
20, 29, 73, 34, 64   (20 inserted; 73 stays in place)
20, 29, 73, 34, 64   (34 is compared with 73, then 29)
20, 29, 34, 73, 64   (34 inserted; 64 is compared with 73)
20, 29, 34, 64, 73   (64 inserted; the list is sorted)
Efficiency of the Selection Sort

The selection sort performs the same number of comparisons as the
bubble sort: N*(N-1)/2. For 10 data items, this is 45 comparisons. However, 10
items require fewer than 10 swaps. With 100 items, 4,950 comparisons are
required, but fewer than 100 swaps. For large values of N, the comparison
times will dominate, so we would have to say that the selection sort runs in
O(N²) time, just as the bubble sort did.
However, it is faster because there are so few swaps. For smaller values
of N, the selection sort may in fact be considerably faster, especially if the
swap times are much larger than the comparison times.
Quick sort

Quicksort is one of the most common sorting algorithms for sequential
computers because of its simplicity, low overhead, and optimal average
complexity.
• Quicksort selects one of the entries in the sequence to be the pivot and divides
the sequence into two parts: all elements less than or equal to the pivot are placed
before the pivot, and all elements greater than the pivot are placed after it.
• The process is recursively applied to each of the sublists.
• Quicksort operates in O(N log N) time on average.
• The base case of the recursion is a list of size zero or one, which never needs to be sorted.
Quicksort consists of two phases:
- Partition phase.
- Sort phase.
Algorithm
The divide-and-conquer strategy is used in quicksort. The recursion step is
described below:
1. Choose a pivot value. We take the value of the middle element as the pivot,
but it can be any value in the range of the values being sorted, even one that is
not present in the array.
2. Partition. Rearrange the elements in such a way that all elements less
than the pivot go to the left part of the array and all elements greater than
the pivot go to the right part. Values equal to the pivot can stay in any part
of the array. Notice that the array may be divided into unequal parts.
3. Sort both parts. Apply the quicksort algorithm recursively to the left and
the right parts.
Partition algorithm in detail

There are two indices, i and j. At the very beginning of the partition
algorithm, i points to the first element in the array and j points to the last one.
The algorithm then moves i forward until an element with value greater
than or equal to the pivot is found. Index j is moved backward until an element
with value less than or equal to the pivot is found. If i ≤ j, the elements are swapped,
i steps to the next position (i + 1), and j steps to the previous one (j - 1).
The algorithm stops when i becomes greater than j. After partition, all values
before the i-th element are less than or equal to the pivot, and all values after the
j-th element are greater than or equal to the pivot.
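
A Java sketch of quicksort with this two-index partition (the middle element is
taken as the pivot, matching step 1 above):

void quickSort(int[] ar, int left, int right)
{
    int i = left, j = right;
    int pivot = ar[(left + right) / 2];       // value of the middle element
    while (i <= j)
    {
        while (ar[i] < pivot) i++;            // move i forward to an element >= pivot
        while (ar[j] > pivot) j--;            // move j backward to an element <= pivot
        if (i <= j)                           // swap, then step both indices
        {
            int tmp = ar[i];
            ar[i] = ar[j];
            ar[j] = tmp;
            i++;
            j--;
        }
    }
    if (left < j)  quickSort(ar, left, j);    // sort the left part
    if (i < right) quickSort(ar, i, right);   // sort the right part
}

The initial call for an array a is quickSort(a, 0, a.length - 1).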
Example. Sort {1, 12, 5, 26, 7, 14, 3, 7, 2} using quicksort.

We show only the first recursion step here, so as not to make the example too long; in fact,
the resulting parts {1, 2, 5, 7, 3} and {14, 7, 26, 12} are then sorted recursively.
Advantages:
• One of the fastest algorithms on average.
• Does not need additional memory (the sorting takes place in the array; this is
called in-place processing).

Analysis
– Average case: O(n log₂ n)
– Worst case: O(n²)
• occurs when the array is already sorted and the smallest item is chosen
as the pivot
– Quicksort is usually extremely fast in practice.
– Even if the worst case occurs, quicksort's performance is acceptable for
moderately large arrays.
The Heap Data Structure
• Def: A heap is a nearly complete binary tree with the following two
properties:
– Structural property: all levels are full, except possibly the last one, which
is filled from left to right
– Order (heap) property: for any node x
Parent(x) ≥ x

From the heap property, it follows that:
"The root is the maximum element of the heap!"
(The figure shows an example max-heap: root 8, children 7 and 4, and leaves 5 and 2.)
A heap is a binary tree that is filled in level order (top to bottom, left to right).
Array Representation of Heaps
• A heap can be stored as an array A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[⌊i/2⌋]
– Heapsize[A] ≤ length[A]
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves

Heap Types
• Max-heaps (largest element at root) have the max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root) have the min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]
Adding/Deleting Nodes
• New nodes are always inserted at the bottom
level (left to right)
• Nodes are removed from the bottom level (right
to left)

Operations on Heaps
• Maintain/Restore the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
• Priority queues

Maintaining the Heap Property
• Suppose a node is smaller than a
child
– Left and Right subtrees of i are max-heaps
• To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children

Example
MAX-HEAPIFY(A, 2, 10)

A[2] violates the heap property, so exchange A[2] ↔ A[4].
Then A[4] violates the heap property, so exchange A[4] ↔ A[9].
Heap property restored.
Maintaining the Heap Property
• Assumptions:
– Left and Right subtrees of i are max-heaps
– A[i] may be smaller than its children

Alg: MAX-HEAPIFY(A, i, n)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ n and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ n and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.        MAX-HEAPIFY(A, largest, n)
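
A Java sketch of the same procedure using 0-based array indices (so the children
of i are 2i+1 and 2i+2 rather than 2i and 2i+1):

void maxHeapify(int[] a, int i, int n)
{
    int l = 2 * i + 1;                        // left child
    int r = 2 * i + 2;                        // right child
    int largest = i;
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i)
    {
        int tmp = a[i];                       // exchange A[i] <-> A[largest]
        a[i] = a[largest];
        a[largest] = tmp;
        maxHeapify(a, largest, n);            // continue down the tree
    }
}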
MAX-HEAPIFY Running Time
• Intuitively: the violating value moves down one level per recursive call, with at
most two comparisons at each of at most h levels, so the work done is O(h).

• Running time of MAX-HEAPIFY is O(lg n)

• Can be written in terms of the height h of the heap, as being O(h)
– Since the height of the heap is ⌊lg n⌋
Building a Heap
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋

Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1
3.    do MAX-HEAPIFY(A, i, n)

(The figure shows the array below drawn as a complete binary tree, with nodes
numbered 1 through 10.)

A: 4 1 3 2 16 9 10 14 8 7
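
The corresponding 0-based Java sketch (in a 0-based array the leaves occupy
positions n/2 .. n-1, so the loop starts at n/2 - 1):

void buildMaxHeap(int[] a)
{
    int n = a.length;
    for (int i = n / 2 - 1; i >= 0; i--)      // every node below i already roots a heap
        maxHeapify(a, i, n);
}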
Example: A = 4 1 3 2 16 9 10 14 8 7

(The figure traces BUILD-MAX-HEAP on this array, drawing the tree after each call
MAX-HEAPIFY(A, i, n) for i = 5, 4, 3, 2, 1; the resulting max-heap is
A = 16 14 10 8 7 9 3 2 4 1.)
Running Time of BUILD-MAX-HEAP

Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1           O(n) iterations
3.    do MAX-HEAPIFY(A, i, n)        O(lg n) per call

⇒ Running time: O(n lg n)

• This is not an asymptotically tight upper bound: charging each call O(h) for the
height h of its node and summing over all nodes gives O(n).
Example: A = [7, 4, 3, 1, 2]

(HEAPSORT trace: A is already a max-heap; at each step the root is swapped with the
last element of the current heap, the heap shrinks by one, and MAX-HEAPIFY is
re-applied to the root.)

MAX-HEAPIFY(A, 1, 4)   MAX-HEAPIFY(A, 1, 3)   MAX-HEAPIFY(A, 1, 2)   MAX-HEAPIFY(A, 1, 1)
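
A 0-based Java sketch of HEAPSORT assembled from the two routines above: build a
max-heap, then repeatedly swap the maximum into its final position and re-heapify
the shrunken heap:

void heapSort(int[] a)
{
    buildMaxHeap(a);
    for (int end = a.length - 1; end > 0; end--)
    {
        int tmp = a[0];                       // move the current maximum to the end
        a[0] = a[end];
        a[end] = tmp;
        maxHeapify(a, 0, end);                // restore the heap on a[0 .. end-1]
    }
}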
