Unit - 4 Sorting

Sorting is the process of arranging data in a specific order, which optimizes data searching and enhances readability. Common sorting algorithms include Bubble Sort, Selection Sort, Insertion Sort, Quick Sort, and Merge Sort, each with varying complexities and applications. The choice of sorting algorithm depends on factors like data size, memory usage, and the specific requirements of the task.


Sorting

 Sorting refers to arranging data in a particular format. A sorting algorithm specifies the way to arrange data in a particular order. The most common orders are numerical or lexicographical order.

 The importance of sorting lies in the fact that data searching can be optimized to a very high level if data is stored in a sorted manner. Sorting is also used to represent data in more readable formats. Following are some examples of sorting in real-life scenarios −

 Telephone Directory − The telephone directory stores the telephone numbers of people sorted by their names, so that the names can be searched easily.

 Dictionary − The dictionary stores words in alphabetical order so that searching for any word becomes easy.
Sorting is the process of arranging the elements of an array so that they can be placed either in ascending or descending order. For example, consider an array A = {A1, A2, A3, A4, ..., An}; the array is said to be in ascending order if the elements of A are arranged like A1 ≤ A2 ≤ A3 ≤ A4 ≤ A5 ≤ ... ≤ An.
Consider an array:
int A[10] = { 5, 4, 10, 2, 30, 45, 34, 14, 18, 9 };
The array sorted in ascending order will be given as:
A[10] = { 2, 4, 5, 9, 10, 14, 18, 30, 34, 45 }
Sorting Category
 In-place Sorting and Not-in-place Sorting
 Some sorting algorithms require only a small, constant amount of extra space for comparison and temporary storage of a few data elements. These algorithms do not require any significant extra space, and sorting is said to happen in-place, i.e., within the array itself. This is called in-place sorting.
 Bubble sort is an example of in-place sorting.

 In some sorting algorithms, the program requires space which is more than or equal to the elements being sorted. Sorting which uses equal or more space is called not-in-place sorting.
 Merge sort is an example of not-in-place sorting.
There are two different categories in sorting:
Internal sorting: If the input data is such that it can be accommodated in the main memory all at once, it is called internal sorting.
External sorting: If the input data is such that it cannot be accommodated in the memory entirely at once, it needs to be stored on a hard disk, floppy disk, or any other storage device. This is called external sorting.
Bubble sort
Bubble sort is a simple sorting algorithm. This sorting algorithm is a comparison-based algorithm in which each pair of adjacent elements is compared and the elements are swapped if they are not in order.
This algorithm is not suitable for large data sets, as its average and worst case complexity are of O(n²), where n is the number of items.
Algorithm
begin BubbleSort(arr)
   repeat (n-1) times
      for each pair of adjacent array elements
         if arr[i] > arr[i+1]
            swap(arr[i], arr[i+1])
         end if
      end for
   end repeat
   return arr
end BubbleSort
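The pseudocode above translates into C almost line for line. Below is a minimal sketch (the function name bubbleSort is illustrative; the driver reuses the sample array from the earlier example):

#include <stdio.h>

/* Repeatedly compare adjacent elements and swap them if they are out of order. */
void bubbleSort(int arr[], int n)
{
    for (int pass = 0; pass < n - 1; pass++) {
        for (int i = 0; i < n - 1 - pass; i++) {
            if (arr[i] > arr[i + 1]) {
                int temp = arr[i];      /* swap arr[i] and arr[i+1] */
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
            }
        }
    }
}

int main(void)
{
    int a[] = { 5, 4, 10, 2, 30, 45, 34, 14, 18, 9 };
    int n = sizeof(a) / sizeof(a[0]);
    bubbleSort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);            /* prints 2 4 5 9 10 14 18 30 34 45 */
    return 0;
}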
Let the elements of the array be as shown in the slide figures.
First Pass: adjacent elements are compared and swapped if out of order (for example, 32 and 35 are compared).
Second Pass: the same process is followed for the second iteration.
Third Pass: the same process is followed for the third iteration.
Bubble Sort compares the adjacent elements.

Cycle       Number of Comparisons
1st         (n-1)
2nd         (n-2)
3rd         (n-3)
.......     ......
last        1

Hence the total number of comparisons is
(n-1) + (n-2) + (n-3) + ..... + 1 = n(n-1)/2
Bubble sort complexity
 Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of bubble sort is O(n).
 Average Case Complexity - It occurs when the array elements are in jumbled order, that is, not properly ascending and not properly descending. The average case time complexity of bubble sort is O(n²).
 Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of bubble sort is O(n²).

Case            Time Complexity
Best Case       O(n)
Average Case    O(n²)
Worst Case      O(n²)
Space Complexity

Space Complexity    O(1)
Stable              YES

The space complexity of bubble sort is O(1), because only one extra variable is required for swapping.
Applications of Bubble sort

Bubble sort is a sorting algorithm that is used to sort elements into ascending order.
It uses very little extra storage space.
Bubble sort can be beneficial for sorting a small set of unsorted elements into a specific order.
It can be used to sort students on the basis of their height in a line.
It can be used to create a stack by piling up elements on the basis of their weight.
Selection Sorting
 In selection sort, the smallest value among the unsorted elements of the array is selected in every pass and inserted into its appropriate position in the array.
 Selection sort is a simple sorting algorithm. This sorting algorithm is an in-place comparison-based algorithm in which the list is divided into two parts, the sorted part at the left end and the unsorted part at the right end.
 Initially, the sorted part is empty and the unsorted part is the entire list.

 The smallest element is selected from the unsorted array and swapped with the leftmost element, and that element becomes a part of the sorted array. This process continues, moving the unsorted array boundary one element to the right.

 This algorithm is not suitable for large data sets, as its average and worst case complexities are of O(n²), where n is the number of items.
Selection sort is generally used when -
A small array is to be sorted
Swapping cost doesn't matter
It is compulsory to check all elements
 SELECTION SORT(arr, n)

Step 1: Repeat Steps 2 and 3 for i = 0 to n-1
Step 2: CALL SMALLEST(arr, i, n, pos)
Step 3: SWAP arr[i] with arr[pos]
        [END OF LOOP]
Step 4: EXIT

 SMALLEST(arr, i, n, pos)

Step 1: [INITIALIZE] SET SMALL = arr[i]
Step 2: [INITIALIZE] SET pos = i
Step 3: Repeat for j = i+1 to n-1
            if (SMALL > arr[j])
                SET SMALL = arr[j]
                SET pos = j
            [END OF if]
        [END OF LOOP]
Step 4: RETURN pos
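A C translation of this pseudocode might look like the following minimal sketch (the function names selectionSort and smallest mirror the pseudocode; the driver array is illustrative):

#include <stdio.h>

/* Return the index of the smallest element in arr[i..n-1] (SMALLEST in the pseudocode). */
int smallest(int arr[], int i, int n)
{
    int small = arr[i];
    int pos = i;
    for (int j = i + 1; j < n; j++) {
        if (small > arr[j]) {
            small = arr[j];
            pos = j;
        }
    }
    return pos;
}

void selectionSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int pos = smallest(arr, i, n);   /* pick the minimum of the unsorted part */
        int temp = arr[i];               /* swap it with the leftmost unsorted element */
        arr[i] = arr[pos];
        arr[pos] = temp;
    }
}

int main(void)
{
    int a[] = { 64, 25, 12, 22, 11 };
    int n = sizeof(a) / sizeof(a[0]);
    selectionSort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);             /* prints 11 12 22 25 64 */
    return 0;
}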
After two iterations, the two least values are positioned at the beginning in sorted order.
The same process is applied to the rest of the items in the array (a pictorial depiction of the entire sorting process was shown on the slides).
 Complexity = O(n²)
 We can analyze the complexity by simply observing the number of loops. There are 2 nested loops, so the complexity is n*n = n².
 Time Complexities:
 Worst Case Complexity: O(n²)
If we want to sort in ascending order and the array is in descending order, then the worst case occurs.
 Best Case Complexity: O(n²)
It occurs when the array is already sorted.
 Average Case Complexity: O(n²)
It occurs when the elements of the array are in jumbled order (neither ascending nor descending).
Case                Time Complexity
Best Case           O(n²)
Average Case        O(n²)
Worst Case          O(n²)
Space Complexity    O(1)
Stable              NO

The space complexity is O(1) because only an extra variable temp is used for swapping. (Note that the standard swap-based selection sort is not a stable sort.)
Selection Sort Applications

The selection sort is used when:
a small list is to be sorted
the cost of swapping does not matter
checking of all the elements is compulsory
the cost of writing to memory matters, as in flash memory (the number of writes/swaps is O(n) compared to O(n²) for bubble sort)
Insertion Sort
 This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is always sorted. For example, the lower part of an array is maintained to be sorted.
 An element which is to be 'insert'ed into this sorted sub-list has to find its appropriate place and then it has to be inserted there. Hence the name, insertion sort.

 The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets, as its average and worst case complexity are of O(n²), where n is the number of items.
Algorithm
Step 1 − If it is the first element, it is already sorted. return 1;
Step 2 − Pick the next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that are greater than the value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until the list is sorted
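These steps can be expressed in C as the following minimal sketch (the function name insertionSort and the sample array are illustrative; key is the usual name for the value being inserted):

#include <stdio.h>

/* Insert each element into its correct position within the sorted left part of the array. */
void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = arr[i];              /* element to be inserted */
        int j = i - 1;
        /* shift elements of the sorted sub-list that are greater than key */
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;              /* insert the value */
    }
}

int main(void)
{
    int a[] = { 12, 11, 13, 5, 6 };
    int n = sizeof(a) / sizeof(a[0]);
    insertionSort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);           /* prints 5 6 11 12 13 */
    return 0;
}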
 Time Complexity
 Worst Case Complexity: O(n²)
Suppose an array is in ascending order and you want to sort it in descending order. In this case, the worst case complexity occurs.
Each element has to be compared with each of the other elements, so for every nth element, (n-1) comparisons are made.
Thus, the total number of comparisons = n*(n-1) ≈ n²
 Best Case Complexity: O(n)
When the array is already sorted, the outer loop runs n times whereas the inner loop does not run at all. So, there are only n comparisons. Thus, the complexity is linear.
 Average Case Complexity: O(n²)
It occurs when the elements of an array are in jumbled order (neither ascending nor descending).
Space Complexity
Space complexity is O(1) because only one extra variable key is used.
Insertion Sort Applications
The insertion sort is used when:
 the array has a small number of elements
 there are only a few elements left to be sorted
Quick Sort
 Quick sort is a fast sorting algorithm used to sort a list of elements. The quick sort algorithm was invented by C. A. R. Hoare.
 The quick sort algorithm separates the list of elements into two parts and then sorts each part recursively. That means it uses the divide and conquer strategy.
 In quick sort, the partition of the list is performed based on an element called the pivot. Here the pivot element is one of the elements in the list.

 The list is divided into two partitions such that "all elements to the left of the pivot are smaller than the pivot and all elements to the right of the pivot are greater than or equal to the pivot".
Quick sort is also known as partition-exchange sort, based on the rule of Divide and Conquer.
It is a highly efficient sorting algorithm.
Quick sort is among the fastest comparison-based sorting algorithms in practice.
It is very fast and requires little additional space: on average only O(log n) extra space is needed for the recursion stack.
Quick sort picks an element as pivot and partitions the array around the picked pivot.
 Step by Step Process
 In Quick sort algorithm, partitioning of the list is
performed using following steps...
• Step 1 - Consider the first element of the list
as pivot (i.e., Element at first position in the list).
• Step 2 - Define two variables i and j. Set i and j to first
and last elements of the list respectively.
• Step 3 - Increment i until list[i] > pivot then stop.
• Step 4 - Decrement j until list[j] < pivot then stop.
• Step 5 - If i < j then exchange list[i] and list[j].
• Step 6 - Repeat steps 3,4 & 5 until i > j.
• Step 7 - Exchange the pivot element with list[j]
element.
Ways of choosing the pivot:
1. First element as pivot
2. Last element as pivot
3. Random element as pivot
4. Median as pivot
Algorithm for Quick Sort

Step 1: Choose the highest index value as pivot.
Step 2: Take two variables to point left and right of the list excluding pivot.
Step 3: Left points to the low index.
Step 4: Right points to the high index.
Step 5: While value at left < (less than) pivot, move right.
Step 6: While value at right > (greater than) pivot, move left.
Step 7: If neither Step 5 nor Step 6 matches, swap left and right.
Step 8: If left >= (greater than or equal to) right, the point where they met is the new pivot.
Implementation of Quick Sort Algorithm using C Programming Language

#include<stdio.h>
#include<conio.h>

void quickSort(int [10],int,int);

void main(){
   int list[20],size,i;

   printf("Enter size of the list: ");
   scanf("%d",&size);
   printf("Enter %d integer values: ",size);
   for(i = 0; i < size; i++)
      scanf("%d",&list[i]);
   quickSort(list,0,size-1);
   printf("List after sorting is: ");
   for(i = 0; i < size; i++)
      printf(" %d",list[i]);
   getch();
}

void quickSort(int list[10],int first,int last){
   int pivot,i,j,temp;
   if(first < last){
      pivot = first;
      i = first;
      j = last;
      while(i < j){
         while(list[i] <= list[pivot] && i < last)
            i++;
         while(list[j] > list[pivot])
            j--;
         if(i < j){
            temp = list[i];
            list[i] = list[j];
            list[j] = temp;
         }
      }
      temp = list[pivot];
      list[pivot] = list[j];
      list[j] = temp;
      quickSort(list,first,j-1);
      quickSort(list,j+1,last);
   }
}
Complexity of the Quick Sort Algorithm

In the worst case, sorting a list with 'n' elements requires ((n-1)+(n-2)+(n-3)+......+1) = n(n-1)/2 comparisons. With the first element chosen as pivot, this worst case occurs when the list is already sorted (or sorted in reverse), because every partition is completely unbalanced.

Worst Case : O(n²)
Best Case : O(n log n)
Average Case : O(n log n)
The diagram on the slides represents how to find the pivot value in an array. As we can see, the pivot value divides the list into two parts (partitions), and then each part is processed recursively by quick sort. Quick sort is a recursive function; we call the partition function again on each part.
Application of Quick sort
Quicksort is used wherever a stable sort is not needed.
Variants of quicksort are used to separate the k smallest or largest elements.
Quicksort's divide-and-conquer structure enables the use of parallelization.
Merge Sort in Data Structure
 Merge sort is a sorting algorithm based on the Divide and Conquer strategy. It works by recursively dividing the array into two equal halves, sorting them, and then combining them. It takes O(n log n) time in the worst case.
 The merge sort algorithm is an implementation of the divide and conquer technique. Thus, it gets completed in three steps:
 1. Divide: In this step, the array/list divides itself recursively into sub-arrays until the base case is reached.
 2. Recursively solve: Here, the sub-arrays are sorted using recursion.
 3. Combine: This step makes use of the merge() function to combine the sub-arrays into the final sorted array.
 Merge sort is similar to the quick sort algorithm in that it uses the divide and conquer approach to sort the elements.
 It is one of the most popular and efficient sorting algorithms.
 It divides the given list into two equal halves, calls itself for the two halves and then merges the two sorted halves. We have to define the merge() function to perform the merging.
 The sub-lists are divided again and again into halves until the list cannot be divided further. Then we combine the pairs of one-element lists into two-element lists, sorting them in the process.
 The sorted two-element pairs are merged into four-element lists, and so on until we get the sorted list.
Merge sort keeps on dividing the list into equal halves until it can no more be divided. By definition, if there is only one element in the list, it is sorted. Then, merge sort combines the smaller sorted lists, keeping the new list sorted too.

Step 1 − if there is only one element in the list, it is already sorted; return.
Step 2 − divide the list recursively into two halves until it can no more be divided.
Step 3 − merge the smaller lists into a new list in sorted order.
Algorithm for Merge Sort

Step 1: Find the middle index of the array.
        Middle = first + (last - first)/2
Step 2: Divide the array from the middle.
Step 3: Call merge sort for the first half of the array:
        MergeSort(array, first, middle)
Step 4: Call merge sort for the second half of the array:
        MergeSort(array, middle+1, last)
Step 5: Merge the two sorted halves into a single sorted array.
void merge(int a[], int beg, int mid, int end);   /* defined separately */

void mergeSort(int a[], int beg, int end)
{
    if (beg < end)
    {
        int mid = (beg + end) / 2;
        mergeSort(a, beg, mid);
        mergeSort(a, mid + 1, end);
        merge(a, beg, mid, end);
    }
}

/* Function to print the array */
void printArray(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

int main()
{
    int a[] = { 12, 31, 25, 8, 32, 17, 40, 42 };
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are -\n");
    printArray(a, n);
    mergeSort(a, 0, n - 1);
    printf("After sorting array elements are - \n");
    printArray(a, n);
    return 0;
}
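The code above calls merge(), but its definition does not appear in this excerpt of the slides. A minimal sketch of a compatible merge() is given below; it would be placed above mergeSort() (or after the prototype) in the same file, and the temporary array names LeftArray/RightArray are illustrative:

/* Merge two sorted sub-arrays a[beg..mid] and a[mid+1..end] into a[beg..end]. */
void merge(int a[], int beg, int mid, int end)
{
    int n1 = mid - beg + 1;        /* size of the left half  */
    int n2 = end - mid;            /* size of the right half */
    int LeftArray[n1], RightArray[n2];

    for (int i = 0; i < n1; i++)
        LeftArray[i] = a[beg + i];
    for (int j = 0; j < n2; j++)
        RightArray[j] = a[mid + 1 + j];

    int i = 0, j = 0, k = beg;
    /* repeatedly copy the smaller of the two front elements back into a[] */
    while (i < n1 && j < n2) {
        if (LeftArray[i] <= RightArray[j])
            a[k++] = LeftArray[i++];
        else
            a[k++] = RightArray[j++];
    }
    /* copy any remaining elements */
    while (i < n1)
        a[k++] = LeftArray[i++];
    while (j < n2)
        a[k++] = RightArray[j++];
}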
Output

Before sorting array elements are -
12 31 25 8 32 17 40 42
After sorting array elements are -
8 12 17 25 31 32 40 42

The figure on the slides shows how Merge Sort works step by step.
 Time Complexity
Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of merge sort is O(n*logn).
Average Case Complexity - It occurs when the array elements are in jumbled order, that is, not properly ascending and not properly descending. The average case time complexity of merge sort is O(n*logn).
Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of merge sort is O(n*logn).
Space Complexity
The space complexity of merge sort is O(n), because merging requires an auxiliary array of up to n elements to hold the combined sub-arrays.
Applications of Merge sort
Merge sort is useful for sorting linked lists in O(n log n) time.
Merge sort can be implemented without extra space for linked lists.
Merge sort is used for counting inversions in a list.
Merge sort is used in external sorting.
Heap Sort Algorithm
Heap sort processes the elements by creating a min-heap or max-heap using the elements of the given array. A min-heap or max-heap represents an ordering of the array in which the root element represents the minimum or maximum element of the array.
Heap sort basically performs two main operations repeatedly:
•Build a heap H using the elements of the array.
•Repeatedly delete the root element of the heap formed in the 1st phase.
Before knowing more about heap sort, let's first see a brief description of a heap.

What is a heap?
A heap is a complete binary tree. A binary tree is a tree in which each node can have at most two children. A complete binary tree is a binary tree in which all the levels except the last level, i.e., the leaf level, are completely filled, and all the nodes are left-justified.
What is heap sort?
Heapsort is a popular and efficient sorting algorithm. The concept of heap sort is to eliminate the elements one by one from the heap part of the list, and then insert them into the sorted part of the list.
Heapsort is an in-place sorting algorithm.

 Heap sort is one of the sorting algorithms used to arrange a list of elements in order. The heapsort algorithm uses one of the tree concepts called a Heap Tree. In this sorting algorithm, we use a Max Heap to arrange a list of elements in descending order and a Min Heap to arrange the list elements in ascending order.
Step by Step Process
The Heap sort algorithm to arrange a list of elements in ascending order
is performed using following steps...
•Step 1 - Construct a Binary Tree with given list of Elements.
•Step 2 - Transform the Binary Tree into Min Heap.
•Step 3 - Delete the root element from Min Heap
using Heapify method.
•Step 4 - Put the deleted element into the Sorted list.
•Step 5 - Repeat the same until Min Heap becomes empty.
•Step 6 - Display the sorted list.
Implementation of Heapsort

#include <stdio.h>

/* function to heapify a subtree. Here 'i' is the
   index of the root node in array a[], and 'n' is the size of the heap. */
void heapify(int a[], int n, int i)
{
    int largest = i;        // Initialize largest as root
    int left = 2 * i + 1;   // left child
    int right = 2 * i + 2;  // right child
    // If left child is larger than root
    if (left < n && a[left] > a[largest])
        largest = left;
    // If right child is larger than root
    if (right < n && a[right] > a[largest])
        largest = right;
    // If root is not largest
    if (largest != i) {
        // swap a[i] with a[largest]
        int temp = a[i];
        a[i] = a[largest];
        a[largest] = temp;

        heapify(a, n, largest);
    }
}

/* Function to implement the heap sort */
void heapSort(int a[], int n)
{
    // Build the max heap
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(a, n, i);
    // One by one extract an element from the heap
    for (int i = n - 1; i >= 0; i--) {
        /* Move current root element to end: swap a[0] with a[i] */
        int temp = a[0];
        a[0] = a[i];
        a[i] = temp;

        heapify(a, i, 0);
    }
}

/* function to print the array elements */
void printArr(int arr[], int n)
{
    for (int i = 0; i < n; ++i) {
        printf("%d", arr[i]);
        printf(" ");
    }
}

int main()
{
    int a[] = {48, 10, 23, 43, 28, 26, 1};
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are - \n");
    printArr(a, n);
    heapSort(a, n);
    printf("\nAfter sorting array elements are - \n");
    printArr(a, n);
    return 0;
}
Output

Before sorting array elements are -
48 10 23 43 28 26 1
After sorting array elements are -
1 10 23 26 28 43 48
Heap sort complexity
Here we look at the time complexity of heap sort in the best case, average case, and worst case. We will also see the space complexity of heapsort.
1. Time Complexity
•Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of heap sort is O(n log n).
•Average Case Complexity - It occurs when the array elements are in jumbled order, that is, not properly ascending and not properly descending. The average case time complexity of heap sort is O(n log n).
•Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of heap sort is O(n log n).
The time complexity of heap sort is O(n log n) in all three cases (best case, average case, and worst case). The height of a complete binary tree having n elements is log n.
2. Space Complexity
•The space complexity of heap sort is O(1).
Hashing
A hash table is one of the most important data structures. It uses a special function known as a hash function that maps a given value with a key to access the elements faster.
A hash table is a data structure that stores some information, and the information has basically two main components, i.e., key and value.
The hash table can be implemented with the help of an associative array. The efficiency of mapping depends upon the efficiency of the hash function used for mapping.
Hash function:
A hash function is a mathematical formula that produces an integer which can be used as an index for the key in the hash table.
Perfect Hash Function - Each key is transformed into a unique storage location.
Imperfect Hash Function - Maps more than one key to the same storage location.
 For example, suppose the key is "John" and the value is the phone number; we pass the key to the hash function as shown below:
 Hash(key) = index;
 When we pass the key to the hash function, it gives the index.
 Hash(john) = 3;
 The above example stores John at index 3.
 Drawback of Hash function
 A hash function should assign each key a unique location. Sometimes a hash table uses an imperfect hash function that causes a collision, because the hash function generates the same index for two different keys.
Operations on a hash table:
Insert - T[ h(key) ] = value;
It calculates the hash, uses it as the index and stores the value in the hash table.
Delete - T[ h(key) ] = NULL;
It calculates the hash and resets the value stored in the hash table for that key.
Search - return T[ h(key) ];
It calculates the hash, then finds and returns the value stored in the hash table for that key.
How to choose a Hash Function

An efficient hash function should be built such that the index value of the added item is distributed equally across the table.
An effective collision resolution technique should be created to generate an alternate index for a key whose hash index corresponds to a previously inserted position in the hash table.
We must select a hash algorithm that is fast to calculate.
Characteristics of a good Hash Function
Uniform Distribution: For distribution throughout the constructed table.
Fast: The generation of the hash should be very fast, and should not produce any considerable overhead.
Methods of hash function:
 Division Method
 Multiplication Method
 Mid Square Method
 Folding Method
Division method:
In this method we use modular arithmetic to divide the key value by some integer divisor m. It gives us the location value where the element can be placed.
L = (k mod m)
L -> Location in table
k -> key value
m -> table size
Eg: k = 52, m = 10, then
L = (52 mod 10) = 2
The key whose value is 52 is placed in the 2nd location.
Mid square method
In this method we square the value of the key and take the number of digits required to form an address from the middle of the squared value.
(eg) key = 16
The square is 256. Then the address is 56 (two digits starting from the middle of 256).
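As a rough illustration, the division and mid-square methods could be coded in C as below. This is only a sketch: the function names are illustrative, and the mid-square digit extraction simply keeps two middle digits so that it reproduces the 16 → 256 → 56 example above.

#include <stdio.h>

/* Division method: L = k mod m */
int hash_division(int k, int m)
{
    return k % m;
}

/* Mid-square method (sketch): square the key and pick two middle digits. */
int hash_midsquare(int k)
{
    long sq = (long)k * k;            /* e.g. 16*16 = 256 */
    int digits = 0;
    for (long t = sq; t > 0; t /= 10) /* count digits of the square */
        digits++;
    int drop = (digits - 2) / 2;      /* trailing digits to discard */
    long t = sq;
    for (int i = 0; i < drop; i++)
        t /= 10;
    return (int)(t % 100);            /* keep two middle digits: 256 -> 56 */
}

int main(void)
{
    printf("division:   h(52) = %d\n", hash_division(52, 10));  /* 2  */
    printf("mid-square: h(16) = %d\n", hash_midsquare(16));     /* 56 */
    return 0;
}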
 Collision
 When two different keys hash to the same index, the problem that occurs between the two values is known as a collision. For example, suppose a value is already stored at index 6. If another key value is 26, then its index would be:
 h(26) = 26 % 10 = 6
 Therefore, two values are stored at the same index, i.e., 6, and this leads to the collision problem. To resolve these collisions, we have some techniques known as collision resolution techniques.
 The following are the collision resolution techniques:
 Open Hashing: It is also known as closed addressing.
 Closed Hashing: It is also known as open addressing.
Open Hashing (Separate Chaining)
It is the most commonly used collision resolution technique, implemented using a Linked List. When any two or more elements collide at the same location, these elements are chained into a singly-linked list called a chain.
We chain all the elements that hash to the same slot into a linked list.
Let's consider an example of a simple hash function:
h(key) = key % table size
In a hash table with size 7:
h(27) = 27 % 7 = 6
h(130) = 130 % 7 = 4
If we insert a new element (18, "Saleema"), that would also go to the 4th index:
h(18) = 18 % 7 = 4
For separate chaining, the worst-case scenario is when all of the keys get the same hash value and are inserted in the same linked list. We can avoid this by using a good hash function.
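A minimal sketch of separate chaining in C, assuming a table of 7 singly-linked lists and the h(key) = key % 7 function from the example (the struct and function names are illustrative):

#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 7

/* One node in a chain: a key and a pointer to the next colliding key. */
struct Node {
    int key;
    struct Node *next;
};

struct Node *table[TABLE_SIZE];   /* each slot is the head of a linked list */

int hash(int key)
{
    return key % TABLE_SIZE;
}

/* Insert a key at the front of the chain for its slot. */
void insert(int key)
{
    int index = hash(key);
    struct Node *node = malloc(sizeof(struct Node));
    node->key = key;
    node->next = table[index];
    table[index] = node;
}

/* Search the chain of the slot the key hashes to. */
int search(int key)
{
    for (struct Node *p = table[hash(key)]; p != NULL; p = p->next)
        if (p->key == key)
            return 1;
    return 0;
}

int main(void)
{
    insert(27);    /* goes to slot 6 */
    insert(130);   /* goes to slot 4 */
    insert(18);    /* also hashes to 4, so it is chained with 130 */
    printf("18 found? %d\n", search(18));
    return 0;
}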
Closed Hashing
In closed hashing, three techniques are used to resolve collisions:
Linear probing
Quadratic probing
Double hashing technique
Linear Probing
Linear probing is one of the forms of open addressing.
Each cell in the hash table contains a key-value pair, so when a collision occurs by mapping a new key to a cell already occupied by another key, the linear probing technique searches for the closest free location and adds the new key to that empty cell.
In this case, searching is performed sequentially, starting from the position where the collision occurs, until an empty cell is found.
Let hash(x) be the slot index computed using the hash function and S the table size.
 If slot hash(x) % S is full, then we try (hash(x) + 1) % S
 If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S
 If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and
h(k) = 2k+3
The key values 3, 2, 9, 6 are stored at the indexes 9,
7, 1, 5 respectively. The calculated index value of 11 is
5 which is already occupied by another key value, i.e.,
6. When linear probing is applied, the nearest empty
cell to the index 5 is 6; therefore, the value 11 will be
added at the index 6.
The next key value is 13. The index value associated
with this key value is 9 when hash function is applied.
The cell is already filled at index 9. When linear
probing is applied, the nearest empty cell to the index
9 is 0; therefore, the value 13 will be added at the
index 0.
The next key value is 7. The index value
associated with the key value is 7 when
hash function is applied. The cell is already
filled at index 7. When linear probing is
applied, the nearest empty cell to the index
7 is 8; therefore, the value 7 will be added
at the index 8.
The next key value is 12. The index value associated with the key value is 7 when the hash function is applied. The cell is already filled at index 7. When linear probing is applied, the nearest empty cell to the index 7 is 2; therefore, the value 12 will be added at the index 2.
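The probing loop for this example can be sketched in C as follows, with table size m = 10 and the slide's hash function h(k) = 2k + 3 (the names insert_linear and the EMPTY marker are illustrative):

#include <stdio.h>

#define M 10
#define EMPTY -1

int table[M];

int hash(int k)
{
    return (2 * k + 3) % M;   /* h(k) = 2k + 3, reduced modulo the table size */
}

/* Insert with linear probing: try h(k), h(k)+1, h(k)+2, ... modulo M. */
int insert_linear(int k)
{
    for (int i = 0; i < M; i++) {
        int idx = (hash(k) + i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = k;
            return idx;
        }
    }
    return -1;   /* table is full */
}

int main(void)
{
    int keys[] = {3, 2, 9, 6, 11, 13, 7, 12};
    for (int i = 0; i < M; i++)
        table[i] = EMPTY;
    for (int i = 0; i < 8; i++)
        printf("%d -> index %d\n", keys[i], insert_linear(keys[i]));
    /* reproduces the example: 3->9, 2->7, 9->1, 6->5, 11->6, 13->0, 7->8, 12->2 */
    return 0;
}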
Challenges in Linear Probing :
Primary Clustering: One of the problems
with linear probing is Primary clustering,
many consecutive elements form groups
and it starts taking time to find a free slot
or to search for an element.
Secondary Clustering: Secondary
clustering is less severe, two records only
have the same collision chain (Probe
Sequence) if their initial position is the
same.
Quadratic probing

In the case of linear probing, searching is performed linearly. In contrast, quadratic probing is an open addressing technique that uses a quadratic polynomial for searching until an empty slot is found.
It can also be defined as follows: it allows the insertion of ki at the first free location from (u + i²) % m, where i = 0 to m-1, u is the index computed by the hash function, and m is the table size.
Let hash(x) be the slot index computed using the hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*1) % S
If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S
If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S
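Changing only the probe step of the earlier linear-probing sketch gives quadratic probing; a minimal version, assuming the same table[], hash(), M and EMPTY definitions as in that sketch:

/* Insert with quadratic probing: try (h(k) + i*i) % M for i = 0, 1, 2, ... */
int insert_quadratic(int k)
{
    for (int i = 0; i < M; i++) {
        int idx = (hash(k) + i * i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = k;
            return idx;
        }
    }
    return -1;   /* no free slot found within M probes */
}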
 Consider the same example which we discussed in the linear probing.
 A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3
 The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5, respectively. We do not need to apply the quadratic probing technique on these key values as there is no occurrence of a collision.
 The index value of 11 is 5, but this location is already occupied by the 6. So, we apply the quadratic probing technique.
 When i = 0
 Index = (5 + 0²) % 10 = 5
 When i = 1
 Index = (5 + 1²) % 10 = 6
 Since location 6 is empty, the value 11 will be added at index 6.
The next element is 13. When the hash function is applied on 13, the index value comes out to be 9, as we saw in the linear probing example. At index 9, the cell is occupied by another value, i.e., 3. So, we will apply the quadratic probing technique to calculate the free location.
When i = 0
Index = (9 + 0²) % 10 = 9
When i = 1
Index = (9 + 1²) % 10 = 0
Since location 0 is empty, the value 13 will be added at index 0.
The next element is 7. When the hash function is applied on 7, the index value comes out to be 7. At index 7, the cell is occupied by another value, i.e., 2. So, we will apply the quadratic probing technique to calculate the free location.
When i = 0
Index = (7 + 0²) % 10 = 7
When i = 1
Index = (7 + 1²) % 10 = 8
Since location 8 is empty, the value 7 will be added at index 8.
 The next element is 12. When the hash function is applied on 12, the index value comes out to be 7. When we observe the hash table, we will get to know that the cell at index 7 is already occupied by the value 2. So, we apply the quadratic probing technique on 12 to determine the free location.
 When i = 0
 Index = (7 + 0²) % 10 = 7
 When i = 1
 Index = (7 + 1²) % 10 = 8
 When i = 2
 Index = (7 + 2²) % 10 = 1
 When i = 3
 Index = (7 + 3²) % 10 = 6
 When i = 4
 Index = (7 + 4²) % 10 = 3
 Since location 3 is empty, the value 12 will be stored at index 3.
Therefore, the order of the elements is 13, 9, _, 12, _, 6, 11, 2, 7, 3.
Double Hashing
Double hashing is an open addressing technique which is used to avoid collisions. When a collision occurs, this technique uses a secondary hash of the key. It uses one hash value as an index and moves forward until an empty location is found.
In double hashing, two hash functions are used. Suppose h1(k) is one of the hash functions used to calculate the locations, whereas h2(k) is another hash function.
It can be defined as "insert ki at the first free place from (u + v*i) % m where i = 0 to m-1". In this case, u is the location computed using the first hash function and v is equal to h2(k) % m.
Let hash(x) be the slot index computed using the hash function.
 If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S
 If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S
 If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
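Again reusing the table[], hash(), M and EMPTY definitions from the linear-probing sketch, the probe step becomes a multiple of a second hash value. The secondary hash hash2(k) = 7 - (k % 7) is only an illustrative choice (the slides do not fix one); it just has to be fast and never evaluate to zero:

/* Illustrative secondary hash; it must never evaluate to 0. */
int hash2(int k)
{
    return 7 - (k % 7);
}

/* Insert with double hashing: try (h(k) + i*hash2(k)) % M for i = 0, 1, 2, ... */
int insert_double(int k)
{
    for (int i = 0; i < M; i++) {
        int idx = (hash(k) + i * hash2(k)) % M;
        if (table[idx] == EMPTY) {
            table[idx] = k;
            return idx;
        }
    }
    return -1;   /* give up after M probes */
}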
Linear probing has the best cache performance but suffers from clustering. One more advantage of linear probing is that it is easy to compute.
Quadratic probing lies between the two in terms of cache performance and clustering.
Double hashing has poor cache performance but no clustering. Double hashing requires more computation time as two hash functions need to be computed.
Separate Chaining vs Open Addressing

1. Chaining is simpler to implement, whereas open addressing requires more computation.
2. In chaining, the hash table never fills up; we can always add more elements to a chain. In open addressing, the table may become full.
3. Chaining is less sensitive to the hash function or load factor. Open addressing requires extra care to avoid clustering and a high load factor.
4. Chaining is mostly used when it is unknown how many and how frequently keys may be inserted or deleted. Open addressing is used when the frequency and number of keys is known.
5. Cache performance of chaining is not good, as keys are stored in a linked list. Open addressing provides better cache performance, as everything is stored in the same table.
6. In chaining there is wastage of space (some parts of the hash table are never used). In open addressing, a slot can be used even if an input doesn't map to it.
Complexity and Load Factor

 For the first step (computing the hash), the time taken depends on the key K and the hash function.
For example, if the key is a string "abcd", then its hash function may depend on the length of the string. But for very large values of n, the number of entries in the map, the length of the keys is almost negligible in comparison to n, so hash computation can be considered to take place in constant time, i.e., O(1).
 For the second step, the list of K-V pairs present at that index needs to be traversed. In the worst case, all the n entries may be at the same index, so the time complexity would be O(n).
 But enough research has been done to make hash functions distribute the keys uniformly in the array, so this almost never happens.
On average, if there are n entries and b is the size of the array, there would be n/b entries at each index. This value n/b is called the load factor and represents the load on our map.
This load factor needs to be kept low, so that the number of entries at one index stays small and the complexity remains almost constant, i.e., O(1).
Rehashing

Rehashing means hashing again. Basically, when the load factor increases beyond its pre-defined value (the default value of the load factor is 0.75), the complexity increases.
To overcome this, the size of the array is increased (doubled) and all the values are hashed again and stored in the new, double-sized array to maintain a low load factor and low complexity.
Rehashing can be done as follows:

For each addition of a new entry to the map, check the load factor.
If it's greater than its pre-defined value (or the default value of 0.75 if not given), then rehash.
For rehashing, make a new array of double the previous size and make it the new bucketArray.
Then traverse each element in the old bucketArray and call insert() for each, so as to insert it into the new, larger bucket array.
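A rough C sketch of the load-factor check and rehash step described above, for a chained table. The 0.75 threshold matches the default mentioned above; the struct and function names are illustrative, not taken from the slides.

#include <stdio.h>
#include <stdlib.h>

/* A chained hash table whose bucket array is doubled when the load factor exceeds 0.75. */
struct Node {
    int key;
    struct Node *next;
};

struct HashMap {
    struct Node **buckets;   /* array of chain heads        */
    int capacity;            /* current number of buckets   */
    int size;                /* number of stored keys       */
};

static void insert(struct HashMap *map, int key);   /* forward declaration */

/* Rehash: allocate a bucket array of double the size and re-insert every old entry. */
static void rehash(struct HashMap *map)
{
    struct Node **old = map->buckets;
    int old_capacity = map->capacity;

    map->capacity *= 2;
    map->buckets = calloc(map->capacity, sizeof(struct Node *));
    map->size = 0;

    for (int i = 0; i < old_capacity; i++) {
        for (struct Node *p = old[i]; p != NULL; ) {
            struct Node *next = p->next;
            insert(map, p->key);   /* re-insert into the larger array */
            free(p);
            p = next;
        }
    }
    free(old);
}

static void insert(struct HashMap *map, int key)
{
    int index = key % map->capacity;
    struct Node *node = malloc(sizeof(struct Node));
    node->key = key;
    node->next = map->buckets[index];
    map->buckets[index] = node;
    map->size++;

    /* check the load factor n/b after every addition and rehash if it exceeds 0.75 */
    if ((double)map->size / map->capacity > 0.75)
        rehash(map);
}

int main(void)
{
    struct HashMap map = { calloc(4, sizeof(struct Node *)), 4, 0 };
    for (int k = 1; k <= 10; k++)
        insert(&map, k);
    printf("capacity after insertions: %d\n", map.capacity);   /* grew beyond 4 */
    return 0;
}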
