DS Material Module-3
Searching Techniques
1. Linear Search:
Definition: Linear Search is a simple search algorithm that sequentially checks each
element in a list or array until a match is found or the entire list has been traversed.
Explanation: Linear Search is like searching for a specific item in an unordered list by
checking each element one by one from the beginning until you either find the target
element or reach the end of the list. It's a straightforward but not very efficient way to
search for an item in a collection.
Time Complexity: O(n) - Linear Search checks each element in the array exactly once in
the worst case, so its time complexity is linear with respect to the size of the array (n).
2. Binary Search:
Definition: Binary Search is a fast search algorithm that works on sorted lists or arrays. It
repeatedly divides the search interval in half until the target element is found or the
search interval is empty.
Explanation: Binary Search is an efficient algorithm for finding a specific item in a
sorted list or array. It starts by comparing the target value with the middle element and
narrows down the search range by half in each iteration.
Algorithm for Binary Search:
#include <stdio.h>
// Iteratively search the sorted array arr[low..high] for target
int binarySearch(int arr[], int low, int high, int target) {
    while (low <= high) {
        int mid = low + (high - low) / 2; // Avoids overflow of (low + high)
        if (arr[mid] == target)
            return mid;
        else if (arr[mid] < target)
            low = mid + 1;  // Target lies in the right half
        else
            high = mid - 1; // Target lies in the left half
    }
    return -1; // Not found
}
int main() {
    int arr[] = {10, 21, 32, 43, 54};
    int n = sizeof(arr) / sizeof(arr[0]);
    int target = 32;
    int result = binarySearch(arr, 0, n - 1, target);
    if (result != -1) {
        printf("Element found at index: %d\n", result);
    } else {
        printf("Element not found in the array.\n");
    }
    return 0;
}
Binary Search Time Complexity Analysis:
Time Complexity: O(log n) - Binary Search repeatedly divides the search interval in half,
so its time complexity is logarithmic with respect to the size of the array (n).
3. Fibonacci Search:
Definition: Fibonacci Search is an algorithm for searching a sorted array using a divide-
and-conquer approach. It uses Fibonacci numbers to determine the split points in the
array.
Explanation: Fibonacci Search divides the search interval into smaller subintervals using
Fibonacci numbers. It is similar to Binary Search but uses a more complex formula for
selecting the split point.
Algorithm for Fibonacci Search:
#include <stdio.h>
// Search the sorted array arr[0..n-1] for target using Fibonacci numbers
int fibonacciSearch(int arr[], int n, int target) {
    int fib2 = 0, fib1 = 1, fib = fib2 + fib1; // Consecutive Fibonacci numbers
    while (fib < n) { // Find the smallest Fibonacci number >= n
        fib2 = fib1;
        fib1 = fib;
        fib = fib2 + fib1;
    }
    int offset = -1; // Marks the front of the eliminated range
    while (fib > 1) {
        int i = (offset + fib2 < n - 1) ? offset + fib2 : n - 1;
        if (arr[i] < target) { // Eliminate the front; move down one Fibonacci step
            fib = fib1; fib1 = fib2; fib2 = fib - fib1;
            offset = i;
        } else if (arr[i] > target) { // Eliminate the back; move down two steps
            fib = fib2; fib1 = fib1 - fib2; fib2 = fib - fib1;
        } else {
            return i;
        }
    }
    if (fib1 && offset + 1 < n && arr[offset + 1] == target)
        return offset + 1; // One element may remain unchecked
    return -1;
}
int main() {
    int arr[] = {1, 2, 3, 4, 5};
    int n = sizeof(arr) / sizeof(arr[0]);
    int target = 3;
    int result = fibonacciSearch(arr, n, target);
    if (result != -1) {
        printf("Element found at index: %d\n", result);
    } else {
        printf("Element not found in the array.\n");
    }
    return 0;
}
Worked example (a sorted array and the Fibonacci numbers used to pick probe positions):

i        1   2   3   4   5   6   7   8   9   10   11
arr[i]  10  22  35  40  45  50  80  82  85  90  100

        F0  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10
Fib      0   1   1   2   3   5   8  13  21  34   55

offset = -1 (initially)
Time Complexity: O(log n) - Fibonacci Search also has a logarithmic time complexity
similar to Binary Search.
Each of these algorithms has its own characteristics and use cases. Linear Search is simple but
runs in linear time, while Binary Search and Fibonacci Search both run in logarithmic time on
sorted data. Binary Search is generally preferred for searching sorted arrays because of its
simplicity and efficiency. Fibonacci Search is less commonly used, though it picks probe
positions using only additions and subtractions, which can be an advantage where division is
expensive.
Sorting Techniques:
1. Selection Sort:
#include <stdio.h>
void selectionSort(int arr[], int n) {
    int i, j, minIndex, temp;
    for (i = 0; i < n - 1; i++) {
        // Find the index of the minimum element in the unsorted part of the array
        minIndex = i;
        for (j = i + 1; j < n; j++) {
            if (arr[j] < arr[minIndex]) {
                minIndex = j;
            }
        }
        // Swap the minimum element with the element at the current index
        temp = arr[i];
        arr[i] = arr[minIndex];
        arr[minIndex] = temp;
    }
}
int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr) / sizeof(arr[0]);
    selectionSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]); // Prints the sorted array
    printf("\n");
    return 0;
}
In this code:
selectionSort is a function that takes an integer array arr and its length n as parameters
and sorts the array using the Selection Sort algorithm.
The outer loop iterates through each position of the array from left to right, initially
treating the element at that position as the minimum.
The inner loop finds the index of the minimum element in the unsorted part of the array.
When the minimum element is found, it is swapped with the element at the current index
of the outer loop.
This process continues until the entire array is sorted.
The main function demonstrates the usage of the selectionSort function on an example array.
2. Insertion Sort:
#include <stdio.h>
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        // Shift elements greater than key one position to the right
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key; // Insert key into its correct position
    }
}
int main() {
    int arr[] = {34, 8, 64, 51, 32, 21};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    insertionSort(arr, n);
    printf("\nSorted Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
This code defines the insertionSort function, which takes an array of integers arr and its size n as
input and sorts the array in ascending order using the Insertion Sort algorithm. The main function
demonstrates how to use the insertionSort function to sort an array of integers.
When you run this code, it will output the original array followed by the sorted array.
3. Bubble Sort:
#include <stdio.h>
void bubbleSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;
        // Push the largest remaining element to the end of the unsorted part
        for (int j = 0; j < n - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swapped = 1;
            }
        }
        // If no two elements were swapped in the inner loop, the array is already sorted
        if (swapped == 0) {
            break;
        }
    }
}
int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    bubbleSort(arr, n);
    printf("\nSorted Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
This code defines the bubbleSort function, which takes an array of integers arr and its length n,
and sorts the array in ascending order using the Bubble Sort algorithm. The main function
demonstrates how to use this sorting function on an example array.
When you run this code, it will output the original array followed by the sorted array.
4. Merge Sort:
#include <stdio.h>
// Merge the two sorted subarrays arr[l..m] and arr[m+1..r]
void merge(int arr[], int l, int m, int r) {
    int n1 = m - l + 1, n2 = r - m;
    int L[n1], R[n2]; // Temporary copies (C99 variable-length arrays)
    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) arr[k++] = L[i++]; // Copy any leftovers
    while (j < n2) arr[k++] = R[j++];
}
// Recursively split the array, sort each half, and merge them
void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}
int main() {
    int arr[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    mergeSort(arr, 0, n - 1);
    printf("\nSorted Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
This code defines the merge function to merge two subarrays and the mergeSort function to
perform Merge Sort on an array of integers. The main function demonstrates how to use the
mergeSort function to sort an example array.
5. Quick Sort:
#include <stdio.h>
// Swap two elements through their pointers
void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}
// Partition function to select the pivot and place it in the correct position
int partition(int arr[], int low, int high) {
    int pivot = arr[low]; // Choose the leftmost element as the pivot
    int i = low;          // Initialize the index of the smaller element
    int j = high;
    while (i < j) {
        while (i < high && arr[i] <= pivot) { // Bound check keeps i in range
            i++;
        }
        while (arr[j] > pivot) { // arr[low] == pivot stops j at low
            j--;
        }
        if (i < j) {
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[low], &arr[j]); // Put the pivot into its final position
    return j;
}
// Recursively sort the partitions on either side of the pivot
void quickSort(int arr[], int low, int high) {
    if (low < high) {
        int p = partition(arr, low, high);
        quickSort(arr, low, p - 1);
        quickSort(arr, p + 1, high);
    }
}
int main() {
    int arr[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    quickSort(arr, 0, n - 1);
    printf("\nSorted Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
This code defines the swap function to swap two elements, the partition function to choose a
pivot and partition the array, and the quickSort function to perform Quick Sort on an array of
integers. The main function demonstrates how to use the quickSort function to sort an example
array.
When you run this code, it will output the original array followed by the sorted array.
6. Heap Sort:
Heap Sort is an efficient sorting technique based on the heap data structure. A heap is a
nearly complete binary tree in which every parent node is either no smaller or no larger
than its children. A heap whose root holds the minimum element is called a min-heap; a
heap whose root holds the maximum element is called a max-heap.
Definition: Heap Sort is a comparison-based sorting algorithm that uses a binary heap
data structure to build a max-heap (or min-heap) and repeatedly extract the maximum (or
minimum) element to build a sorted array.
Explanation: Heap Sort builds a max-heap (or min-heap) from the array, where the
maximum (or minimum) element is at the root. It then repeatedly removes the root and
places it at the end of the array, reducing the heap size. This process continues until the
entire array is sorted.
Time Complexity: O(n log n) in the worst, average, and best cases, making it efficient
for large datasets.
The heapify method converts a complete binary tree into a heap data structure. It uses a
recursive approach to heapify the nodes of the binary tree.
Example: Given an array, build a complete binary tree from it and heapify it into a max-heap.
Next, delete the root element (89) from the max-heap. To delete this node, swap it with the
last node, i.e. (11). After removing the root element, heapify the remaining nodes again to
restore the max-heap.
After swapping the array element 89 with 11 and re-heapifying into a max-heap, the array
elements are rearranged accordingly.
Repeat deleting the root and heapifying until no elements are left.
#include <stdio.h>
// Sift the subtree rooted at index i down until it satisfies the max-heap property
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        int temp = arr[i]; arr[i] = arr[largest]; arr[largest] = temp;
        heapify(arr, n, largest); // Recurse into the affected subtree
    }
}
void heapSort(int arr[], int n) {
    // Build a max-heap from the array
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);
    // Repeatedly move the root (maximum) to the end and re-heapify the rest
    for (int i = n - 1; i > 0; i--) {
        int temp = arr[0]; arr[0] = arr[i]; arr[i] = temp;
        heapify(arr, i, 0);
    }
}
int main() {
    int arr[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    heapSort(arr, n);
    printf("\nSorted Array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
This code defines the heapify function to heapify a subtree, and the heapSort function to perform
Heap Sort on an array of integers. The main function demonstrates how to use the heapSort
function to sort an example array.
When you run this code, it will output the original array followed by the sorted array.
7. Radix Sort:
The radix sort algorithm makes use of the counting sort algorithm while sorting in every phase.
The detailed steps are as follows −
Step 1 − Check whether all the input elements have the same number of digits. If not, find the
maximum number of digits (say k) in the list and add leading zeroes to the numbers that have
fewer.
Step 2 − Take the least significant digit of each element.
Step 3 − Sort these digits using counting sort logic and change the order of the elements based
on the output achieved. For example, if the input elements are decimal numbers, the possible
values each digit can take are 0-9, so index the digits based on these values.
Step 4 − Repeat Steps 2 and 3 for the next least significant digit until all the digits in the
elements have been processed.
Step 5 − The final list of elements achieved after the kth pass is the sorted output.
Hashing:
Hash Table is a data structure which stores data in an associative manner. In a hash table, data is
stored in an array format, where each data value has its own unique index value. Access to data
becomes very fast if we know the index of the desired data. A hash table uses an array as its
storage medium and uses a hash function to generate the index at which an element is to be
inserted or located.
Hashing is a technique to convert a range of key values into a range of indexes of an array.
Hashing is commonly used for data retrieval and data storage, as it allows for quick and efficient
data lookup. It also plays a role in creating and verifying the digital signatures used to
authenticate message senders and recipients.
Hash Functions: A hash function is a mathematical function that takes an input (or "message")
and returns a fixed-size string of bytes. It should be deterministic, meaning that for the same
input, it will always produce the same hash value.
1. Division Method:
Description: The division method is a simple hashing technique where you divide the
key by the size of the hash table and use the remainder as the hash code. It's expressed as
hash(key) = key % table_size.
Example: Suppose you have a hash table with 10 slots (0 to 9). To hash the key "42"
using the division method, you calculate hash(42) = 42 % 10, which results in a hash
code of 2. So, key "42" would be stored in slot 2 of the hash table.
2. Multiplication Method:
Steps to follow -
1. Multiply the key k by a constant A, where 0 < A < 1.
2. Take the fractional part of k * A.
3. Multiply that fractional part by M and take the floor: h(k) = floor(M * (k * A mod 1)).
(Where, M = size of the hash table, k = key value and A = constant value)
Example: Suppose you have a hash table with 8 slots (0 to 7). To hash the key "25"
using the multiplication method with A = 0.618, you calculate hash(25) = floor(8 * (25 *
0.618 % 1)) = floor(8 * 0.45) = 3. The key "25" would be stored in slot 3 of the hash table.
3. Folding Method:
1. The key-value k should be divided into a specific number of parts, such as k1, k2,
k3,..., kn, where each part, except possibly the last, has the same number of digits as
the required address.
2. Add the parts together. The final carry, if any, is disregarded when determining
the hash value.
Formula: k = k1, k2, k3, k4, ….., kn
s = k1+ k2 + k3 + k4 +….+ kn
h(K)= s
4. Mid-Square Method:
The steps involved in computing this hash method include the following -
1. Squaring the value of k ( like k*k)
2. Extract the hash value from the middle r digits.
Formula: h(K) = l where l is obtained by deleting digits from both ends of (k*k).
Examples:
In hashing, a collision occurs when two different keys produce the same hash code or
index in the hash table. Since hash functions map potentially infinite input values to a
finite range of hash codes (the size of the hash table), collisions are inevitable, especially
when dealing with a large number of keys. Handling collisions is a critical aspect of hash
table design.
There are several collision resolution techniques to address collisions:
Separate Chaining:
Description: Separate chaining is the most commonly used collision resolution technique;
it uses linked lists. Any two or more keys that hash to the same slot are chained together
to form a singly linked list known as a chain. Each bucket (or slot) in the hash table
maintains a linked list to store the values that hash to that index.
Pros: Easy to implement, allows for efficient handling of multiple collisions, and works
well when collisions are infrequent.
Cons: Requires additional memory for storing linked lists or other data structures.
Example:
Open Addressing:
Open addressing is one such technique where when a collision occurs, the algorithm searches
for the next available slot within the same table.
Linear Probing:
Description: In linear probing, when a collision occurs, the algorithm searches for the
next available slot by incrementing the index linearly until an empty slot is found.
Example: Suppose you have a hash table with 7 slots and the keys 76, 93, 40, 47, 10, 55.
When a collision occurs, linear probing checks the next available index.
Quadratic Probing:
Description: In quadratic probing, when a collision occurs, the algorithm probes slots at
quadratically increasing offsets from the original index. Here hash(n) is the index
computed using a hash function, and T is the table size. If slot index = (hash(n) % T) is
full, the probe sequence is:
index = (hash(n) % T)
(hash(n) + 1 x 1) % T
(hash(n) + 2 x 2) % T
(hash(n) + 3 x 3) % T … and so on
Double Hashing:
Description: Double hashing involves using a secondary hash function to calculate the
step size for probing. When a collision occurs, the algorithm adds the step size to the
current index and checks the resulting index until an empty slot is found.
The formula for double hashing - (hash(key) + i * hash2(key)) % size of the table
The sequence goes as follows -
index = hash(x) % S
(hash(x) + 1 * hash2(x)) % S
(hash(x) + 2 * hash2(x)) % S
(hash(x) + 3 * hash2(x)) % S … and so on