Data Structures and Algorithms
FACULTY OF COMMERCE
GROUP 1 ASSIGNMENT
Jo-Anne Macharavanda
LEVEL 1 SEMESTER 2
LECTURER: MR GAVAI
• Searching is an operation or a technique that helps find the place of a given element or value in a list.
• Any search is said to be successful or unsuccessful depending upon whether the
element that is being searched is found or not.
• Searching is the process of finding the position of a given value in a list of values.
• It decides whether a search key is present in the data or not.
• It is the algorithmic process of finding a particular item in a collection of items.
• Sequential Search
• Binary Search
• Fibonacci search
Sequential search is made over all items one by one. Every item is checked and if a match is
found then that particular item is returned, otherwise the search continues till the end of the
data structure. You often do this in daily life, for example when you are looking through your shopping list to cross out an item you have picked up.
Linear search is a method of finding elements within a list. It is also called a sequential
search. It is the simplest searching algorithm because it searches the desired element in a
sequential manner.
It compares each element with the value that we are searching for. If they match, the element is found, and the algorithm returns the key's index position.
The search begins at index 0 and compares the element at index 0 to the search criteria, in
this case 1. Does 2 = 1? No. This process of comparison continues until the value and index 2
are compared. Does 1 = 1? Yes. The algorithm can continue and search for other instances
until the end of the array, or stop the search.
Data: 2 5 1 3 4

for i = 0 to 4
    if Numbers[i] == 1 then
        print("Item found")
        stop
next i
Let's understand the following steps to find the element key = 7 in a given list x.
Step - 1: Start the search from the first element and compare key = 7 with each element of the list x.
LinearSearch(list, key)
    for each item in the list
        if item == key
            return its index position
    return -1
Explanation
In the above pseudocode, the function LinearSearch() takes two arguments: the list and the key (the value to search for). The for loop iterates over each element and compares it with the key. If the element is found, its index is returned; otherwise -1 is returned, which means the element is not present in the list.
In linear search, the best-case complexity is O(1), when the element is found at the first index. The worst-case complexity is O(n), when the element is found at the last index or is not present in the array at all.
• Will perform fast searches of small to medium lists. With today's powerful computers, small to medium arrays can be searched relatively quickly.
• The list does not need to be sorted. Unlike a binary search, linear searching does not require an ordered list.
• Not affected by insertions and deletions. As the linear search does not require the list to be sorted, additional elements can be added and deleted without reordering. Because other searching algorithms may have to reorder the list after insertions or deletions, a linear search may sometimes be more efficient in such cases.
• Slow searching of large lists. For example, when searching through a database of everyone in Northern Ireland to find a particular name, it might be necessary to search through 1.8 million names before you find the one you want. This speed disadvantage is why other search methods have been developed.
Benefits and Drawbacks
The benefit is that it is a very simple search and easy to program. In the best-case scenario,
the item you are searching for may be at the start of the list in which case you get it on the
very first try.
Its drawback is that if your list is large, it may take a while to go through the list. In the
worst-case scenario, the item you are searching for may not be in the list, or it may be at the
opposite end of the list.
We have already developed an algorithm for performing a linear search above. We can now convert this into a Python script.
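A minimal Python version of the LinearSearch pseudocode above might look like this (an illustrative sketch rather than the original assignment code; the function and variable names are our own):

def linear_search(values, key):
    # Compare each element with the key and return its index if found.
    for index, item in enumerate(values):
        if item == key:
            return index
    return -1  # the key is not present in the list

# Example using the list from the earlier walkthrough:
data = [2, 5, 1, 3, 4]
print(linear_search(data, 1))  # prints 2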
BINARY SEARCH
• A binary search is an advanced type of search algorithm that finds and fetches data
from a sorted list of items.
• Its core working principle involves dividing the data in the list to half until the
required value is located and displayed to the user in the search result.
• This search algorithm works on the principle of divide and conquer in which the list is
divided into two halves and the item is compared with the middle element of the list.
If the match is found then, the location of middle element is returned otherwise, we
search into either of the halves depending upon the result produced through the match.
• Its time complexity of O(log n) makes it very fast compared to other searching algorithms.
• If the desired value is equal to the value at the central index, then that index is returned as the answer.
• If the target value is lower than the value at the central index, then the right side of the list is ignored.
• If the desired value is greater than the value at the central index, then the left half is discarded.
The process is then repeated on the shorter sub-list until the target value is found.
Consider:
• The binary search algorithm is used to search for an element 'item' in a sorted linear array.
• If the search ends in success, it sets loc to the index of the element; otherwise it sets loc to -1.
• Variables beg and end keep track of the index of the first and last element of the array or sub-array in which the element is being searched at that instant.
• Variable mid keeps track of the index of the middle element of that array or sub-array in which the element is being searched at that instant.
Algorithm Example
Begin
   Set beg = 0
   Set end = n - 1
   Set loc = -1
   while beg <= end and loc = -1
      Set mid = (beg + end) / 2
      if a[mid] = item then
         Set loc = mid
      else if a[mid] < item then
         Set beg = mid + 1
      else
         Set end = mid - 1
      endif
   endwhile
End
(If the search succeeds, loc holds the index of item; otherwise loc remains -1.)
• Binary Search Algorithm searches an element by comparing it with the middle most
element of the array.
1. If the element being searched is found to be the middle most element, its index is
returned.
2. If the element being searched is found to be greater than the middle most element,
then its search is further continued in the right sub array of the middle most element.
3. If the element being searched is found to be smaller than the middle most element,
then its search is further continued in the left sub array of the middle most element.
This iteration keeps on repeating on the sub-arrays until the desired element is found or the size of the sub-array reduces to zero.
Step-1:
• To search for the element 15, set beg = 0 and end = n – 1, and compute mid = (beg + end) / 2 = 3. Compare a[mid] = 20 with 15.
Step-2:
• Since a[mid] = 20 > 15, we take end = mid – 1 = 3 – 1 = 2, whereas beg remains unchanged.
mid = (beg + end) / 2 = (0 + 2) / 2 = 1
Step-3:
• Since a[mid] = 10 < 15, we take beg = mid + 1 = 1 + 1 = 2, whereas end remains unchanged.
mid = (beg + end) / 2 = (2 + 2) / 2 = 2
• Now a[mid] = 15, which is the element being searched, so the search is successful and loc = mid = 2.
Advantages
• It eliminates half of the list from further searching by using the result of each comparison. This information is used to narrow the search.
• For large lists of data, it works significantly better than linear search.
Disadvantages
• It employs a recursive approach which requires more stack space.
• Programming the binary search algorithm is error prone and difficult.
• The interaction of binary search with the memory hierarchy (i.e. caching) is poor, because of its random access nature.
• In each iteration or in each recursive call, the search gets reduced to half of the array.
• So for n elements in the array, there are log₂ n iterations or recursive calls.
This time complexity of binary search remains unchanged irrespective of the position of the element, even if it is not present in the array.
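To make the steps above concrete, here is a small iterative binary search in Python (an illustrative sketch; the variable names beg, end, mid and loc follow the pseudocode above, and the example list is our own):

def binary_search(a, item):
    beg = 0
    end = len(a) - 1
    loc = -1
    while beg <= end:
        mid = (beg + end) // 2
        if a[mid] == item:
            loc = mid         # item found at index mid
            break
        elif a[mid] < item:
            beg = mid + 1     # continue in the right half
        else:
            end = mid - 1     # continue in the left half
    return loc

# Searching for 15, as in the worked example above:
print(binary_search([5, 10, 15, 20, 25, 30, 35], 15))  # prints 2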
Fibonacci search
• Fibonacci search is used to search for an element in a sorted array with the help of Fibonacci numbers. It examines locations whose addresses have lower dispersion.
• A Fibonacci number is subtracted from the index, thereby reducing the size of the list to be searched.
• The Fibonacci search technique is a method for searching a sorted array using a divide-and-conquer algorithm that uses Fibonacci numbers to narrow down possible locations.
• Fibonacci search divides the sorted array into two sections whose sizes are consecutive Fibonacci numbers, as opposed to binary search, which divides the array into two equal-sized parts, one of which is then examined further.
Fibonacci Numbers
• Fibonacci Numbers are recursively defined as F(n) = F(n-1) + F(n-2), F(0) = 0, F(1) =
1.
• First few Fibonacci Numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
Key points about Fibonacci search are:
• Fibonacci search examines closer elements in a few steps, so it is useful when the input array is so big that it cannot fit in the CPU cache or even in RAM.
• Fibonacci search requires only addition and subtraction, whereas binary search requires bit-shift, division or multiplication operations.
• Fibonacci search can reduce the time needed to access an element in random-access memory.
• On magnetic tape, where seek time depends on the current head position, the trade-off between longer seek time and more comparisons may favour a Fibonacci-style search.
• It is useful when the element being searched for is held in non-uniform access storage.
• It can be used for large arrays which do not fit in the CPU cache or in RAM.
• Binary search uses the division operator to divide the range; Fibonacci search does not use /, only + and –. The division operator may be costly on some CPUs.
arr = [10, 22, 30, 44, 56, 58, 60, 70, 100, 110, 130]
x = 60
Now, let's define a function that will divide the list and search for the number using Fibonacci search.
Best-case time complexity: Θ(1), which occurs when the element being searched for is the first element we compare.
With each step, the search range shrinks by a constant factor (to roughly 62% of its size in the worst case), hence the overall time complexity is O(log n).
For large arrays that do not fit in cache, it can compare favourably with binary search: it splits the array into two parts, but they are not of equal size, and its time complexity is still O(log n).
First, we need to have the length of the given list. Then we find the smallest Fibonacci
number greater than or equal to the size of the list. That means if the size of the list is
100, then the smallest Fibonacci number greater than 100 is 144. Let us say that this is
the nth Fibonacci number. In the above example, 144 is the 12th Fibonacci number.
After this, we move back twice in the Fibonacci series from that number. Essentially,
we find the (n-2)th Fibonacci number.
Now let us dive into the code for this algorithm:
def fibonacci_search(lst, target):
    size = len(lst)
    start = -1
    # f0, f1, f2 hold three consecutive Fibonacci numbers
    f0 = 0
    f1 = 1
    f2 = 1
    # Find the smallest Fibonacci number f2 greater than or equal to the list size
    while f2 < size:
        f0 = f1
        f1 = f2
        f2 = f1 + f0
    # Narrow the search range using the Fibonacci numbers
    while f2 > 1:
        index = min(start + f0, size - 1)
        if lst[index] < target:
            # The target lies in the upper part: move the range up
            f2 = f1
            f1 = f0
            f0 = f2 - f1
            start = index
        elif lst[index] > target:
            # The target lies in the lower part: shrink the range
            f2 = f0
            f1 = f1 - f0
            f0 = f2 - f1
        else:
            return index
    # One element may remain to be checked
    if f1 and size > 0 and lst[size - 1] == target:
        return size - 1
    return None
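Using the arr and x defined earlier, the function sketched above (here named fibonacci_search) can be called like this:

print(fibonacci_search(arr, x))  # prints 6, the index of 60 in arr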
A sorting algorithm is an algorithm that puts the elements of a list in a certain order. The most frequently used orders are numerical order and lexicographical order.
Efficient sorting is important for optimizing the efficiency of other algorithms which require their input data to be in sorted order.
1. Bubble Sort
2. Selection Sort
3. Insertion Sort
4. Heap Sort
5. Merge Sort
6. Quick Sort
Graph definitions
A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E, where each edge is a pair (v, w) such that v, w ∈ V. Vertices are sometimes called nodes, and edges are sometimes called arcs. If the edge pairs are ordered then the graph is called a directed graph (also called a digraph). We also call a normal graph (one which is not directed) an undirected graph; when we simply say graph we mean an undirected graph. Two vertices of a graph are adjacent if they are joined by an edge: vertex w is adjacent to v iff (v, w) ∈ E. In an undirected graph with edge (v, w), and hence (w, v), w is adjacent to v and v is adjacent to w.
A path between two vertices is a sequence of edges that begins at one vertex and ends at another vertex, i.e. w1, w2, …, wN is a path if (wi, wi+1) ∈ E for 1 ≤ i ≤ N – 1.
• A simple cycle is a cycle that does not pass through other vertices more than once.
A connected graph has a path between each pair of distinct vertices. A complete graph has
an edge between each pair of distinct vertices. A complete graph is also a connected graph.
But a connected graph may not be a complete graph.
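As a small illustration of these definitions, the undirected graph with V = {a, b, c} and E = {(a, b), (b, c)} could be represented in Python as an adjacency list (a sketch; the vertex names are our own):

graph_example = {
    'a': ['b'],       # a is adjacent to b
    'b': ['a', 'c'],  # each undirected edge is stored in both directions
    'c': ['b']
}
# a, b, c is a path because (a, b) and (b, c) are edges.
# The graph is connected but not complete, since there is no edge (a, c).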
Graph Traversal
Graph traversal means visiting every vertex and edge exactly once in a well-defined order.
While using certain graph algorithms, you must ensure that each vertex of the graph is visited
exactly once. The order in which the vertices are visited is important and may depend upon the algorithm or question that you are solving.
During a traversal, it is important that you track which vertices have been visited. The most
common way of tracking vertices is to mark them.
A graph-traversal algorithm starts from a vertex v and visits all of the vertices that are reachable from the vertex v. A graph-traversal algorithm visits all vertices if and only if the graph is connected. A connected component is the subset of vertices visited during a traversal algorithm that begins at a given vertex. A graph-traversal algorithm must mark each vertex during a visit and must never visit a vertex more than once. Thus, if a graph contains a cycle, the graph-traversal algorithm can avoid an infinite loop.
Depth-first search (or depth-first traversal) is a recursive algorithm for searching all the vertices of a graph or tree data structure. Traversal means visiting all the nodes of a graph. Unlike trees, graphs may contain cycles.
It uses the concept of backtracking: it searches all the nodes thoroughly by going ahead where possible, and backtracking otherwise.
Here, the word backtrack means that once you are moving forward and there are no more nodes along the present path, you move backwards along the same path to find nodes still to traverse. All the nodes on the current path are visited, and once there are no unvisited nodes left on it, the next path is selected.
A standard DFS implementation puts each vertex of the graph into one of two categories:
1. Visited
2. Not Visited
The purpose of the algorithm is to mark each vertex as visited while avoiding cycles.
The DFS algorithm works as follows:
1. Start by putting any one of the graph's vertices on top of a stack.
2. Take the top item of the stack and add it to the visited list.
3. Create a list of that vertex's adjacent nodes. Add the ones which aren't in the visited list to the top of the stack.
4. Keep repeating steps 2 and 3 until the stack is empty.
Let's see how the Depth First Search algorithm works with an example. We use an undirected
graph with 5 vertices.
We start from vertex 0, the DFS algorithm starts by putting it in the Visited list and putting
all its adjacent vertices in the stack.
Vertex 2 has an unvisited adjacent vertex in 4, so we add that to the top of the stack and visit
it.
After we visit the last element 3, it doesn't have any unvisited adjacent nodes, so we have
completed the Depth First Traversal of the graph.
DFS Implementation (Source Code)
graph = {
    '5': ['3', '7'], '3': ['2', '4'], '7': ['8'],
    '2': [], '4': ['8'], '8': []
}
visited = set()  # keeps track of the visited nodes of the graph
def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
# Driver Code
dfs(visited, graph, '5')
In the code, first, we will create the graph for which we will use the depth-first search. After
creation, we will create a set for storing the value of the visited nodes to keep track of the
visited nodes of the graph.
After the above process, we will declare a function with the parameters as visited nodes, the
graph itself and the node respectively. And inside the function, we will check whether any
node of the graph is visited or not using the “if” condition. If not, then we will print the node
and add it to the visited set of nodes.
Then we will go to the neighboring node of the graph and again call the DFS function to use
the neighbor parameter.
At last, we will run the driver code, which prints the final result of the DFS by calling the dfs function for the first time with the starting vertex of the graph.
Output
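For the adjacency list sketched above (starting the traversal from vertex '5'), the vertices are printed in the order: 5 3 2 4 8 7.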
1) For an unweighted graph, DFS traversal of the graph produces a spanning tree, and since all edges have equal weight this is also a minimum spanning tree.
2) Detecting a cycle in a graph: a graph has a cycle if and only if we see a back edge during DFS. So we can run DFS on the graph and check for back edges.
3) Path finding: we can specialize the DFS algorithm to find a path between two given vertices u and z.
i) Call DFS(u) with u as the start vertex.
ii) Use a stack S to keep track of the path between the start vertex and the current vertex.
iii) As soon as destination vertex z is encountered, return the path as the contents of the stack.
• Visits graph vertices by moving across to all the neighbors of last visited vertex
• “Redraws” graph in tree-like fashion (with tree edges and cross edges for undirected
graph)
• After visiting a given vertex v, the breadth-first traversal algorithm visits every vertex
adjacent to v that it can before visiting any other vertex.
The breadth-first traversal algorithm does not completely specify the order in which it should visit the vertices adjacent to v; we may, for example, visit the vertices adjacent to v in sorted order. BFS has the same efficiency as DFS and can be implemented with graphs represented as adjacency matrices (Θ(|V|²)) or adjacency lists (Θ(|V| + |E|)):
• Yields single ordering of vertices (order added/deleted from queue is the same)
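For comparison with the DFS code above, a queue-based breadth-first traversal of the same example graph might look like this (an illustrative sketch, not part of the original notes):

from collections import deque

def bfs(graph, start):
    visited = [start]           # vertices already seen
    queue = deque([start])      # vertices waiting to be visited
    while queue:
        node = queue.popleft()  # visit vertices in the order they were added
        print(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

bfs(graph, '5')  # reuses the adjacency list from the DFS example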
A complete analysis of the running time of an algorithm involves the following steps:
• Identify unknown quantities that can be used to describe the frequency of execution of the basic operations.
• Calculate the total running time by multiplying the time by the frequency for each operation, then adding all the products.
Asymptotic Analysis
Asymptotic analysis is an analysis of algorithms that focuses on how the running time grows as the input size becomes large, rather than on exact operation counts.
The asymptotic behavior of a function f(n) refers to the growth of f(n) as n gets large.
Lower-order terms contribute less to the overall cost as the input grows larger.
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < …
(algorithms on the left are the "fastest", those on the right the "slowest")
Leading Terms
a(n) = ½ n + 4
Leading term: ½ n
b(n) = 240 n + 0.001 n²
Leading term: 0.001 n²
For large n, the 0.001 n² term dominates, so b(n) eventually grows faster than a(n) despite its small coefficient.
Suppose we have an array A[1: n] of n elements, and a key element, KEY. The following program is a simple loop which goes through the array sequentially until it either finds the key or determines that it is not found. (dtype is the type declaration for the array.)
for i = 1 to n {
    if (KEY == A[i])
        return (i);
};
return (-1); // The for-loop ended and the key was not found.
Worst-case analysis: This determines the maximum amount of time the algorithm
would ever take. This analysis is usually easier than the average-case analysis. At the
same time, it is hoped that it is a good reflection of overall performance.
Average-case analysis: This method determines the overall average performance. It
considers all possible cases, assigns a probability to each case, and computes the
weighted average (called expected value) for the random variable (in this case, the
number of key-comparisons).
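For example, if a successful sequential search is equally likely to find the key at any of the n positions, the expected number of key comparisons is (1 + 2 + … + n) / n = (n + 1) / 2, which is still proportional to n.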
Since each iteration of the for-loop takes at most some constant amount of time, C, the total worst-case time of the algorithm is T(n) ≤ Cn + D. (The constant D represents the maximum amount of time for all statements that are executed only once, independent of the variable n.) This total time is characterized as "order of" n, denoted O(n).
Binary search
Problem
Search an ordered (sorted) table for a given key.
Analysis
Compare the key recorded in the middle position of the table with the search key. If the two are equal, the search is successful; otherwise, use the middle position to divide the table into two sub-tables. If the key recorded in the middle position is greater than the search key, then further search the previous (lower) sub-table; otherwise further search the next (upper) sub-table. Repeat the above process until a record that meets the condition is found and the search is successful, or until the sub-table no longer exists, in which case the search is unsuccessful.
while (l < r) {
    int mid = (l + r) / 2;
    if (T[mid] < x) {
        l = mid + 1;
    }
    else {
        r = mid;
    }
}
Analysis
Problem
Analysis
Select the middle node of the sequential table and compare the given value x with the key of that middle node. If they are equal, the search is successful; if not, compare x with the middle node's key to decide which half of the table to keep, and continue comparing against the middle value of that half until a node whose key equals x is found, or there is no middle node left in the table.
//Sequential search:
for (int i = 0; i < n; i++) {
    if (t[i] == key) {
        cout << "The search is successful! The subscript is:" << i + 1 << endl;
        break;
    }
}

//Binary search:
left = 0; right = n - 1;
while (left <= right) {
    middle = (left + right) / 2;
    if (key == t[middle]) {
        cout << "The search is successful! The subscript is:" << middle + 1 << endl;
        break;
    }
    else if (key < t[middle])
        right = middle - 1;
    else
        left = middle + 1;
}
Analysis