Unit – I
Analysis of Algorithms
INTRODUCTION TO ANALYSIS OF ALGORITHMS
• To bake a cake, we can find many different recipes on the internet, each giving a different sequence of steps for a different variety of cake. Each such step-by-step procedure for making a cake can be called an algorithm, and we choose the one that is simplest and most convenient.
• Similarly, in computer science, multiple algorithms are often available for solving the same problem. Algorithm analysis helps us determine which algorithm is most efficient in terms of running time, memory consumed, and so on.
Why Analyze an Algorithm?
Analyzing an algorithm lets us compare candidate solutions by how their running time grows with the input size n. The common growth rates, in increasing order, are:
1        Constant
log n    Logarithmic
n        Linear
n²       Quadratic
n³       Cubic
2ⁿ       Exponential
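As an illustration (a small sketch added here, not part of the original notes), the following Python snippet tabulates these growth rates for a few input sizes, showing how quickly they diverge:

import math

# Illustrative only: evaluate each common growth rate for a few input sizes.
for n in (10, 20, 30):
    print(f"n={n}: log n={math.log2(n):.1f}  n={n}  n^2={n**2}  n^3={n**3}  2^n={2**n}")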
Asymptotic Analysis
Asymptotic analysis gives an idea of the performance of an algorithm based on the input size. It does not calculate the exact running time; instead, it finds the relation between the running time and the input size, observing how the running time grows as the input size increases.
• For space complexity, the goal is to find a relation or function describing how much main-memory space the algorithm occupies as the input size grows.
• Algorithm analysis is broadly classified into three types (a linear-search sketch after this list illustrates the three cases):
• Best case analysis: This analysis gives a lower bound on the running time. It describes the behaviour of an algorithm under optimal conditions.
• Worst case analysis: This analysis gives an upper bound on the running time. In this case, the maximum number of operations is executed.
• Average case analysis: This analysis lies between the lower and upper bounds on the running time. In this case, the number of executed operations is neither the minimum nor the maximum.
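As a simple illustration (an assumed example, not from the notes), consider searching a list sequentially; its best, worst and average cases correspond to where, or whether, the target occurs:

def linear_search(data, target):
    # Best case: target is at index 0 (one comparison).
    # Worst case: target is absent (n comparisons).
    # Average case: target is equally likely at any position (about n/2 comparisons).
    for i, value in enumerate(data):
        if value == target:
            return i
    return -1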
ASYMPTOTIC NOTATIONS
Asymptotic notation is one of the most efficient ways to calculate the time complexity of
an algorithm.
Asymptotic notations are mathematical tools to represent time complexity of algorithms
for asymptotic analysis.
The three asymptotic notations used to represent time complexity of algorithms are,
The Big O (Big-Oh) Notation
The Big Ω (Big-Omega) Notation
The Big Θ (Big-Theta) Notation
The Big O Notation
Big Oh is an Asymptotic Notation for the worst-case scenario. The Big-O notation defines the asymptotic upper bound of an algorithm; it bounds a function only from above. It is represented as f(n) = O(g(n)). That is, at higher values of n, the upper bound of f(n) is g(n).
The Mathematical Definition of Big-Oh
f(n) ∈ O(g(n)) if and only if there exist some positive constant c and some non-
negative integer n₀ such that, f(n) ≤ c g(n) for all n ≥ n₀, n₀ ≥ 1 and c>0.
The above definition says: in the worst case, let the function f(n) be the algorithm's runtime and g(n) be an arbitrary time complexity. Then O(g(n)) says that f(n) never grows faster than g(n); that is, f(n) ≤ c·g(n) for all sufficiently large n, and c·g(n) bounds the maximum number of steps the algorithm can take.
In the above graph, c·g(n) is the function that gives the maximum runtime (upper bound) and f(n) is the algorithm's runtime.
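For example (an illustrative check, not from the notes): let f(n) = 3n + 2. Choosing c = 4 and n₀ = 2, we have 3n + 2 ≤ 4n for all n ≥ 2, so f(n) ∈ O(n).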
The Big Omega (Ω) Notation
• Big Omega is an Asymptotic Notation for the best-case scenario. Big Ω notation defines an asymptotic lower bound.
• The Mathematical Definition of Big-Omega
• f(n) ∈ Ω(g(n)) if and only if there exist some positive constant c and some non-negative
integer n₀ such that, f(n) ≥ c g(n) for all n ≥ n₀, n₀ ≥ 1 and c>0.
• The above definition says: in the best case, let the function f(n) be the algorithm's runtime and g(n) be an arbitrary time complexity. Then Ω(g(n)) says that g(n) never grows faster than f(n); that is, f(n) ≥ c·g(n), and c·g(n) indicates the minimum number of steps the algorithm will take.
• In the above graph, c·g(n) is the function that gives the minimum runtime (lower bound) and f(n) is the algorithm's runtime.
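Continuing the illustrative example above: for f(n) = 3n + 2, choosing c = 3 and n₀ = 1 gives 3n + 2 ≥ 3n for all n ≥ 1, so f(n) ∈ Ω(n).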
The Big Theta (Θ) Notation
• Big Theta is an Asymptotic Notation for the average case, which gives the average growth for a given function. Theta notation always lies between the lower bound and the upper bound. It provides an asymptotically tight bound for the growth rate of an algorithm. If the upper bound and the lower bound of the function give the same result, then the Θ notation will also have that same rate of growth.
• f(n) ∈ Θ(g(n)) if and only if there exist some positive constants c₁ and c₂ and some non-negative integer n₀ such that c₁ g(n) ≤ f(n) ≤ c₂ g(n) for all n ≥ n₀, n₀ ≥ 1 and c₁, c₂ > 0.
The above definition says: in the average case, let the function f(n) be the algorithm's runtime and g(n) be an arbitrary time complexity. Then Θ(g(n)) says that g(n) encloses f(n) from above and below, using c₁·g(n) and c₂·g(n).
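Combining the two illustrative bounds above: for f(n) = 3n + 2, we have 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2 (with c₁ = 3, c₂ = 4 and n₀ = 2), so f(n) ∈ Θ(n).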
The Factorial Function
• The factorial function can be computed with a simple recursive algorithm:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
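For instance (illustrative usage), the call below unwinds through factorial(4), factorial(3), and so on down to the base case factorial(0) = 1:

print(factorial(5))   # prints 120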
Drawing an English Ruler
• Each time the size of the interval decreases by half, the tick length decreases by one.
• The English ruler pattern is an example of a fractal, that is, a shape that has a self-recursive structure at various levels of magnification.
• The 0 and 1 inch ticks are the major ticks and have tick length 5. The central tick (at ½ inch) has length 4. The two patterns of ticks above and below this central tick are identical, and each has a central tick of length 3.
• An interval with a central tick length L ≥ 1 is composed of an interval with a central tick length L − 1, followed by a single tick of length L, followed by another interval with a central tick length L − 1.
Implementation:
def draw_line(tick_length,
tick_label): line = '-'* tick_length
if tick_label:
line += ' ' + tick_label
print (line)
draw_line
• The draw_line method draws a single tick with a specified number of dashes and an optional string label.
• It is a non-recursive method.
draw_interval
• This function draws the sequence of minor ticks within some interval, based on the length of the interval’s central tick.
When center_length = 0, nothing is drawn. For center_length ≥ 1, the first and last steps are performed recursively by calling the draw_interval method, and the middle step is performed by calling the draw_line method.
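A minimal sketch of draw_interval, following the description above (the full implementation is not reproduced in this section, so this is an assumed version):

def draw_interval(center_length):
    # Draw the tick interval based upon a central tick length.
    if center_length > 0:                    # stop when the length drops to 0
        draw_interval(center_length - 1)     # recursively draw the top ticks
        draw_line(center_length)             # draw the center tick
        draw_interval(center_length - 1)     # recursively draw the bottom ticks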
Binary Search
• When the sequence is unsorted the standard approach for searching a target value is
sequential search. When the sequence is sorted and indexable, then binary search is
used to efficiently locate a target value within a sorted sequence of n elements.
• For any index j, all the values stored at indices 0 to j−1 are less than or equal to the value at index j, and all the values stored at indices j+1 to n−1 are greater than or equal to the value at index j. This allows us to quickly narrow the search for a target value.
• The algorithm maintains two parameters, low and high, such that all the candidate
entries have index at least low and at most high. Initially, low = 0 and high = n−1. Then
we compare the target value to the median candidate, that is, the item data[mid] with index
mid = ⌊(low + high)/2⌋.
Consider three cases:
If the target equals data[mid], then we have found the item, and the search terminates
successfully.
If target < data[mid], then we recur on the first half of the sequence, that is, on the
interval of indices from low to mid−1.
If target > data[mid], then we recur on the second half of the sequence, that is, on
the interval of indices from mid+1 to high.
An unsuccessful search occurs if low > high, as the interval [low, high] is empty. This
algorithm is known as binary search.
Implementation
def binary_search(data, target, low, high):
    if low > high:
        return False                      # interval is empty; no match
    else:
        mid = (low + high) // 2
        if target == data[mid]:
            return True
        elif target < data[mid]:
            return binary_search(data, target, low, mid - 1)
        else:
            return binary_search(data, target, mid + 1, high)
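An illustrative call on a sorted list (the driver values here are assumptions, not from the notes):

data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
print(binary_search(data, 22, 0, len(data) - 1))   # True
print(binary_search(data, 23, 0, len(data) - 1))   # False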
This binary search algorithm requires O(log n) time, whereas the sequential search algorithm uses O(n) time.
File Systems
• The file system of a computer has a recursive structure in which directories can be nested arbitrarily deep within other directories. Recursive algorithms are widely used to explore and manage these file systems.
• Modern operating systems define file-system directories in a recursive way: a file system consists of a top-level directory, which contains files and other directories, which in turn contain files and other directories, and so on.
• Such a nested file system can be represented as a tree of directories and files.
• The file system representation uses recursive algorithms for tasks such as copying a directory or deleting a directory. In this example we consider computing the total disk usage for all files and directories nested within a particular directory.
• The above figure shows the disk space used by all entries in the example file system. Here we distinguish the immediate disk space used by each entry from the cumulative disk space used by that entry and all of its nested entries.
• Example: CS016 uses 2K of immediate disk space and 249K of cumulative disk space.
• The cumulative disk space can be computed with a recursive algorithm: it is equal to the immediate disk space used by the entry plus the sum of the cumulative disk space usage of all entries stored directly within the entry.
• For example, the cumulative disk space for CS016 is 249K, because it is the sum of its own immediate space of 2K plus the cumulative space of grades (8K), homework (10K) and programs (229K).
Python’s os Module
Several functions from Python’s os module are used to implement a recursive algorithm for computing disk usage, namely os.path.getsize, os.path.isdir, os.listdir and os.path.join:
import os

def disk_usage(path):
    total = os.path.getsize(path)                     # immediate disk usage of this entry
    if os.path.isdir(path):                           # if path denotes a directory,
        for filename in os.listdir(path):             # then for each child entry:
            childpath = os.path.join(path, filename)  # compose the full path to the child
            total += disk_usage(childpath)            # add the child's cumulative usage
    print('{0:<7}'.format(total), path)               # report the usage for this entry
    return total
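An illustrative call (the path shown is hypothetical and must exist on the machine where this sketch is run):

disk_usage('/user/rt/courses')   # hypothetical directory; prints the usage of each nested entry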
Tower of Hanoi Puzzle
• The Tower of Hanoi puzzle consists of a board with three vertical poles and a stack of disks. The diameter of the disks increases from top to bottom, creating a tower structure.
• An illustration of the Tower of Hanoi with three disks is shown in the figure.
The objective of the Tower of Hanoi puzzle is to move all the disks from the starting pole to one of the other two poles to create a new tower, adhering to two conditions:
1. Only one disk can be moved at a time.
2. A larger disk can never be placed on top of a smaller disk.
This problem can be solved by a recursive procedure. Given n disks, the three poles are named source (S), destination (D) and intermediate (I).
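A minimal sketch of the recursive move() function referred to in the analysis below, assuming the poles are passed as simple labels (the exact implementation is not shown in this section):

def move(n, source, destination, intermediate):
    if n >= 1:
        move(n - 1, source, intermediate, destination)             # move the n-1 smaller disks out of the way
        print('Move disk', n, 'from', source, 'to', destination)   # move the largest remaining disk
        move(n - 1, intermediate, destination, source)             # move the n-1 disks back on top of it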
Computing factorials
• To compute factorial(n), there are a total of n + 1 activations, and each individual activation of factorial executes a constant number of operations. Therefore, we conclude that the overall number of operations for computing factorial(n) is O(n), as there are n + 1 activations, each of which accounts for O(1) operations.
Drawing an English Ruler
• The efficiency of the English ruler algorithm depends on the number of lines generated by an initial call draw_interval(c), where c denotes the center length. Each line of output is based upon a call to the draw_line utility, and each recursive call to draw_interval makes exactly one call to draw_line, unless c = 0.
• A call to draw_interval(c) for c > 0 spawns two calls to draw_interval(c − 1) and a single call to draw_line. For c ≥ 0, a call to draw_interval(c) results in precisely 2ᶜ − 1 lines of output.
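As an illustrative check: draw_interval(2) makes two calls to draw_interval(1), each producing 2¹ − 1 = 1 line, plus one call to draw_line, giving 1 + 1 + 1 = 3 = 2² − 1 lines, matching the formula.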
Binary search
In the binary search algorithm, a constant number of operations is executed at each recursive call, so the running time is proportional to the number of recursive calls performed. Thus, binary search runs in O(log n) time for a sorted sequence with n elements.
Initially, the number of candidates is n; after the first call in a binary search, it is at most n/2. After the second call, it is at most n/4, and so on. In general, after the j-th call in a binary search, the number of candidate entries remaining is at most n/2ʲ. In an unsuccessful search, the recursive calls stop when there are no more candidate entries. Hence, the maximum number of recursive calls performed is the smallest integer r such that n/2ʳ < 1. In other words, r > log n. Thus, we have r = ⌊log n⌋ + 1, which implies that binary search runs in O(log n) time.
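For example (an illustrative calculation): for a sorted sequence of n = 1,000,000 elements, binary search performs at most ⌊log₂ 1,000,000⌋ + 1 = 20 recursive calls, compared with up to 1,000,000 comparisons for sequential search.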
Computing disk space usage
To characterize the cumulative time spent for an initial call to the disk_usage function, we must analyse the total number of recursive invocations that are made, as well as the number of operations executed within those invocations.
There are n recursive invocations of the function, i.e., one for each entry in the relevant portion of the file system.
The body of the disk_usage function includes a for loop that iterates over all entries contained within a directory. At first glance this suggests that there are O(n) recursive calls, each of which runs in O(n) time, giving an O(n²) bound; however, each entry is iterated over by the loop of only one invocation (the one made on its parent directory), so the overall running time of disk_usage is O(n).
Solving the Tower of Hanoi Puzzle
To evaluate the time complexity of the move() function, we need to determine the cost of each invocation and the number of times the function is called for any value of n.
Each function invocation requires only O(1) time, since apart from its two recursive calls the function performs only a constant number of steps.
To determine the total number of times the function is called, we need to calculate the
number of times the function executes at each level of the call tree and then add those values
to obtain the final result. The number of function calls at each level is twice the number of
calls at the previous level.
If we label each level of the call tree starting with 0 at the top and going down to n−1 at the bottom, then the number of function calls at level i is equal to 2ⁱ.
Summing the number of calls at each level gives
2⁰ + 2¹ + 2² + … + 2ⁿ⁻¹ = 2ⁿ − 1 calls in total.
Thus, the recursive solution for solving the Tower of Hanoi problem requires exponential time, O(2ⁿ), in the worst case.
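As an illustrative check using the move() sketch above: move(3, 'S', 'D', 'I') prints 2³ − 1 = 7 moves, and each additional disk doubles the count plus one (for example, 4 disks require 15 moves).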