
DESIGN AND ANALYSIS OF ALGORITHMS

CSEN3001

UNIT 1 : Introduction to Algorithms

Introduction to Algorithms: Algorithm specification, Performance Analysis.


Divide and Conquer: The general method: Binary search, finding maximum
and minimum, Merge sort, Quick sort, Selection, Strassen’s Matrix
multiplication.

Introduction to Algorithms:
ALGORITHM:
 An algorithm is a systematic step-by-step procedure for solving a given problem.
 An algorithm is independent of any programming language and machine.
 An Algorithm is a finite sequence of effective steps to solve a particular problem.
Formal Definition:
An Algorithm is a finite set of instructions that, if followed, accomplishes a particular task.

All algorithms must satisfy the following criteria.


a) INPUT: Zero or more quantities are externally supplied.
b) OUTPUT: At least one quantity is produced.
c) DEFINITENESS: Each instruction is clear and unambiguous.
d) FINITENESS: If we trace out the instructions of an algorithm, then for all
cases, the algorithm terminates after a finite number of steps.
e) EFFECTIVENESS: Every instruction must be very basic so that it can be
carried out, in principle, by a person using only a pencil and paper.

Areas of study of an Algorithm:


 How to devise or design an algorithm → creating an algorithm.
 How to express an algorithm → definiteness.
 How to analyse an algorithm → time and space complexity.
 How to validate an algorithm → correctness.
 How to test an algorithm → checking for errors.

Algorithm Design
 The study of Algorithms includes many important and
active areas of research. There are four distinct areas of
study that one can identify.
1. Devise an algorithm:
 Creating an algorithm is an art which may never be
fully automated.
 A major goal is to study various design techniques
that have proven to be useful.
 The major design techniques are Brute force, Divide
and conquer, Greedy approach, Dynamic programming, Backtracking, and Branch
and Bound.
 Exploration of specific features of the problem is the first important step.
 By mastering these design strategies, it will become easier to devise new and useful
algorithms.
2. Validating algorithms:
 Once an algorithm is devised, it is necessary to show that it computes the correct
answer for all possible legal inputs, which is referred to as algorithm validation.
 The purpose of validation is to ensure that this algorithm will work correctly and
independently without being expressed as a program.
 Proof of correctness requires that the solution be stated in two forms.
o One form is usually a program which is annotated by a set of assertions about the
input and output variables of the program. These assertions are often expressed in
the predicate calculus.
o The second form is called a specification, and this may also be expressed in the
predicate calculus.
o Once the validity of the method has been shown, a program can be written, and a
second phase begins. This phase is referred to as program proving or sometimes
as program verification.
3. Analyse algorithms:
 As an algorithm is executed, it uses the computer's central processing unit (CPU) to
perform operations and its memory to hold the program and data.
 Analysis of algorithms or performance analysis refers to the task of determining how
much computing time and storage algorithms require.
 A priori analysis: a machine-independent analysis of an algorithm, performed before execution, to identify the better one.
 A posteriori analysis: an analysis based on measurements taken after executing the program, to identify the better one.
 The algorithm is analysed based on time and space complexity.
 The amount of time needed to run the algorithm is called time complexity.
 The amount of memory needed to run the algorithm is called the space complexity.
 Three types of analysis
o Best case: Defines the input for which the algorithm takes the lowest time.
o Worst case: Defines the input for which the algorithm takes the maximum time.
o Average case: Provides a prediction about the running time of the algorithm.

4. Testing a program:
 Testing a program consists of two phases
o Debugging
o Profiling
 Debugging: It is the process of executing programs on sample data sets to determine
whether faulty results occur and, if so, to correct them.
 As E. W. Dijkstra pointed out, "testing can show the presence of errors, but never their
absence".
 A proof of correctness is much more valuable than a thousand tests (if that proof is
correct), since it guarantees that the program will work correctly for all possible inputs.
 Profiling: Profiling or performance measurement is the process of executing a correct
program on data sets and measuring the time and space it takes to compute the results.

1.1 ALGORITHM SPECIFICATION


An algorithm can be described in three ways.
1. Natural language like English:
 In this way of description, care should be taken to ensure that each statement is
definite.
2. Graphic representation (flowchart):
 This method will work well when the algorithm is small & simple.
3. Pseudo-code Method:
 Most of the algorithms use a pseudocode that resembles Pascal and C.
Pseudocode Conventions:
Pseudo-Code Conventions for expressing algorithms:
1. Comments begin with // and continue until the end of the line.
2. Blocks are indicated with matching braces { and }.
3. An identifier begins with a letter. The data types of variables are not explicitly declared.
4. Compound data types can be formed with records. Here is an example,
node=record
{
data type_1 data_1;
.
.
.
data type_n data_n;
node *link;
}
Here link is a pointer to the record type node. Individual data items of a record can be accessed
with -> and period.
5. Assignment of values to variables is done using the assignment statement.
<variable>:= <expression>;

6. There are two boolean values, true and false.


 Logical Operators: and, or, not
 Relational Operators : <, <=,>,>=, =, !=
7. The following looping statements are employed.
 for, while and repeat-until loops
while loop:
while < condition > do
{
<statement-1>
.
.
.
<statement-n>
}
for loop:
for variable: = value-1 to value-2 step step do
{
<statement-1>
.
.
.
<statement-n>
}
repeat-until:
repeat
<statement-1>
.
.
.
<statement-n>
until<condition>
8. A conditional statement has the following forms.

if statement:
if <condition> then
<statement>
if <condition> then
<statement-1>
else
<statement-2>

case statement:
switch
{
: <case-1> : <statement-1>
.
.
.
: <case-n> : <statement-n>
: default : <statement-n+1>
}

9. Input and output are represented using the instructions read & write.
10. There is only one type of procedure: Algorithm. The heading takes the form
Algorithm Name (<parameter list>)
Examples:
1. An algorithm to find the maximum element of an array
Algorithm Max(A, n)
// A is an array of size n.
{
    Result := A[1];
    for i := 2 to n do
        if A[i] > Result then
            Result := A[i];
    return Result;
}
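A runnable Python version of the same algorithm (a sketch; note the pseudocode is 1-based while Python lists are 0-based):

def array_max(a):
    """Return the maximum element of a non-empty list a."""
    result = a[0]
    for x in a[1:]:
        if x > result:
            result = x
    return result

print(array_max([12, 5, 42, 7]))  # 42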

2. Algorithm for Selection Sort:


Algorithm SelectionSort(a, n)
// Sort the array a[1:n] into non-decreasing order.
{
    for i := 1 to n do
    {
        j := i;
        for k := i + 1 to n do
            if (a[k] < a[j]) then j := k;
        t := a[i]; a[i] := a[j]; a[j] := t; // swap a[i] and a[j]
    }
}
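A runnable Python equivalent of the pseudocode (a sketch using 0-based indexing):

def selection_sort(a):
    """Sort list a in non-decreasing order, in place."""
    n = len(a)
    for i in range(n):
        j = i
        for k in range(i + 1, n):
            if a[k] < a[j]:
                j = k
        a[i], a[j] = a[j], a[i]  # swap the smallest remaining element into position i

data = [64, 25, 12, 22, 11]
selection_sort(data)
print(data)  # [11, 12, 22, 25, 64]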

Recursive Algorithms:
 A Recursive function is a function that is defined in terms of itself.
 Similarly, an algorithm is said to be recursive if the same algorithm is invoked in the body.
 An algorithm that calls itself is Direct Recursive.
 Algorithm “A” is said to be Indirect Recursive if it calls another algorithm, which in turn
calls “A”.
 The Recursive technique is powerful and capable of clearly expressing complex processes.
 The following two examples show how to develop a recursive algorithm.
 The first is the Towers of Hanoi problem; the second generates all possible permutations
of a list of characters.

Towers of Hanoi Problem:

 The Towers of Hanoi is a problem in which there will be some disks of decreasing sizes,
and they are stacked on the tower in decreasing order of size, bottom to top.
 There are two other towers (B and C) in which one tower will act as the destination tower
and the other will act as the intermediate tower.
 In this problem, we must move the disks from the source tower to the destination tower.
The conditions included during this problem are:
1) Only one disk should be moved at a time.
2) No larger disks should be kept on the smaller disks.
Consider an example to explain more about the Towers of Hanoi:
Consider there are three towers, A, B, and C, and there will be three disks present in tower A.
Consider C as the destination tower and B as the intermediate tower. The steps involved in
moving the disks from A to C are
Step 1: Move the smaller disk, which is present at the top of tower A, to tower C.
Step 2: Then move the next smallest disk present at the top of tower A to tower B.
Step 3: Now, move the smallest disk present at tower C to tower B
Step 4: Now, move the largest disk present at tower A to tower C
Step 5: Move the smallest disk present at the top of tower B to tower A.
Step 6: Move the disk present at tower B to tower C.
Step 7: Move the smallest disk present at tower A to tower C. In this way, disks are moved from
the source tower to the destination tower.

Algorithm for Towers of Hanoi:


Algorithm TowersofHanoi(n, X, Y, Z)
// Move the top n disks from tower X to tower Y, using Z as the intermediate tower.
{
    if (n >= 1) then
    {
        TowersofHanoi(n-1, X, Z, Y);
        write("move top disk from tower ", X, " to top of tower ", Y);
        TowersofHanoi(n-1, Z, Y, X);
    }
}
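The same algorithm in runnable Python (a sketch; tower names are passed as strings):

def towers_of_hanoi(n, x, y, z):
    """Move the top n disks from tower x to tower y, using z as intermediate."""
    if n >= 1:
        towers_of_hanoi(n - 1, x, z, y)
        print("move top disk from tower", x, "to top of tower", y)
        towers_of_hanoi(n - 1, z, y, x)

towers_of_hanoi(3, 'A', 'C', 'B')  # prints the 7 moves of the worked example above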

Permutation Generator Problem:


Given a set of n >= 1 elements, the problem is to print all possible permutations of this set. For
example, if the set is {a,b,c}, then the set of permutations is {(a,b,c), (a,c,b), (b,a,c), (b,c,a),
(c,a,b), (c,b,a)}.
 It is easy to see that given n elements, there are n! different permutations. A simple algorithm
can be obtained by looking at the case of four elements (a,b,c,d). The answer can be
constructed by writing
1. a followed by all the permutations of (b,c,d)
2. b followed by all the permutations of (a,c,d)
3. c followed by all the permutations of (a,b,d)
4. d followed by all the permutations of (a,b,c)
 It implies that we can solve the problem for a set with n elements if we have an algorithm
that works on n-1 elements.
Algorithm for Permutation Generator Problem:
Algorithm Perm(a, k, n)
{
    if (k = n) then
        write(a[1:n]); // Output one permutation.
    else // a[k:n] has more than one permutation.
         // Generate these recursively.
        for i := k to n do
        {
            t := a[k]; a[k] := a[i]; a[i] := t; // interchange a[k] and a[i]
            Perm(a, k+1, n); // all permutations of a[k+1:n]
            t := a[k]; a[k] := a[i]; a[i] := t; // swap back
        }
}
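A runnable Python version of Perm (a sketch; it permutes the list in place, 0-based, printing each permutation as a string):

def perm(a, k):
    """Print all permutations of a[k:] appended to the fixed prefix a[:k]."""
    n = len(a)
    if k == n - 1:
        print("".join(a))             # output one permutation
    else:
        for i in range(k, n):
            a[k], a[i] = a[i], a[k]   # fix a[i] at position k
            perm(a, k + 1)            # permute the rest recursively
            a[k], a[i] = a[i], a[k]   # swap back

perm(list("abc"), 0)  # prints the 3! = 6 permutations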

PERFORMANCE ANALYSIS
 In computer science, there are multiple algorithms to solve a problem. When we have
more than one algorithm to solve a problem, we need to select the best one.
 Performance analysis helps us select the best algorithm from multiple algorithms to
solve a problem.
 When there are multiple alternative algorithms to solve a problem, we analyse them and
pick the one which is best suitable for our requirements.
 Generally, the performance of an algorithm depends on the following elements...
1. Is that algorithm providing the exact solution for the problem?
2. Is it easy to understand?
3. Is it easy to implement?
4. How much space (memory) does it require to solve the problem?
5. How much time does it take to solve the problem? etc.,
 When we want to analyse an algorithm, we consider only the space and time required
by that algorithm, and we ignore all remaining elements.
 The performance of algorithms can be measured on the scales of time and space.
 The performance of a program is the amount of computer memory and time needed to
run a program.
1. Space Complexity:
The space complexity of an algorithm is the amount of memory it needs to run to
completion.
2. Time Complexity:
The time complexity of an algorithm is the amount of computer time it needs to run to
completion.

Space Complexity:
The Space needed by each of these algorithms is seen to be the sum of the following
components.
1. A fixed part that is independent of the characteristics (eg: number, size) of the inputs and
outputs.
The part typically includes the instruction space (ie, Space for the code), space for simple
variables and fixed-size component variables (also called aggregate), space for constants, and so
on.
2. A variable part that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, the space needed by referenced
variables (to the extent that it depends on instance characteristics), and the recursion stack space.
 The space requirement S(P) of any algorithm P may therefore be written as
S(P) = c + Sp(instance characteristics), where c is a constant.
Example 1:
Algorithm Sum(A, n)
{
    s := 0.0;
    for i := 1 to n do
        s := s + A[i];
    return s;
}
In the above algorithm, n, s, and the loop variable i occupy one word each, and the array A
occupies n words, so S(P) >= n + 3.
Example 2:
Algorithm for Sum of Numbers using Recursion:
Algorithm RSum(A, n)
{
if(n<=0) then
return 0.0;
else
return RSum(A,n-1)+A[n];
}

The space complexity for the above algorithm is:

 Each recursive call requires space for the value of n, the return address, and a pointer to
the array, i.e., three words per call.
 The depth of recursion of the above algorithm is (n+1).
 So the total space occupied by the above algorithm is S(P) >= 3(n+1).
Time Complexity
 The time T(P) taken by a program P is the sum of the compile time and the run time
(execution time).
 The compile time does not depend on the instance characteristics. Also, we may assume that
a compiled program will be run several times without recompilation. This run time is denoted
by tp(instance characteristics).
 The number of steps assigned to any program statement depends on the kind of statement.
 For example,
o Comments → 0 steps.
o Assignment statements (which do not involve calls to other algorithms) → 1 step.
o Iterative statements such as for, while, and repeat-until → steps are counted only for
the control part of the statement.
 We can determine the number of steps needed by a program to solve a particular problem
instance in two ways.
o In the first method, we introduce a variable, count, with an initial value of 0 into the
program. Statements to increment count by the appropriate amount are introduced
into the program.
o This is done so that each time a statement in the original program is executed, count
is incremented by the step count of that statement.
Example 1:
Algorithm Sum(a, n)
{
    s := 0.0;
    count := count + 1; // for the assignment
    for i := 1 to n do
    {
        count := count + 1; // for the for statement
        s := s + a[i];
        count := count + 1; // for the assignment
    }
    count := count + 1; // for the last time of the for loop
    count := count + 1; // for the return
    return s;
}

 If the count is zero to start with, then it will be 2n+3 on termination. So each invocation of
sum executes a total of 2n+3 steps.
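The same instrumentation in runnable Python (a sketch; count is a local variable returned alongside the sum rather than a global):

def sum_with_count(a):
    """Return (sum of a, step count); the count ends at 2n + 3."""
    count = 0
    s = 0.0
    count += 1              # for the assignment s = 0.0
    for x in a:
        count += 1          # for the for statement
        s += x
        count += 1          # for the assignment
    count += 1              # for the last test of the for loop
    count += 1              # for the return
    return s, count

print(sum_with_count([1, 2, 3]))  # (6.0, 9), and 2*3 + 3 = 9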
Example 2:
Algorithm RSum(a,n)
{
count:=count+1;// For the if conditional
if(n<=0)then
{
count:=count+1; //For the return
return 0.0;
}
else
{
count:=count+1; //For the addition,function invocation and return
return RSum(a,n-1)+a[n];
}

}
Example 3:
Algorithm for Matrix Addition
Algorithm Add(a, b, c, m, n)
{
    for i := 1 to m do
    {
        count := count + 1; // for 'for i'
        for j := 1 to n do
        {
            count := count + 1; // for 'for j'
            c[i,j] := a[i,j] + b[i,j];
            count := count + 1; // for the assignment
        }
        count := count + 1; // for the last time of 'for j'
    }
    count := count + 1; // for the last time of 'for i'
}
If the count is zero to start with, then it will be 2mn + 2m + 1 on termination. So each
invocation of Add executes a total of 2mn + 2m + 1 steps.
2. The second method to determine the step count of an algorithm is to build a table in which we
list the total number of steps contributed by each statement.
 First determine the number of steps per execution (s/e) of the statement and the
total number of times (ie., frequency) each statement is executed.
 By combining these two quantities, the total contribution of all statements,
the step count for the entire algorithm is obtained.
Example 1: Step table for Algorithm Sum

Statement                      s/e    Frequency    Total steps
Algorithm Sum(a, n)             0         -             0
{                               0         -             0
    s := 0.0;                   1         1             1
    for i := 1 to n do          1        n+1           n+1
        s := s + a[i];          1         n             n
    return s;                   1         1             1
}                               0         -             0
Total                                                 2n + 3

Example 2: Tabulating Algorithm RSum the same way gives a step count of 2 when n <= 0, and
2 + t_RSum(n-1) when n > 0, which solves to 2n + 2.

Example 3: Tabulating Algorithm Add gives a total of 2mn + 2m + 1 steps, as before.

Comparison of Algorithms:
To compare algorithms, let us define a few objective measures:
Execution times: Not a good measure as execution times are specific to a particular computer.
Number of statements executed: Not a good measure, since the number of statements varies
with the programming language as well as the style of the individual programmer.
Ideal Solution: Let us assume that we expressed running time of given algorithm as a function
of the input size n (i.e., f(n)) and compare these different functions corresponding to running
times. This kind of comparison is independent of machine time, programming style, etc...
Rate of Growth:
 The rate at which the running time increases as a function of input is called the rate of
growth.
 As an example, if the running time of some algorithm is f(n) = n⁴ + 2n² + 100n + 500, we
approximate it by n⁴, since n⁴ has the highest rate of growth among the individual terms.

Table: Function values
Figure: Plot of function values
Type of Analysis:
 To analyse the given algorithm, we need to know on what inputs the algorithm is taking
less time (performing well) and on what inputs the algorithm is taking maximum time.
 We have already seen that an algorithm can be represented in the form of an expression.
That means we represent the algorithm with multiple expressions: one for the case where it
takes the least time and another for the case where it takes the most time.
 In general, the first case is called the best case and the second case is called the worst case
of the algorithm.

There are three types of analysis:


• Worst case
o Defines the input for which the algorithm takes maximum time.
o Input is the one for which the algorithm runs the slowest.
• Best case
o Defines the input for which the algorithm takes lowest time.
o Input is the one for which the algorithm runs the fastest.
• Average case
o Provides a prediction about the running time of the algorithm
o Assumes that the input is random

Lower Bound <= Average Time <= Upper Bound


Asymptotic Notation
 Asymptotic notations are used to analyse an algorithm.
 The asymptotic notation of an algorithm is a mathematical representation of its complexity.
 Let us represent the time complexity and the space complexity using the common
function f(n).
 Having the expressions for the best, average, and worst cases, for all three cases we
need to identify the upper and lower bounds.

Big-Oh Notation (O):
 Big-Oh notation is used to define the upper bound of an algorithm in terms of time
complexity.
 Big-Oh notation always indicates the maximum time required by an algorithm for all
input values.
 Big-Oh notation describes the worst case of an algorithm's time complexity.
 Definition: f(n) = O(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
 It is represented as O(T).

Examples:
1. f(n) = 3n + 2
Let us take g(n) = n, c = 4, n0 = 2.
Let us check the above condition:
3n + 2 ≤ 4n for all n ≥ 2
The condition is satisfied. Hence f(n) = O(n).
2. f(n) = 10n² + 4n + 2
Let us take g(n) = n², c = 11, n0 = 6.
Let us check the above condition:
10n² + 4n + 2 ≤ 11n² for all n ≥ 6
The condition is satisfied. Hence f(n) = O(n²).

Omega Notation (Ω):
 Omega notation is used to define the lower bound of an algorithm in terms of time
complexity.
 Omega notation always indicates the minimum time required by an algorithm for all
input values.
 Omega notation describes the best case of an algorithm's time complexity.
 Definition: f(n) = Ω(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≥ c·g(n) for all n ≥ n0.
 It is represented as Ω(T).
Examples:
1. f(n) = 3n + 2
Let us take g(n) = n, c = 3, n0 = 0.
Let us check the above condition:
3n + 2 ≥ 3n for all n ≥ 0
The condition is satisfied. Hence f(n) = Ω(n).
2. f(n) = 10n² + 4n + 2
Let us take g(n) = n², c = 10, n0 = 0.
Let us check the above condition:
10n² + 4n + 2 ≥ 10n² for all n ≥ 0
The condition is satisfied. Hence f(n) = Ω(n²).

Theta Notation (Θ):
 Theta notation is used to define a tight bound of an algorithm in terms of time
complexity: it bounds the running time from both above and below.
 Theta notation is commonly used to describe the average case of an algorithm's time
complexity.
 Definition: f(n) = Θ(g(n)) if and only if there exist positive constants c1, c2, and n0 such
that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
 It is represented as Θ(T).

Performance Measurement
Performance measurement is concerned with obtaining the space and time requirements of a
particular algorithm. These quantities depend on the compiler and options used, as well as on the
computer on which the algorithm is run.
Performance Measures evaluate:
 Quality of the solution: Time and Space requirements
 Simplicity of the solution
 Performance Improvement: Design or infrastructure.
Running Time Calculation:
To obtain the run time of a program, we need to plan the experiment. The following issues need
to be addressed during the planning stage:
1. What is the accuracy of the clock? How accurate do our results have to be?
2. For each instance size, a repetition factor needs to be determined. This is to be chosen such
that the event time is at least the minimum time that can be clocked with the desired accuracy.
3. Are we measuring worst-case or average performance? Suitable test data needs to be
generated.
4. What is the purpose of the experiment? Are the times being obtained for comparative
purposes, or are they to be used to predict run times?
5. In case the times are to be used to predict run times, then we need to fit a curve through the
points. For this, the asymptotic complexity should be known.
Divide and Conquer: The general method

 In the divide-and-conquer approach, given a function to compute on n inputs, the
strategy suggests splitting the inputs into k distinct subsets, 1 < k <= n, yielding
k sub-problems.
 These sub-problems must be solved, and then a method must be found to combine the
sub-solutions into a solution of the whole.
Generally, Divide-and-conquer algorithms have three parts:
1. Divide the problem into a number of sub-problems that are smaller instances of the same
problem.
2. Conquer the sub-problems by solving them recursively. If they are small enough, solve
the sub-problems as base cases.
3. Combine the solutions to the sub-problems into the solution for the original problem.

Algorithm DandC(P)
// P is a problem.
{
    if Small(P) then
        return S(P); // solution of P
    else
    {
        divide P into smaller problems P1, P2, ..., Pk, k >= 1;
        apply DandC to each of the sub-problems: DandC(P1), DandC(P2), ..., DandC(Pk);
        return Combine(S(P1), S(P2), ..., S(Pk));
    }
}
Binary search
Binary Search using Divide and Conquer:
 Binary Search is a fast search algorithm with run-time complexity Ο (log n).
 The binary search algorithm works only on a sorted list.
Procedure: Binary Search Algorithm follows a divide and conquer approach, in which the list is
divided into two halves (Mid-Point).
Binary search begins by comparing a search item with the middle element of the list.
Three Possibilities:
1) If the search item is equal to the middle element, then its position in the array is returned.
2) If the search item is less than the middle element, then the search continues in the lower
half of the array (i.e., Lower Bound to Mid-1).
3) If the search item is greater than the middle element, then the search continues in the
upper half of the array (i.e., Mid+1 to Upper Bound). By doing this, in each iteration, the
algorithm eliminates half of the array in which the search item cannot lie.

Algorithm Bin_Search (a, lb, ub, X)


// Given a sorted array a [lb: ub]
// X is searching item, if X is present return index else return 0.
{
if (lb==ub) then //if Small(P)
{
if(X=a[lb]) then
return lb;
else
return 0;
}
else
{
mid := (lb + ub) / 2; // integer division
if(X=a[mid]) then
return mid;
else if(X<a[mid]) then
return Bin_Search (a, lb, mid-1, X);
else
return Bin_Search (a, mid+1, ub, X);
}
}
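A runnable Python version of the same algorithm (a sketch; iterative rather than recursive, and returning -1 instead of 0 since Python indices start at 0):

def binary_search(a, x):
    """Return the index of x in sorted list a, or -1 if absent."""
    lb, ub = 0, len(a) - 1
    while lb <= ub:
        mid = (lb + ub) // 2
        if x == a[mid]:
            return mid
        elif x < a[mid]:
            ub = mid - 1      # search the lower half
        else:
            lb = mid + 1      # search the upper half
    return -1

print(binary_search([10, 20, 30, 40, 50], 40))  # 3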
Time Complexity:
The time complexity of binary search is given by the recurrence relation

T(n) = c                   if n = 1
     = T(n/2) + c          if n > 1

where c is a constant. Solving by back substitution,
T(n) = T(n/2) + c = T(n/4) + 2c = ... = T(1) + c·log₂ n, so the time complexity is O(log n).
Finding Maximum and Minimum

 Finding the maximum and minimum elements in an array is a fundamental task in algorithm
design and data processing.
 It involves traversing the array to compare and track the highest and lowest values. Various
approaches, like linear search, divide and conquer, or pairwise comparison, can be used
depending on the complexity and constraints.

Algorithm StraightMaxMin(a, n, max, min)
// Set max to the maximum and min to the minimum of a[1:n].
{
    max := min := a[1];
    for i := 2 to n do
    {
        if (a[i] > max) then max := a[i];
        if (a[i] < min) then min := a[i];
    }
}
The number of comparisons is 2(n-1).
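A runnable Python version of StraightMaxMin (a sketch):

def straight_max_min(a):
    """Return (max, min) of a non-empty list a using 2(n-1) comparisons."""
    mx = mn = a[0]
    for x in a[1:]:
        if x > mx:
            mx = x
        if x < mn:
            mn = x
    return mx, mn

print(straight_max_min([22, 13, -5, -8, 15, 60, 17, 31, 47]))  # (60, -8)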

Using the divide and conquer method


Procedure
 Let P = (n, a[i], ..., a[j]) denote an arbitrary instance, where n is the size of the list and the
elements are a[i], ..., a[j].
 Small(P) is true when n <= 2:
o If n = 1, the maximum and minimum are both a[i].
o If n = 2, the problem can be solved by making one comparison.
 If the list has more than 2 elements, the problem P is divided into two sub-problems
as follows:
P1 = (n/2, a[i], ..., a[n/2]) and
P2 = (n - n/2, a[n/2 + 1], ..., a[j])

Algorithm MaxMin( i, j, max, min)


{
if (i=j) then max:=min:=a[i]; //if Small(P)
else if(i=j -1) then // two elements
{
if(a[i]<a[j]) then
{
max:=a[j]; min:=a[i];
}
else
{
max:=a[i]; min:=a[j];
}
}
else
{
mid := ⌊(i+j)/2⌋;
// Solve the sub problems
MaxMin(i, mid, max, min);
MaxMin(mid+1, j, max1, min1);
// Combine the solutions
if(max<max1) then max:=max1;
if(min>min1) then min:=min1;
}
}
The procedure is initially invoked by the statement MaxMin(1, n, x, y).
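A runnable Python version of the divide-and-conquer MaxMin (a sketch; it returns the pair (max, min) instead of using output parameters):

def max_min(a, i, j):
    """Return (max, min) of a[i..j] (inclusive) by divide and conquer."""
    if i == j:                      # one element
        return a[i], a[i]
    if i == j - 1:                  # two elements: one comparison
        return (a[j], a[i]) if a[i] < a[j] else (a[i], a[j])
    mid = (i + j) // 2
    max1, min1 = max_min(a, i, mid)          # solve the sub-problems
    max2, min2 = max_min(a, mid + 1, j)
    return max(max1, max2), min(min1, min2)  # combine the solutions

a = [22, 13, -5, -8, 15, 60, 17, 31, 47]
print(max_min(a, 0, len(a) - 1))  # (60, -8)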

Explanation:
Suppose we simulate MaxMin on the following nine elements:
a: [1] [2] [3] [4] [5] [6] [7] [8] [9]
22 13 -5 -8 15 60 17 31 47
 Initially, the root node contains 1 and 9 as the values of i and j, corresponding to the initial
call to MaxMin.
 This execution produces two new calls to MaxMin, where i and j have the values 1, 5 and
6, 9 respectively, thus splitting the set into two subsets of approximately the same size.
 The time complexity is given by the recurrence relation

T(n) = 0                              if n = 1
     = 1                              if n = 2
     = T(⌈n/2⌉) + T(⌊n/2⌋) + 2        if n > 2

Solving using the back substitution method (assuming n is a power of 2):

T(n) = 2T(n/2) + 2
     = 2[2T(n/4) + 2] + 2 = 4T(n/4) + 4 + 2
     = 4[2T(n/8) + 2] + 4 + 2 = 8T(n/8) + 8 + 4 + 2
     = 2^i T(n/2^i) + 2^i + 2^(i-1) + ... + 2

Assume n/2^i = 2, i.e., n = 2^(i+1). Then

T(n) = 2^i T(2) + 2^i + 2^(i-1) + ... + 2
     = 2^i · 1 + (2^(i+1) - 2)        (sum of a geometric progression)
     = n/2 + n - 2
     = 3n/2 - 2
     = O(n)
Merge sort
Definition:
Merge sort is a sort algorithm that splits the items to be sorted into two groups, recursively sorts
each group, and merges them into a final sorted sequence.

Algorithm Steps:
1. Divide the array into two halves.
2. Recursively sort each half.
3. Merge the two sorted halves into a single sorted array.

Features: Merge sort


 It is a comparison-based algorithm
 It is a stable algorithm
 It is a perfect example of the divide-and-conquer algorithm design strategy
 It was invented by John von Neumann

Algorithm MergeSort (low, high)


//a[low:high] is a global array to be sorted
{
if(low<high) then
{
mid := ⌊(low + high)/2⌋; // divide the problem into sub-problems
//Solve the sub-problems
MergeSort(low,mid);
MergeSort(mid+1,high);
//Combine the sub solutions
Merge(low,mid,high);
}
}
Algorithm Merge(low, mid, high)
// a[low:high] is a global array containing two sorted
// subarrays a[low:mid] and a[mid+1:high].
{
    i := low; j := mid + 1; k := low;
    while ((i <= mid) and (j <= high)) do
    {
        if (a[i] <= a[j]) then
            temp[k++] := a[i++];
        else
            temp[k++] := a[j++];
    }
    while (i <= mid) do
        temp[k++] := a[i++];
    while (j <= high) do
        temp[k++] := a[j++];
    for h := low to high do
        a[h] := temp[h];
}
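A runnable Python version (a sketch; it sorts a slice of the list in place, mirroring the pseudocode):

def merge_sort(a, low, high):
    """Sort a[low..high] (inclusive) in non-decreasing order."""
    if low < high:
        mid = (low + high) // 2
        merge_sort(a, low, mid)        # solve the sub-problems
        merge_sort(a, mid + 1, high)
        merge(a, low, mid, high)       # combine the sub-solutions

def merge(a, low, mid, high):
    """Merge the sorted runs a[low..mid] and a[mid+1..high]."""
    temp = []
    i, j = low, mid + 1
    while i <= mid and j <= high:
        if a[i] <= a[j]:
            temp.append(a[i]); i += 1
        else:
            temp.append(a[j]); j += 1
    temp.extend(a[i:mid + 1])          # copy any leftovers
    temp.extend(a[j:high + 1])
    a[low:high + 1] = temp

data = [8, 5, 89, 30, 42, 92, 64]      # the example list above
merge_sort(data, 0, len(data) - 1)
print(data)  # [5, 8, 30, 42, 64, 89, 92]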
Explanation:

Example: Apply merge sort for the following list of elements: 8,5,89,30,42,92,64
Merging of two sorted arrays:

Merge Sort Analysis:


Time complexity:
Best Case: O(n log n)
Average Case: O(n log n)
Worst Case: O(n log n)
Space Complexity: O(n)

Advantages:
 The number of comparisons performed is nearly optimal.
 Merge sort never degrades to O(n²).
 It can be applied to files of any size.

Limitations:
Uses O(n) additional memory.

Quick Sort
 Quick sort is one of the most powerful sorting algorithms. It works on the Divide and
Conquer design principle.
 Quick sort works by choosing an element, called the pivot, in the given input array and
partitioning the array into three parts: the left subarray contains all elements that are less
than or equal to the pivot; the middle part contains the pivot itself;
 the right subarray contains all elements that are greater than the pivot. The two
subarrays, namely the left subarray and the right subarray, are then sorted recursively.
 The partitioning of the given input array is part of the Divide step. The recursive calls to
sort the subarrays are part of the Conquer step. Since the sorted sub-arrays are already in the
right place, there is no combine step for the Quick sort.

Explanation:
Step 1: Choose the pivot element (here the first element is the pivot).
Step 2: Scan i from the left for an element >= pivot and j from the right for an element <= pivot;
while i < j, swap a[i] and a[j]. (i, j are index values.)
Step 3: When i >= j, swap the pivot with a[j], placing the pivot at its final position.

55 45 50 60 65 70 80 75 85
50 45 55 60 65 70 80 75 85
45 50 55 60 65 70 75 80 85
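A runnable Python sketch of the scheme just described (the function names are illustrative; the textbook's Partition uses an ∞ sentinel at a[n+1], which is replaced here by an explicit bound check):

def partition(a, low, high):
    """Partition a[low..high] around the pivot a[low]; return the pivot's final index."""
    pivot = a[low]
    i, j = low, high + 1
    while True:
        i += 1
        while i <= high and a[i] < pivot:   # scan right for an element >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:                 # scan left for an element <= pivot
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]             # i < j: swap the out-of-place pair
    a[low], a[j] = a[j], a[low]             # i >= j: swap pivot into place
    return j

def quicksort(a, low, high):
    """Sort a[low..high] in place."""
    if low < high:
        j = partition(a, low, high)
        quicksort(a, low, j - 1)
        quicksort(a, j + 1, high)

data = [55, 45, 50, 60, 65, 70, 80, 75, 85]
quicksort(data, 0, len(data) - 1)
print(data)  # [45, 50, 55, 60, 65, 70, 75, 80, 85]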
Time complexity:
1. Best case: O(n log n)
2. Average case: O(n log n)
3. Worst case: O(n²)

Selection

 The Partition algorithm of Quick Sort can also be used to obtain an efficient solution for the
selection problem.

Problem statement:
 We are given n elements a[1 : n] and are required to determine the kth-smallest element

Procedure for solution:

 After applying the Partitioning algorithm, if the partitioning element v is positioned at a[j]
then j-1 elements are less than or equal to a[j] and n-j elements are greater than or equal to
a[j]
 If k < j, the kth-smallest element is in a[1:j-1].
 If k > j, the kth-smallest element is in a[j+1:n].
 If k = j, the kth-smallest element is a[j].

Algorithm: Finding the kth-smallest element
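The book's pseudocode is not reproduced here; the following Python sketch (helper names partition and select1 are illustrative) implements the procedure described above, reusing the first-element-pivot partition from the Quick sort section:

def partition(a, low, high):
    """Place the pivot a[low] at its final sorted position and return that index."""
    pivot = a[low]
    i, j = low, high + 1
    while True:
        i += 1
        while i <= high and a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[low], a[j] = a[j], a[low]
    return j

def select1(a, k):
    """Return the kth-smallest element of a (k = 1 for the minimum)."""
    low, high = 0, len(a) - 1
    while True:
        j = partition(a, low, high)
        if k - 1 == j:                 # case k = j: found it
            return a[j]
        elif k - 1 < j:                # case k < j: search a[low..j-1]
            high = j - 1
        else:                          # case k > j: search a[j+1..high]
            low = j + 1

a = [65, 70, 75, 80, 85, 60, 55, 50, 45]
print(select1(a, 5))  # 65, the fifth-smallest element, as in Case 1 below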


Explanation:

The array has the nine elements 65,70, 75,80,85,60,55,50,and 45, with a[10] = ∞.

Case 1: (k=j)
If k = 5, then the first call of Partition will be sufficient since 65 is placed into a[5].
Case 2: (k>j)
Assume that we are looking for the seventh-smallest element of a, that is, k = 7. The next
invocation of Partition is Partition (6,10) (second half after pivot element)

This last call of Partition has uncovered the ninth-smallest element of a. The next invocation is
Partition(6,9)

Case 3: Here k<j


 This time, the sixth element has been found. Since k ≠ j, another call to Partition is made,
Partition(7, 9)

 Now 80 is the partition value and is correctly placed at a[8]. However, Select1 has still
not found the seventh-smallest element.
 It needs one more call to Partition, which is Partition(7, 8). This performs only an
interchange between a[7] and a[8] and returns, having found the correct value.

Analysis:
In analysing Select, we make the same assumptions that were made for Quicksort:
1. The n elements are distinct.
2. The input distribution is such that the partition element can be the ith-smallest element of
a[m:p-1] with an equal probability for each i,1 ≤ i ≤ p-m.

 Partition requires O(p-m) time. On each successive call to Partition, either m increases by at
least one or j decreases by at least one.
 Initially, m = 1 and j = n + 1. Hence, at most n calls to Partition can be made.
Thus, the worst-case complexity of Select1 is O(n²).
 The time is Ω(n²), for example, when the input a[1:n] is such that the partitioning element
on the ith call to Partition is the ith-smallest element and k = n. In this case, m increases by
one following each call to Partition and j remains unchanged.
 Hence, n calls are made for a total cost of O(1 + 2 + ... + n) = O(n²). The average
computing time of Select1 is, however, only O(n).

Strassen’s Matrix Multiplication

 Let A and B be two n×n matrices. The product matrix C = AB is also an n×n matrix whose
(i, j)th element is formed by taking the elements in the ith row of A and the jth column of B
and multiplying them as:

C(i, j) = A(i, 1)·B(1, j) + A(i, 2)·B(2, j) + ... + A(i, n)·B(n, j)

for all i and j between 1 and n.


An example of the matrix multiplication is defined as:

 To compute C(i, j), n multiplications are needed. As the matrix C has n² elements, the
resulting matrix multiplication algorithm has time complexity O(n³).
 The divide-and-conquer approach suggests another way to compute the product of two
n×n matrices. For simplicity, we assume that n is a power of 2, that is, there exists a non-
negative integer k such that n = 2^k.
 In case n is not a power of two, enough rows and columns of zeros can be added to both
A and B so that the resulting dimensions are a power of two.
 When applying the divide-and-conquer principle to matrix multiplication, A and B are
each partitioned into four square submatrices, each of dimensions n/2 × n/2.
 Then the product AB can be computed by using the formula for the product of 2×2
matrices:

C11 = A11·B11 + A12·B21        C12 = A11·B12 + A12·B22
C21 = A21·B11 + A22·B21        C22 = A21·B12 + A22·B22

Example of matrix multiplication using the divide and conquer principle is as follows:

Time complexity for matrix multiplication using divide and conquer:

 To compute A·B using the above method, we need to perform eight multiplications of
n/2 × n/2 matrices and four additions of n/2 × n/2 matrices.
 Since two n/2 × n/2 matrices can be added in time cn² for some constant c, the overall
computing time T(n) of the resulting divide-and-conquer algorithm is given by the
recurrence relation:

T(n) = b                      if n <= 2
     = 8T(n/2) + cn²          if n > 2

where b and c are constants.
 This recurrence can be solved in the same way as previous recurrences to obtain
T(n) = O(n³).
 Hence, even after using the divide-and-conquer method, no improvement has been made
over the conventional method.

Strassen's Matrix Multiplication

 The mathematician Volker Strassen discovered a way to compute the Cij's of the divide-
and-conquer method using only 7 multiplications and 18 additions or subtractions.
 This method involves first computing the seven n/2 × n/2 matrices P, Q, R, S, T, U, and V:

P = (A11 + A22)(B11 + B22)
Q = (A21 + A22)B11
R = A11(B12 - B22)
S = A22(B21 - B11)
T = (A11 + A12)B22
U = (A21 - A11)(B11 + B12)
V = (A12 - A22)(B21 + B22)

 As can be seen, P, Q, R, S, T, U, and V can be computed using 7 matrix multiplications
and 10 matrix additions or subtractions. The Cij's are then computed using the following
formulas, which require an additional 8 additions or subtractions:

C11 = P + S - T + V
C12 = R + T
C21 = Q + S
C22 = P + R - Q + U

 Using the divide-and-conquer principle, Strassen's matrix multiplication can be
implemented recursively, as in the sketch below.
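A minimal Python sketch of Strassen's method (the function name is illustrative; it assumes NumPy for the block arithmetic, n a power of 2, and falls back to ordinary multiplication for the base case):

import numpy as np

def strassen(A, B):
    """Multiply n x n matrices A and B (n a power of 2) by Strassen's method."""
    n = A.shape[0]
    if n <= 2:
        return A @ B                       # small case: ordinary multiplication
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    P = strassen(A11 + A22, B11 + B22)     # the 7 recursive multiplications
    Q = strassen(A21 + A22, B11)
    R = strassen(A11, B12 - B22)
    S = strassen(A22, B21 - B11)
    T = strassen(A11 + A12, B22)
    U = strassen(A21 - A11, B11 + B12)
    V = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = P + S - T + V              # combine using Strassen's formulas
    C[:h, h:] = R + T
    C[h:, :h] = Q + S
    C[h:, h:] = P + R - Q + U
    return C

A = np.arange(16).reshape(4, 4)
B = np.eye(4, dtype=A.dtype)
print(np.array_equal(strassen(A, B), A @ B))  # True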
 The time complexity of Strassen's matrix multiplication, with 7 matrix multiplications and
18 additions or subtractions, is given by the recurrence relation:

T(n) = b                      if n <= 2
     = 7T(n/2) + an²          if n > 2

where a and b are constants. Solving this recurrence gives T(n) = O(n^(log₂ 7)) ≈ O(n^2.81).
 Although this looks like only a slight variation in time complexity, as n increases we can
see a big difference when compared to the general divide-and-conquer method.
