Data Structures
Fall 2023
3. Complexity Analysis
Comparing Algorithms
• Given two or more algorithms to solve the same problem, how do
we select the best one?
• Some criteria for selecting an algorithm
– Is it easy to implement, understand, modify?
– How long does it take to run it to completion?
– How much computer memory does it use?
• Software engineering is primarily concerned with the first criterion
• In this course we are interested in the second and third criteria
Data Structures 3 - Complexity Analysis 2
Comparing Algorithms
• Time complexity
– The amount of time that an algorithm needs to run to completion
– The better algorithm is the one that runs faster, i.e., has smaller time complexity
• Space complexity
– The amount of memory an algorithm needs to run
• In this lecture, we will focus on analysis of time complexity
How To Calculate Running Time
• Most algorithms transform input objects into output objects
[Figure: a sorting algorithm transforms an input object, e.g., 5 3 1 2, into an output object, 1 2 3 5]
• The running time of an algorithm typically grows with input size
– Idea: analyze running time as a function of input size
How To Calculate Running Time
• Most important factor affecting running time is usually the size of
the input
int find_max( int *array, int n ) {
int max = array[0];
for ( int i = 1; i < n; ++i ) {
if ( array[i] > max ) {
max = array[i];
}
}
return max;
}
• For a given size n, the running time is always the same regardless of the array's contents
– Every element in the array is checked exactly one time
How To Calculate Running Time
• Even on inputs of the same size, running time can be very different
int search(int arr[], int n, int x) {
int i;
for (i = 0; i < n; i++)
if (arr[i] == x)
return i;
return -1;
}
• Example: Search for 1 in the array [1, 2, 3, 4, 5, 6]
– Best case: Loop runs 1 time (the target is the first element)
How To Calculate Running Time
• Example: Search for 1 in the array [6, 5, 4, 3, 2, 1]
– Worst case: Loop runs n times (the target is the last element)
• Example: Search for 1 in the array [3, 2, 1, 4, 5, 6]
– Average case: Loop runs between 1 and n times
• Idea: Analyze running time for different cases
– Best case
– Worst case
– Average case
How To Calculate Running Time
• Best case running time is usually not very useful
• Average case running time is very useful but often hard to determine
• Worst case running time is easier to analyze
– Crucial for real-time applications such as games, finance and robotics
[Figure: best, average, and worst case running time vs. input size (1000–4000)]
Experimental Evaluations of Running Times
• Write a program implementing the algorithm
• Run the program with inputs of varying size
• Use clock methods to get an accurate measure of the actual running time
• Plot the results
[Figure: measured running time (ms) vs. input size (0–100)]
Limitations Of Experiments
Experimental evaluation of running time is very useful but
• It is necessary to implement the algorithm, which may be difficult
• Results may not be indicative of the running time on other inputs
not included in the experiment
• In order to compare two algorithms, the same hardware and
software environments must be used
Theoretical Analysis of Running Time
• Uses a pseudo-code description of the algorithm instead of an
implementation
• Characterizes running time as a function of the input size n
• Takes into account all possible inputs
• Allows us to evaluate the speed of an algorithm independent of the
hardware/software environment
Analyzing an Algorithm – Operations
• Each machine instruction is executed in a fixed number of cycles
– We may assume each operation requires a fixed number of cycles
• Idea: Use an abstract machine that measures time in steps instead of seconds
– Each elementary operation takes 1 step
• Examples of operations
– Retrieving/storing variables from memory
– Variable assignment =
– Integer operations + - * / % ++ --
– Logical operations && || !
– Bitwise operations & | ^ ~
– Relational operations == != < <= >= >
– Memory allocation and deallocation new delete
Analyzing an Algorithm – Blocks of Operations
• Each operation runs in a step of 1 time unit
• Therefore any fixed sequence of operations s1; s2; …; sk also runs in a constant number of time steps
– As long as the number of operations k is constant
// Swap variables a and b
int tmp = a;
a = b;
b = tmp;
Analyzing an Algorithm
// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int SumArray(int A[], int n) {
    int s = 0;                  // operation 1
    for (int i = 0; i < n; i++) // operations 2 (init), 3 (test), 4 (increment)
        s = s + A[i];           // operations 5, 6, 7 (index, add, assign)
    return s;                   // operation 8
}
• Operations 1, 2, and 8 are executed once
• Operations 4, 5, 6, and 7 are executed once per iteration of the for loop: n times each
• Operation 3 is executed n + 1 times
• The complexity function of the algorithm is T(n) = 5n + 4
Analyzing an Algorithm – Growth Rate
• Estimated running time for different values of n:
– n = 10 => 54 steps
– n = 100 => 504 steps
– n = 1,000 => 5004 steps
– n = 1,000,000 => 5,000,004 steps
• As n grows, the number of steps T(n) grows in linear proportion to n
Growth Rate
• Changing the hardware/software environment
– Affects T(n) by a constant factor, but
– Does not alter the growth rate of T(n)
• Thus we focus on the big picture, which is the growth rate of an algorithm
• The linear growth rate of the running time T(n) is an intrinsic property of algorithm SumArray
Constant Factors
• The growth rate is not affected by
– Constant factors or
– Lower-order terms
• Example:
– f(n) = n²
– g(n) = n² – 3n + 2
Growth Rate – Example
• Consider the two functions
– f(n) = n²
– g(n) = n² – 3n + 2
• Around n = 0, they look very different
Growth Rate – Example
• Yet on the range n = [0, 1000], f(n) and g(n) are (relatively)
indistinguishable
Growth Rate – Example
• The absolute difference is large, for example,
– f(1000) = 1 000 000
– g(1000) = 997 002
• But the relative difference is very small
(f(1000) – g(1000)) / f(1000) = 0.002998 ≈ 0.3%
– The difference goes to zero as n → ∞
Constant Factors
• The growth rate is not affected by
– Constant factors or
– Lower-order terms
• Example:
– f(n) = n²
– g(n) = n² – 3n + 2
For n = 1
% of running time due to n² = 1/(1+3+2) × 100 = 16.66%
% of running time due to 3n = 3/(1+3+2) × 100 = 50%
% of running time due to 2 = 2/(1+3+2) × 100 = 33.33%
Constant Factors
• How do we get rid of the constant factors to focus on the essential
part of the running time?
– Asymptotic Analysis
Upper Bound – Big-O Notation
• Indicates the upper or highest growth rate that the algorithm can
have
– Ignore constant factors and lower order terms
– Focus on main components of a function which affect its growth
• Examples
– 55 = O(1)
– 25c + 32k = O(1) // if c, k are constants
– 5n + 6 = O(n)
– n² – 3n + 2 = O(n²)
– 7n + 2n log(5n) = O(n log n)
Analyzing an Algorithm
• Simple Assignment
– a = b
– O(1) // Constant time complexity
• Simple loops
– for (i=0; i<n; i++) { s; }
– O(n) // Linear time complexity
• Nested loops
– for (i=0; i<n; i++)
for (j=0; j<n; j++) { s; }
– O(n²) // Quadratic time complexity
Analyzing an Algorithm
• Loop index doesn’t vary linearly
– h = 1;
while ( h <= n ) {
s;
h = 2 * h;
}
– h takes values 1, 2, 4, … until it exceeds n
– There are 1 + ⌊log₂ n⌋ iterations
– O(log₂ n) // Logarithmic time complexity
Analyzing an Algorithm
• Loop index depends on outer loop index
– for (j=0; j<=n; j++)
for (k=0; k<j; k++) { s; }
– Inner loop executed 0, 1, 2, 3, …, n times
1 + 2 + ⋯ + n = n(n + 1)/2
– O(n²)
Exercises in Time Complexity Analysis
[Exercise code snippets not reproduced; the answers were O(n + m), O(n²), O(n log₂ n), O(log₂ n), O(log_k n), and O(n)]
Big-O Notation: Mathematical Definition
• Most commonly used notation for specifying asymptotic
complexity—i.e., for estimating the rate of function growth—is the
big-O notation introduced in 1894 by Paul Bachmann
Definition
• Given two positive-valued functions f and g:
f(n) = O(g(n))
if there exist positive numbers c and N such that
f(n) ≤ c · g(n) for all n ≥ N
• f is big-O of g if there is a positive number c such that f is not larger than c · g for sufficiently large n; that is, for all n larger than some number N
Relationship b/w f and g
• The relationship between 𝒇 and 𝒈 can be expressed by stating
either that 𝒈(𝒏) is an upper bound on the value of 𝒇(𝒏) or that, in
the long run, 𝒇 grows at most as fast as 𝒈
[Figure: f(n) lies below c·g(n) for all n ≥ N]
c · g(n) is an approximation to f(n), bounding it from above
Calculating c and N
• There are usually infinitely many pairs of c and N that can be given for the same pair of functions f and g
f(n) = 2n² + 3n + 1 ≤ c · g(n) = c · n², i.e., f(n) = O(n²)
where g(n) = n²; candidate values for c and N are
[Table: different values of c and N for the function f(n) = 2n² + 3n + 1 = O(n²), calculated according to the definition of big-O]
Calculating c and N
• We obtain these values by solving the inequality
2n² + 3n + 1 ≤ c · n² for different n
• Because it is one inequality with two unknowns, different pairs of constants c and N for the same function g(n) = n² can be determined
Calculating c and N
• To choose the best c and N, determine for which N a certain term in f becomes the largest and stays the largest
• In f(n), the only candidates for the largest term are 2n² and 3n; these terms can be compared using the inequality 2n² > 3n, which holds for n > 1.5
• Thus, the chosen values are N = 2 and c ≥ 3¾
Practical Significance
• All of them are related to the same function g(n) = n² and to the same f(n)
• The point is that 𝒇 and 𝒈 grow at the same rate
• The definition states, however, that 𝒈 is almost always greater
than or equal to 𝒇 if it is multiplied by a constant 𝒄 where “almost
always” means for all 𝒏 not less than a constant 𝑵
• The crux of the matter is that the value of 𝒄 depends on which 𝑵 is
chosen, and vice versa
Practical Significance
• E.g., if 1 is chosen as the value of 𝑵—that is, if 𝒈 is multiplied by 𝒄
so that 𝒄 ∙ 𝒈(𝒏) will not be less than 𝒇 right away—then 𝒄 has to be
equal to 6 or greater
• Or if c · g(n) is greater than or equal to f(n) starting from n = 2, then it is enough that c is equal to 3¾
g(n) vs c
• N is always a point where the functions c·g(n) and f intersect each other
[Figure: the function g plotted with different coefficients c]
Best g(n)
• The inherent imprecision of the big-O notation goes even further,
because there can be infinitely many functions 𝒈 for a given
function 𝒇
• f(n) = 2n² + 3n + 1 is big-O not only of n², but also of n³, n⁴, …, nᵏ for any k ≥ 2
– To avoid this, the smallest function g is chosen, i.e., n² in this case
• Also, the approximation of f can go further
– i.e., f(n) = 2n² + 3n + 1 can be approximated as f(n) = 2n² + O(n)
– Or the function n² + 100n + log₁₀ n + 1000 can be approximated as n² + 100n + O(log₁₀ n)
Big-O Examples
• Prove that running time T(n) = n³ + 20n + 1 is O(n³)
Proof:
By the big-O definition, T(n) is O(n³) if
T(n) ≤ c · n³ for all n ≥ N
Check the condition: n³ + 20n + 1 ≤ c · n³
or equivalently 1 + 20/n² + 1/n³ ≤ c
Therefore, the big-O condition holds for n ≥ N = 1 and c ≥ 22 (= 1 + 20 + 1)
Larger values of N result in smaller factors c (e.g., for N = 10, c ≥ 1.201, and so on), but in any case the above statement is valid
Big-O Examples
• Prove that running time T(n) = n³ + 20n + 1 is not O(n²)
Proof:
By the big-O definition, T(n) is O(n²) if
T(n) ≤ c · n² for all n ≥ N
Check the condition: n³ + 20n + 1 ≤ c · n²
or equivalently n + 20/n + 1/n² ≤ c
Therefore, the big-O condition cannot hold, since the left side of the latter inequality grows without bound, i.e., there is no such constant factor c
Big-O Examples
• Prove that running time T(n) = n³ + 20n + 1 is not O(n²)
Conclusion:
The left side of the inequality, n + 20/n + 1/n², depends on the value of n. As n gets larger, the fractional terms 20/n and 1/n² become smaller and approach zero, but the leading term n grows without bound, so the left side as a whole increases beyond any fixed value.
So the conclusion is that no constant c can satisfy the inequality for all n, and therefore T(n) is not O(n²).
Relatives of Big-O
• Big-Omega
– f(n) is Ω(g(n))
– if there is a constant c > 0 and an integer constant n₀ ≥ 1
– such that f(n) ≥ c · g(n) for n ≥ n₀
• Big-Theta
– f(n) is Θ(g(n))
– if there are constants c₁ > 0 and c₂ > 0 and an integer constant n₀ ≥ 1
– such that c₁ · g(n) ≤ f(n) ≤ c₂ · g(n) for n ≥ n₀
Intuition for Asymptotic Notation
Big-O – Upper Bound
• f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
Big-Omega – Lower Bound
• f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
• Note: f(n) is Ω(g(n)) if and only if g(n) is O(f(n))
Big-Theta – Exact Bound
• f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
• Note: f(n) is Θ(g(n)) if and only if
– g(n) is O(f(n)) and
– f(n) is O(g(n))
Final Notes
• Even though in this course we focus on asymptotic growth using big-O notation, practitioners do occasionally care about constant factors
• Suppose we have 2 algorithms
– Algorithm A has running time 30000n
– Algorithm B has running time 3n²
• Asymptotically, algorithm A is better than B
• However, if the problem size you deal with is always less than 10000, then the quadratic one is faster
[Figure: running times of algorithms A and B vs. problem size; the curves cross at problem size 10000]
Any Questions So Far?