DSA Lecture 3
Manoj Kumar
DTU, Delhi
Growth Rate
f(n) = O(n²), since f(n) ≤ 4n² for all n ≥ 1 (c = 4, n₀ = 1)
Show:
f(n) = 2n⁷ − 6n⁵ + 10n² − 5 = O(n⁷)
∑_{i=1}^{N} i ≤ N ⋅ N = O(N²)
∑_{i=1}^{N} i² ≤ N ⋅ N² = O(N³)
log N + N = O(N)
logᵏ N = O(N) for any constant k
Big-Omega
f(N) = Ω(g(N))
∃ c, n₀ > 0 such that f(N) ≥ c g(N) when N ≥ n₀
f(N) grows no slower than c g(N) for “large” N
Big-Omega: examples
Let f(N) = 2N². Then
f(N) = Ω(N)
f(N) = Ω(N²) (best answer)
Big Theta
f(N) = Θ(g(N))
∃ c₁, c₂, n₀ > 0 such that c₁ g(N) ≤ f(N) ≤ c₂ g(N) when N ≥ n₀
f(N) grows no slower than c₁ g(N) and no faster than c₂ g(N) for
“large” N
the growth rate of f(N) is the same as the growth rate of g(N)
Big Theta
f(N) = Θ(g(N)) iff
f(N) = O(g(N)) and f(N) = Ω(g(N))
The growth rate of f(N) equals the growth rate of g(N)
Example: Let f(N) = N², g(N) = 2N².
Since f(N) = O(g(N)) and f(N) = Ω(g(N)),
we have f(N) = Θ(g(N)).
Big-Theta means the bound is the tightest possible.
Some Rules
If T(N) is a polynomial of degree k, then
T(N) = Θ(Nᵏ). For example, 3N³ + 5N² + 7 = Θ(N³).
IF-ELSE statements
if (cond)      /* test: O(1) */
    S1         /* T1(n) */
else
    S2         /* T2(n) */
The running time is never more than the running time of the test plus
the larger of the running times of S1 and S2:
T(n) = O(max(T1(n), T2(n)))
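A minimal C illustration of this rule (a hypothetical function, not from the slides): the test costs O(1), one branch costs O(1), the other O(n), so the whole statement is O(max(O(1), O(n))) = O(n).

/* if-else cost: O(1) test plus the max of the two branch costs */
long if_else_demo(const long a[], int n, int flag)
{
    if (flag)                     /* test: O(1)        */
        return a[0];              /* S1:  T1(n) = O(1) */
    else {                        /* S2:  T2(n) = O(n) */
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }
}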
General Rules
Method calls
A calls B
B calls C
etc.
When the call sequence is flattened, this is just a sequence of operations, so
T(n) = O(max(TA(n), TB(n), TC(n)))
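For instance (a hypothetical sketch): A does O(1) work and calls B, and B's work is dominated by a call to C that scans n items, so the flattened cost is dominated by C and T(n) = O(n).

/* Flattened call sequence A -> B -> C; total cost is dominated by C. */
long C_work(const long a[], int n)      /* TC(n) = O(n) */
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}
long B_work(const long a[], int n)      /* TB(n) = O(n): calls C */
{
    return C_work(a, n);
}
long A_work(const long a[], int n)      /* TA(n) = O(n): O(1) work + call to B */
{
    return 1 + B_work(a, n);
}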
Complexity and Tractability
T(n) for input size n (assuming one operation per nanosecond):
n     | n      | n log n  | n²     | n³     | n⁴        | n¹⁰         | 2ⁿ
10    | .01µs  | .03µs    | .1µs   | 1µs    | 10µs      | 10s         | 1µs
20    | .02µs  | .09µs    | .4µs   | 8µs    | 160µs     | 2.84h       | 1ms
30    | .03µs  | .15µs    | .9µs   | 27µs   | 810µs     | 6.83d       | 1s
40    | .04µs  | .21µs    | 1.6µs  | 64µs   | 2.56ms    | 121d        | 18m
50    | .05µs  | .28µs    | 2.5µs  | 125µs  | 6.25ms    | 3.1y        | 13d
100   | .1µs   | .66µs    | 10µs   | 1ms    | 100ms     | 3171y       | 4×10¹³y
10³   | 1µs    | 9.96µs   | 1ms    | 1s     | 16.67m    | 3.17×10¹³y  | 32×10²⁸³y
10⁴   | 10µs   | 130µs    | 100ms  | 16.67m | 115.7d    | 3.17×10²³y  |
10⁵   | 100µs  | 1.66ms   | 10s    | 11.57d | 3171y     | 3.17×10³³y  |
10⁶   | 1ms    | 19.92ms  | 16.67m | 31.71y | 3.17×10⁷y | 3.17×10⁴³y  |
(Figure: growth-rate curves for log n, n, n log n, and n³ plotted against n.)
Analysis of Recursive Algorithms
Recursion
A function is defined recursively if it has the
following two parts
An anchor or base case
The function is defined for one or more specific values
of the parameter(s)
An inductive or recursive case
The function's value for current parameter(s) is defined
in terms of previously defined function values and/or
parameter(s)
Recursion: Example
Consider a recursive power function
double power(double x, unsigned n)
{
    if (n == 0)
        return 1.0;                   /* anchor (base case) */
    else
        return x * power(x, n - 1);   /* inductive (recursive) case */
}
Which is the anchor?
Which is the inductive or recursive part?
How does the anchor keep it from going forever?
Recurrence: T(n) = T(n−1) + O(1), which solves to T(n) = O(n)
Recursion Example: Towers of Hanoi
A problem especially well suited to solution by
recursion
Task
Move the disks from the left peg to the right peg
When a disk is moved, it must be placed on a peg
Only one disk (the top disk on a peg) may be moved at a time
A larger disk may never be placed on a smaller disk
Recursion Example: Towers of Hanoi
Identify base case:
If there is one disk move from A to C
Inductive solution for n > 1 disks
Move topmost n – 1 disks from A to B, using C for
temporary storage
Move final disk remaining on A to C
Move the n – 1 disks from B to C, using A for temporary
storage
The code for this solution follows.
Recursion Example: Towers of Hanoi
CODE
void TowerOfHanoi(int n, char peg1, char peg3, char peg2)
{   /* transfer n disks from peg1 to peg3 using peg2 */
    if (n == 1)
        printf("Move disk from %c to %c\n", peg1, peg3);
    else
    {
        TowerOfHanoi(n-1, peg1, peg2, peg3);   /* n-1 disks: peg1 -> peg2 */
        printf("Move disk from %c to %c\n", peg1, peg3);
        TowerOfHanoi(n-1, peg2, peg3, peg1);   /* n-1 disks: peg2 -> peg3 */
    }
}
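Each call makes two recursive calls on n − 1 disks plus one print, so the recurrence is T(n) = 2T(n−1) + O(1), which solves to Θ(2ⁿ) moves. A minimal driver (assuming stdio.h is included with the function above):

int main(void)
{
    TowerOfHanoi(3, 'A', 'C', 'B');   /* prints the 2³ − 1 = 7 moves */
    return 0;
}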
Solving Recurrences: ITERATION
ITERATION: Example 1
T(n) = c + T(n/2)
     = c + c + T(n/4)        [since T(n/2) = c + T(n/4)]
     = c + c + c + T(n/8)    [since T(n/4) = c + T(n/8)]
Assume n = 2ᵏ. Expanding k times:
T(n) = c + c + … + c + T(1)  (k terms of c)
     = c lg n + T(1)
     = Θ(lg n)
Solving Recurrences: ITERATION
• Example 2
T(n) = n + 2T(n/2)
     = n + 2(n/2 + 2T(n/4))
     = 2n + 4T(n/4)
     = 2n + 4(n/4 + 2T(n/8))
     = 3n + 8T(n/8)
     …
     = i·n + 2ⁱ T(n/2ⁱ)
Assume n = 2ᵏ (so k = lg n). At i = k:
T(n) = k·n + 2ᵏ T(1)
     = n lg n + n·T(1) = Θ(n lg n)
Substitution method
1. Guess the form of the solution.
2. Verify by induction.
3. Solve for constants.
• Apply the substitution method only when it is easy to guess the
form of the answer
Example: T(n) = 4T(n/2) + 100n
[Assume that T(1) = Θ(1).]
Guess O(n³). (Prove O and Ω separately.)
Assume that T(k) ≤ ck³ for k < n.
Prove T(n) ≤ cn³ by induction.
Example of substitution
T(n) = 4T(n/2) + 100n
     ≤ 4c(n/2)³ + 100n
     = (c/2)n³ + 100n
     = cn³ − ((c/2)n³ − 100n)      ⊳ desired − residual
     ≤ cn³, whenever the residual (c/2)n³ − 100n ≥ 0 (e.g., for c ≥ 200 and n ≥ 1).
Example (continued)
• We must also handle the initial conditions, that is,
ground the induction with base cases.
• Base: T(n) = Θ(1) for all n < n₀, where n₀ is a suitable
constant.
• For 1 ≤ n < n₀, we have “Θ(1)” ≤ cn³, if we pick c big
enough.
This n³ bound is not tight, however. Trying the tighter guess
T(n) = O(n²) with the hypothesis T(k) ≤ ck² gives
T(n) = 4T(n/2) + 100n ≤ cn² + 100n,
which is never ≤ cn², so the simple induction fails. Strengthening the
hypothesis to T(k) ≤ c₁k² − c₂k fixes this:
T(n) ≤ 4(c₁(n/2)² − c₂(n/2)) + 100n = c₁n² − c₂n − (c₂ − 100)n ≤ c₁n² − c₂n
for c₂ ≥ 100. Hence T(n) = O(n²).
Recursion-tree method
Example: T(n) = T(n/4) + T(n/2) + n²
Level 0: n²                                     → n²
Level 1: (n/4)² + (n/2)²                        → (5/16) n²
Level 2: (n/16)² + (n/8)² + (n/8)² + (n/4)²     → (25/256) n² = (5/16)² n²
…
Leaves: Θ(1)
Total = n² (1 + 5/16 + (5/16)² + (5/16)³ + …)   ⊳ geometric series
      = Θ(n²)
The master method
The master method applies to recurrences of the form
T(n) = a T(n/b) + f(n),
where a ≥ 1, b > 1, and f is asymptotically positive.
Master method: Case 1
Compare f(n) with n^(log_b a):
1. f(n) = O(n^(log_b a − ε)) for some constant ε > 0.
f(n) grows polynomially slower than n^(log_b a) (by an n^ε
factor).
Solution: T(n) = Θ(n^(log_b a)).
Master theorem: Case 2
Compare f(n) with n^(log_b a):
2. f(n) = Θ(n^(log_b a) lgᵏn) for some constant k ≥ 0.
• f(n) and n^(log_b a) grow at similar rates.
▫ Solution: T(n) = Θ(n^(log_b a) lgᵏ⁺¹n).
Master theorem: Case 3
Compare f(n) with n^(log_b a):
3. f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0.
• f(n) grows polynomially faster than n^(log_b a) (by an n^ε
factor), and f(n) satisfies the regularity condition that
a f(n/b) ≤ c f(n) for some constant c < 1.
Solution: T(n) = Θ(f(n)).
Examples
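Standard instances of the three cases:
• T(n) = 4T(n/2) + n: n^(log₂4) = n², and f(n) = n = O(n^(2−ε)). Case 1: T(n) = Θ(n²).
• T(n) = 2T(n/2) + n (merge sort): n^(log₂2) = n, and f(n) = Θ(n lg⁰n). Case 2 with k = 0: T(n) = Θ(n lg n).
• T(n) = 4T(n/2) + n³: f(n) = n³ = Ω(n^(2+ε)), and regularity holds since 4(n/2)³ = n³/2 ≤ (1/2) n³. Case 3: T(n) = Θ(n³).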
Amortized analysis
• An amortized analysis is any strategy for analyzing a
sequence of operations to show that the average cost
per operation is small, even though a single
operation within the sequence might be expensive.
INCREMENT(A)
1. i ← 0
2. while i < length[A] and A[i] = 1
3.     do A[i] ← 0    ⊳ reset a bit
4.        i ← i + 1
5. if i < length[A]
6.     then A[i] ← 1    ⊳ set a bit
Ctr A[4] A[3] A[2] A[1] A[0] Total cost
0 0 0 0 0 0 0
1 0 0 0 0 1 1
2 0 0 0 1 0 3
3 0 0 0 1 1 4
4 0 0 1 0 0 7
5 0 0 1 0 1 8
6 0 0 1 1 0 10
7 0 0 1 1 1 11
8 0 1 0 0 0 15
9 0 1 0 0 1 16
10 0 1 0 1 0 18
11 0 1 0 1 1 19
12 0 1 1 0 0 22
13 0 1 1 0 1 23
14 0 1 1 1 0 25
15 0 1 1 1 1 26
16 1 0 0 0 0 31
Worst-case analysis
• Consider a sequence of n increments. The worst-case
time to execute one increment is Θ(k). Therefore, the
worst-case time for n increments is n · Θ(k) = Θ(n·k).
• WRONG! In fact, the worst-case cost for n
increments is only Θ(n) ≪ Θ(n·k).
• Let’s see why.
Tighter analysis
Instead of charging every increment its worst case, count how often
each bit actually flips over n increments (counter table as above):
A[0] flipped every op:      n times
A[1] flipped every 2 ops:   ⌊n/2⌋ times
A[2] flipped every 4 ops:   ⌊n/2²⌋ times
A[3] flipped every 8 ops:   ⌊n/2³⌋ times
…
A[i] flipped every 2ⁱ ops:  ⌊n/2ⁱ⌋ times
Tighter analysis…
Cost of n increments = ∑_{i=0}^{⌊lg n⌋} ⌊n/2ⁱ⌋
                     < n ∑_{i=0}^{∞} 1/2ⁱ
                     = 2n
                     = Θ(n)
Thus, the average cost of each INCREMENT
operation is Θ(n)/n = Θ(1).
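A small C sketch (an illustrative, direct translation of the INCREMENT pseudocode) that counts the actual bit flips over n increments; the total stays below 2n, matching the Θ(1) amortized bound:

#include <stdio.h>

#define K 16                          /* bits in the counter */

/* INCREMENT from the pseudocode; returns this operation's actual cost
   (the number of bits flipped). */
int increment(int A[], int len)
{
    int i = 0, flips = 0;
    while (i < len && A[i] == 1) {
        A[i] = 0;  flips++;           /* reset a bit */
        i++;
    }
    if (i < len) {
        A[i] = 1;  flips++;           /* set a bit */
    }
    return flips;
}

int main(void)
{
    int A[K] = {0};
    int n = 1000;
    long total = 0;
    for (int op = 1; op <= n; op++)
        total += increment(A, K);
    printf("n = %d, total flips = %ld (< 2n)\n", n, total);
    return 0;
}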
Accounting method
• We assign differing charges to different operations,
with some operations charged more or less than they
actually cost.
• The amount we charge an operation is called its
amortized cost.
• When an operation’s amortized cost exceeds its actual
cost, the difference is called credit.
• Credit can be used later to pay for operations whose
amortized cost is less than their actual cost.
A Simple Example: Accounting method
For the binary counter, charge $2 each time a bit is set to 1 (line 6):
$1 pays for actually setting the bit, and $1 is stored on that bit as
credit. When the bit is later reset to 0 (line 3), the stored credit
pays for the reset, so resets carry no amortized charge.
• When incrementing,
▫ Amortized cost for line 3 = $0
▫ Amortized cost for line 6 = $2
• Amortized cost for INCREMENT(A) = $2
• Amortized cost for n INCREMENTs = $2n = O(n)
The potential method
• Represent prepaid work as “potential energy” or
“potential”, which can be released to pay for future
operations.
• Potential is associated with the data structure as a
whole, rather than with specific objects within the data
structure.
The potential method
• Start with an initial data structure D0.
• Operation i transforms Di–1 to Di.
• The actual cost of operation i is ci.
• Define a potential function Φ : {Di} → R,
such that Φ(D0 ) = 0 and Φ(Di ) ≥ 0 for all i.
• The amortized cost ĉi with respect to Φ is defined to
be ĉi = ci + Φ(Di) – Φ(Di–1).
i.e. Amortized cost = actual cost + increase in potential
due to operation.
Understanding potential
For the binary counter, choose Φ(Di) = bi, the number of 1s in the
counter after the i-th operation. If operation i resets ti bits, its
actual cost is ci ≤ ti + 1 (at most one bit is set), and the potential
change is Φ(Di) – Φ(Di–1) ≤ 1 – ti. Hence
ĉi = ci + Φ(Di) – Φ(Di–1)
   ≤ (ti + 1) + (1 − ti)
   = 2
Therefore, n INCREMENTs cost Θ(n) in the worst case.
Disjoint Sets
Manoj Kumar
DTU, Delhi
Disjoint Sets
A disjoint set data structure maintains a collection
S={S1, S2, …, Sk} of disjoint dynamic sets.
Each set is identified by a representative, which is
some member of the set.
Supports the following operations:
MAKE_SET(x): creates a new set whose only
member is (pointed to by) x.
UNION(x,y): unites the dynamic sets Sx and Sy that
contain x and y.
FIND_SET(x): returns a pointer to the representative
of the set containing x.
Linked-List Implementation
• Each set is stored as a linked list with head and tail pointers; each
node contains the element's value, a next-member pointer, and a
back-to-representative pointer.
(Figure: node structure, and the lists for the sets S1 = {c,h,e} and
S2 = {f,g}; every node's representative pointer points at the head
node of its list.)
(Figure: UNION of the two sets: S = S1 ∪ S2 = {f,g,c,h,e}, a single
list headed by f.)
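One possible C layout for this representation (names and fields are illustrative, not from the slides):

#include <stdlib.h>

struct node {
    char         value;     /* the element                     */
    struct node *next;      /* pointer to the next member      */
    struct node *rep;       /* back-to-representative pointer  */
};

struct set {
    struct node *head;      /* the representative node         */
    struct node *tail;      /* allows O(1) appends             */
    int          length;    /* list length, for weighted union */
};

/* MAKE_SET(x): O(1); a one-element list whose representative is x. */
struct set *make_set(struct node *x)
{
    struct set *s = malloc(sizeof *s);
    x->next = NULL;
    x->rep  = x;
    s->head = s->tail = x;
    s->length = 1;
    return s;
}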
Analysis
MAKE_SET(x) takes O(1) time: create a new linked
list whose only object is x.
Union
A simple implementation: UNION(x,y) appends x's list onto the end of
y's list and updates the back-to-representative pointer of every node
in x's list to point at the head of y's list.
Each UNION therefore takes time linear in the length of x's list.
Union: amortized cost
Consider a sequence of m operations, m = n + q, where q = n − 1.
Suppose we have objects x1, x2, …, xn.
We execute n MAKE-SET(xi) operations (O(1) each) followed by
q = n − 1 UNIONs:
UNION(x1, x2)     O(1)
UNION(x2, x3)     O(2)
…
UNION(xn−1, xn)   O(n − 1) = O(q)
The UNIONs cost 1 + 2 + … + q = Θ(q²).
So the total time spent is Θ(n + q²), which is Θ(m²), since n = Θ(m)
and q = Θ(m).
Thus, on average, each operation requires Θ(m²)/m = Θ(m) time; this
is the amortized time of one operation.
Weighted Union
• If we append the longer list onto the shorter one, we
must update the back-to-representative pointer of every
member of the longer list.
• Suppose each head node also stores the length of its
list. This is easily maintained.
• Weighted Union: always append the smaller list onto
the longer list, with ties broken arbitrarily.
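A sketch of weighted UNION, reusing the illustrative struct node / struct set types and make_set from the earlier sketch (so only the smaller set's pointers are rewritten):

/* Weighted UNION: append the shorter list onto the longer one. */
struct set *weighted_union(struct set *a, struct set *b)
{
    if (a->length < b->length) {      /* ensure 'a' is the longer list */
        struct set *t = a;  a = b;  b = t;
    }
    for (struct node *p = b->head; p != NULL; p = p->next)
        p->rep = a->head;             /* rewrite only the smaller set  */
    a->tail->next = b->head;          /* splice the lists together     */
    a->tail = b->tail;
    a->length += b->length;
    free(b);                          /* the old header is unused now  */
    return a;
}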
Weighted Union: analysis
Result: for a sequence of m MAKE-SET, UNION, and FIND-SET operations,
n of which are MAKE-SET operations, the running time is O(m + n lg n).
Why?
Count the number of updates to the back-to-representative pointer of
any object x in a universe of n elements. Each time x's pointer is
updated, x belonged to the smaller list, so the united set is at least
twice the size of the set that contained x. A set can double at most
lg n times before it contains all n elements, so each x's pointer is
updated at most lg n times. With n objects, all UNION operations
together take O(n lg n) time; the m operations themselves contribute
O(m).
A single UNION can still take Ω(n) time, if both sets have Ω(n)
members.
Disjoint-Set Implementation: Forests
• Rooted trees: each tree is a set, and the root is the
representative. Each node points to its parent; the root
points to itself.
(Figure: two trees, one per set, and the tree for their union; UNION
makes one root point to the other.)
Path Compression
Path compression is used in the FIND-SET(x) operation:
make each node on the path from x to the root point
directly to the root, thus reducing the tree height.
(Figure: the tree before and after FIND-SET(j); afterwards every node
on the path from j to the root points directly to the root.)
Algorithm for Disjoint-Set Forest
MAKE-SET(x)
1. p[x] ← x
2. rank[x] ← 0

UNION(x,y)
1. LINK(FIND-SET(x), FIND-SET(y))

LINK(x,y)
1. if rank[x] > rank[y]
2.    then p[y] ← x
3.    else p[x] ← y
4.         if rank[x] = rank[y]
5.            then rank[y] ← rank[y] + 1

FIND-SET(x)
1. if x ≠ p[x]
2.    then p[x] ← FIND-SET(p[x])
3. return p[x]
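A compact C version of these four routines, using array indices as elements (a standard encoding; the array bound and names are illustrative):

#define MAXN 100
int p[MAXN];                          /* p[x]: parent of x        */
int rank_[MAXN];                      /* rank_[x]: rank of node x */

void make_set(int x) { p[x] = x; rank_[x] = 0; }

int find_set(int x)                   /* FIND-SET with path compression */
{
    if (x != p[x])
        p[x] = find_set(p[x]);        /* point x directly at the root */
    return p[x];
}

void link_by_rank(int x, int y)       /* LINK: union by rank */
{
    if (rank_[x] > rank_[y])
        p[y] = x;
    else {
        p[x] = y;
        if (rank_[x] == rank_[y])
            rank_[y]++;
    }
}

void union_sets(int x, int y)         /* UNION */
{
    link_by_rank(find_set(x), find_set(y));
}

After union_sets(u, v), find_set(u) == find_set(v) holds for any u and v in the two merged sets.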
Running time
• Total m operations.
• n MAKE-SET,
• At most n-1 UNION, and
• f FIND-SET operations
• If we use only path compression, the worst-case running time is
▫ Θ(n + f · log_{1+f/n} n) if f ≥ n,
▫ Θ(n + f lg n) if f < n.
• If we use both union by rank and path compression, the
worst-case running time is O(m α(m,n)), where α(m,n) is the
inverse of Ackermann's function.
• Since α(m,n) ≤ 4 for all practical input sizes, the running time
is O(m), i.e., O(m)/m = O(1) amortized per operation.