05 Divide and Conquer I
05 Divide and Conquer I
D IVIDE A ND C ONQUER I
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
Divide-and-conquer.
Divide up problem into several subproblems (of the same kind).
Solve (conquer) each subproblem recursively.
Combine solutions to subproblems into overall solution.
Most common usage.
Divide problem of size n into two subproblems of size n / 2. O(n) time
Solve (conquer) two subproblems recursively.
Combine two solutions into overall solution. O(n) time
Consequence.
Brute force: Θ(n2).
Divide-and-conquer: O(n log n).
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
SECTIONS 5.1–5.2
Sorting problem
4
Sorting applications
Obvious applications.
Organize an MP3 library.
Display Google PageRank results.
List RSS news items in reverse chronological order.
Some problems become easier once elements are sorted.
Identify statistical outliers.
Binary search in a database.
Remove duplicates in a mailing list.
Non-obvious applications.
Convex hull.
Closest pair of points.
Interval scheduling / interval partitioning.
Scheduling to minimize maximum lateness.
Minimum spanning trees (Kruskal’s algorithm).
... 5
Mergesort
input
A L G O R I T H M S
A G L O R I T H M S
A G L O R H I M S T
merge results
A G H I L M O R S T
6
Merging
3 7 10 ai 18 2 11 bj 20 23
2 3 7 10 11
7
Mergesort implementation
MERGE-SORT(L)
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
B ← MERGE-SORT(B). T (n / 2)
RETURN L.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
8
A useful recurrence relation
9
Divide-and-conquer recurrence: recursion tree
0 n=1 assuming n
T (n) = is a power of 2
<latexit sha1_base64="Z3k0jP42WP//6lnqi9e2pn3k2fw=">AAACo3icbVFNa9tAEF0pbZq4H3HSYy5DnYaUFkcyhaSElEAvPfSQFrsJWMKsViNnyWoldkfBrvAP7S0/pStHhdrpwC6PNzNvZt8mpZKWguC35288ebr5bGu78/zFy1c73d29n7aojMCRKFRhrhNuUUmNI5Kk8Lo0yPNE4VVy+6XJX92hsbLQQ5qXGOd8qmUmBSdHTbq/hkf6HURncN5cnSjBqdS1cIp20YnOAjiEiHBGNcgMDrQrCw9gAVE0DvonmMeuZgDRB3Ayx4NG6L2T0etdn/92dSLUaSs/6faCfrAMeAzCFvRYG5eTXW8vSgtR5ahJKG7tOAxKimtuSAqFbt/KYsnFLZ/i2EHNc7RxvTRpAW8dk0JWGHc0wZL9t6PmubXzPHGVOacbu55ryP/lxhVlp3EtdVkRavEwKKsUUAGN45BKg4LU3AEujHS7grjhhgty/7IyZaldolh5ST2rtBRFimusohkZ3rgYrnv2GIwG/U/98PvH3sVpa+cW22dv2BEL2Qm7YF/ZJRsxwe69TW/H6/qH/jf/hz98KPW9tuc1Wwk//gPFucbd</latexit>
2 T (n/2) + n n>1
T (n) n = n
T (n / 2) T (n / 2) 2 (n/2) = n
T (n / 4) T (n / 4) T (n / 4) T (n / 4) 4 (n/4) = n
log 2 n
T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) 8 (n/8) = n
⋮
⋮
T(n) = n log2 n
10
Proof by induction
Proposition. If T (n) satisfies the following recurrence, then T (n) = n log2 n.
0 n=1 assuming n
is a power of 2
T (n) =
<latexit sha1_base64="Z3k0jP42WP//6lnqi9e2pn3k2fw=">AAACo3icbVFNa9tAEF0pbZq4H3HSYy5DnYaUFkcyhaSElEAvPfSQFrsJWMKsViNnyWoldkfBrvAP7S0/pStHhdrpwC6PNzNvZt8mpZKWguC35288ebr5bGu78/zFy1c73d29n7aojMCRKFRhrhNuUUmNI5Kk8Lo0yPNE4VVy+6XJX92hsbLQQ5qXGOd8qmUmBSdHTbq/hkf6HURncN5cnSjBqdS1cIp20YnOAjiEiHBGNcgMDrQrCw9gAVE0DvonmMeuZgDRB3Ayx4NG6L2T0etdn/92dSLUaSs/6faCfrAMeAzCFvRYG5eTXW8vSgtR5ahJKG7tOAxKimtuSAqFbt/KYsnFLZ/i2EHNc7RxvTRpAW8dk0JWGHc0wZL9t6PmubXzPHGVOacbu55ryP/lxhVlp3EtdVkRavEwKKsUUAGN45BKg4LU3AEujHS7grjhhgty/7IyZaldolh5ST2rtBRFimusohkZ3rgYrnv2GIwG/U/98PvH3sVpa+cW22dv2BEL2Qm7YF/ZJRsxwe69TW/H6/qH/jf/hz98KPW9tuc1Wwk//gPFucbd</latexit>
2 T (n/2) + n n>1
Pf. [ by induction on n ]
Base case: when n = 1, T(1) = 0 = n log2 n.
Inductive hypothesis: assume T(n) = n log2 n.
Goal: show that T(2n) = 2n log2 (2n).
recurrence
T(2n) = 2 T(n) + 2n
= 2 n (log2 (2n) – 1) + 2n
= 2 n log2 (2n). ▪
11
Divide-and-conquer: quiz 1
A. T(n) = n ⎣log2 n⎦
B. T(n) = n ⎡log2 n⎤
12
Analysis of mergesort recurrence
Proposition. If T(n) satisfies the following recurrence, then T(n) ≤ n ⎡log2 n⎤.
0 n=1
T (n)
T ( n/2 ) + T ( n/2 ) + n n>1
<latexit sha1_base64="PlaAp4PYCkUIob8GuIy/Wl4iXIM=">AAACynicbVFdb9MwFHUyYKN8deORlys6UBGiJBOiQxNoEi+8IA2pZZPqqHKcm86aY0e2M7WK+sYf4Wfxb3CyTKId98VH51yf+5WWUlgXRX+CcOfe/Qe7ew97jx4/efqsv3/w0+rKcJxyLbW5SJlFKRROnXASL0qDrEglnqdXXxv9/BqNFVpN3KrEpGALJXLBmfPUvP97MlRvgJ4Aldg8PZriQqiae0+77tGTCF4Ddbh0NYgcDhV8hvgQ1kDpbDTGIvEpkyGVudTagHp/BNS0uDF96w0bkaOQndbAW0ltW3+5te5RVFnXw7w/iEZRG3AXxB0YkC7O5vvBAc00rwpUjktm7SyOSpfUzDjBJfqhKosl41dsgTMPFSvQJnW7yzW88kwGuR8m18pBy/77o2aFtasi9ZkFc5d2W2vI/2mzyuXHSS1UWTlU/KZQXklwGprDQCYMcidXHjBuhO8V+CUzjDt/vo0qrXeJfGOSelkpwXWGW6x0S2dYs8V4e2d3wfRo9GkU//gwOD3u1rlHXpCXZEhiMian5Bs5I1PCg93gXfAxGIffQxuuwvomNQy6P8/JRoS//gIlc9Vh</latexit>
no longer assuming n
is a power of 2
Pf. [ by strong induction on n ]
Base case: n = 1.
Define n1 = ⎣n / 2⎦ and n2 = ⎡n / 2⎤ and note that n = n1 + n2.
Induction step: assume true for 1, 2, ... , n – 1.
Comparable[] a = ...;
... can access elements only
Arrays.sort(a); via calls to compareTo()
14
Comparison tree (for 3 distinct keys a, b, and c)
a<b
height of pruned tree =
yes no worst-case number
of compares
code between compares
(e.g., sequence of exchanges)
b<c a<c
yes no yes no
yes no yes no
n! leaves ≤ 2h leaves
16
Sorting lower bound
Note. Lower bound can be extended to include randomized algorithms. 17
SHUFFLING A LINKED LIST
Problem. Given a singly linked list, rearrange its nodes uniformly at random.
Assumption. Access to a perfect random-number generator.
all n! permutations
equally likely
L
input
2♣ 3♣ 4♣ 5♣ 6♣ 7♣ null
L
shuffled
5♣ 6♣ 2♣ 7♣ 3♣ 4♣ null
18
5. D IVIDE AND C ONQUER
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
SECTION 5.3
Counting inversions
Brute force: check all Θ(n2) pairs.
20
Counting inversions: applications
Voting theory.
Collaborative filtering.
Measuring the “sortedness” of an array.
Sensitivity analysis of Google’s ranking function.
Rank aggregation for meta-searching on the Web.
Nonparametric statistics (e.g., Kendall’s tau distance).
ABSTRACT
ABSTRACT
1.1 Motivation
1. INTRODUCTION
21
1.1 Motivation
Counting inversions: divide-and-conquer
input
1 5 4 8 10 2 6 9 3 7
1 5 4 8 10 2 6 9 3 7
1 5 4 8 10 2 6 9 3 7
4-2 4-3 5-2 5-3 8-2 8-3 8-6 8-7 10-2 10-3 10-6 10-7 10-9
output 1 + 3 + 13 = 17 22
Counting inversions: how to combine two subproblems?
Warmup algorithm.
Sort A and B.
For each element b ∈ B,
- binary search in A to find how elements in A are greater than b.
list A list B
7 10 18 3 14 20 23 2 11 16
sort A sort B
3 7 10 14 18 2 11 16 20 23
3 7 10 14 18 2 11 16 20 23
5 2 1 0 0
23
Counting inversions: how to combine two subproblems?
3 7 10 ai 18 2 11 bj 20 23
5 2
2 3 7 10 11
24
Counting inversions: divide-and-conquer algorithm implementation
Input. List L.
Output. Number of inversions in L and L in sorted order.
SORT-AND-COUNT(L)
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
(rB, B) ← SORT-AND-COUNT(B). T (n / 2)
RETURN (rA + rB + rAB, L).
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
25
Counting inversions: divide-and-conquer algorithm analysis
(1) n=1
T (n) =
<latexit sha1_base64="3pXfe56Q7VvCgldQ8ZGmWT26i7U=">AAAC2HicbVHLattAFB0pfaTuy0npqpuhTotDwZFCIQmmJdAWukzBbgIaYUbjK3vIaCRmroqN8KKr0m1/pN/Tv+lIUaC2exfD4Zw799xHUihpMQj+eP7Onbv37u8+6Dx89PjJ0+7e/lebl0bAWOQqN1cJt6CkhjFKVHBVGOBZouAyuf5Q65ffwFiZ6xEuC4gzPtMylYKjoybd36O+PqRsSN/VT4clMJO6Eq6iXXXYkI3mgLwfHtLXlCEssKIypQfapYcHdEUZiwYnkMUuddRnKlV5bqg+OqbMNLgu/cYVrkUBUrVaDW+l1kJvWby/tegw0NO2p0m3FwyCJug2CFvQI21cTPa8fTbNRZmBRqG4tVEYFBhX3KAUCtyQpYWCi2s+g8hBzTOwcdVsdkVfOWZKUzdUmmukDfvvj4pn1i6zxGVmHOd2U6vJ/2lRielpXEldlAha3BilpaKY0/pMdCoNCFRLB7gw0vVKxZwbLtAdc82lqV2AWJukWpRainwKG6zCBRpebzHc3Nk2GB8Pzgbhl7e989N2nbvkBXlJ+iQkJ+ScfCYXZEyE99wbeh+9T37kf/d/+D9vUn2v/fOMrIX/6y+lidpE</latexit>
T ( n/2 ) + T ( n/2 ) + (n) n>1
26
5. D IVIDE AND C ONQUER
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
SECTION 7.1–7.3
3-WAY PARTITIONING
the array A
7 6 12 3 11 8 9 1 4 10 2
3 1 4 2 6 7 12 11 8 9 10
L M R
28
Randomized quicksort
the array A 7 6 12 3 11 8 9 1 4 10 2
partition A 3 1 4 2 6 7 12 11 8 9 10
sort L 1 2 3 4 6 7 12 11 8 9 10
sort R 1 2 3 4 6 7 8 9 10 11 12
RANDOMIZED-QUICKSORT(A)
_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
30
Analysis of randomized quicksort
first pivot
(chosen uniformly at random)
first pivot in
left subarray
a9
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
31
Analysis of randomized quicksort
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
32
Analysis of randomized quicksort
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
33
Divide-and-conquer: quiz 2
Given an array of n ≥ 8 distinct elements a1 < a2 < < an, what is the
probability that a7 and a8 are compared during randomized quicksort?
A. 0
B. 1/n
C. 2/n
if two elements are adjacent,
D. 1 one must be an ancestor of the other
a9
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
34
Divide-and-conquer: quiz 3
Given an array of n ≥ 2 distinct elements a1 < a2 < < an, what is the
probability that a1 and an are compared during randomized quicksort?
A. 0
B. 1/n
a1 and an are compared
C. 2/n iff first pivot is either a1 or an
D. 1
a9
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
35
Analysis of randomized quicksort
a3 a11
a1 a8 a10 a13
a2 a4 a12
a6
a5 a7
36
Analysis of randomized quicksort
2n
2n (ln
2n n+x dx
1) ▪
<latexit sha1_base64="GjZqOMHt99byYg9EpBf0uRtS8Xc=">AAADBHicjZFdb9MwFIad8DXKx7pxyc0RFWhotIojJDZVk4a44XJIlE2qS+W4TufNcYLtTKusXO/X7Apxy/9A/BmctIi2Q4gjRXn9nHNeJ+ckhRTGRtGPILx1+87dexv3Ww8ePnq82d7a/mTyUjM+YLnM9UlCDZdC8YEVVvKTQnOaJZIfJ+fv6vzxBddG5OqjnRV8lNGpEqlg1Ho0bv8kpszGThzg6rNTFZBXMCdncABi11PVsFRT5uLKnXVFF1fwAkif9H1J84oB/mUT16xbm/2xwt7KH0nrtxWRfC4gVku9S37/09eHHSIVKNgF/HLc7kS9qAm4KfBCdNAijsZbwTaZ5KzMuLJMUmOGOCrsyFFtBZO8apHS8IKyczrlQy8VzbgZuWYLFTz3ZAJprv2jLDR0ucPRzJhZlvjKjNpTs56r4d9yw9KmeyMnVFFartj8orSUYHOoVwoToTmzcuYFZVr4bwV2Sv20rF/8yi2Nd8HZyp+4y1IJlk/4GpX20mpa+Sni9ZndFIO4t9/DH153DvcW49xAT9EztIMweoMO0Xt0hAaIBW+DaVAEX8Kr8Dr8Gn6bl4bBoucJWonw+y/+pen7</latexit>
x=1
x=1 x
= 2n ln n
= 2n ln n
harmonic sum
Remark. Number of compares only decreases if equal elements. 37
Tony Hoare
n u m b e r ) . 9.9 X 10 45 is u s e d to r e p r e s e n t i n f i n i t y . I m a g i n a r y
v a l u e s of x m a y n o t be n e g a t i v e a n d reM v a l u e s of x m a y n o t be
conunent I a n d J are o u t p u t v a r i a b l e s , a n d A is t h e a r r a y ( w i t h
s u b s c r i p t b o u n d s M : N ) w h i c h is o p e r a t e d u p o n b y t h i s p r o c e d u r e .
s m a l l e r t h a n 1. P a r t i t i o n t a k e s t h e v a l u e X of a r a n d o m e l e m e n t of t h e a r r a y A,
V a l u e s of Qd~'(x) m a y be c a l c u l a t e d e a s i l y by h y p e r g e o m e t r i c a n d r e a r r a n g e s t h e v a l u e s of t h e e l e m e n t s of t h e a r r a y in s u c h a
series if x is n o t t o o s m a l l n o r (n - m ) t o o large. Q~m(x) c a n be w a y t h a t t h e r e e x i s t i n t e g e r s I a n d J w i t h t h e following p r o p e r t i e s :
c o m p u t e d f r o m a n a p p r o p r i a t e set of v a l u e s of Pnm(X) if X is n e a r M _-< J < I =< N p r o v i d e d M < N
procedure
[ but couldn’t explain his algorithm or implement it! ]
b o t h large a n d small;
Q L E G ( m , n m a x , x, ri, R, Q); v a l u e In, n m a x , x, ri;
chooses equiprobably a random integer F between M and N, and
also a p r o c e d u r e e x c h a n g e , w h i c h e x c h a n g e s t h e v a l u e s of its t w o
r e a l In, m n a x , x, ri; r e a l a r r a y R , Q; parameters ;
begin
Learned Algol 60 (and recursion).
r e a l t, i, n, q0, s;
n : = 20;
begin real X; integer F;
F : = r a n d o m ( M , N ) ; X : = A[F];
i f n m a x > 13 t h e n I:=M; J:=N;
Implemented quicksort.
n := nmax +7;
i f ri = 0 t h e n
up: for I : = I s t e p 1 u n t i l N d o
i f X < A [I] t h e n g o to d o w n ;
begin ifm = 0then I:=N;
Q[0] : = 0.5 X 10g((x + 1 ) / ( x - 1)) down: f o r J : = J s t e p --1 u n t i l M d o
else if A[J]<X then go to change;
begin t : = - - 1 . 0 / s q r t ( x X x - - 1); J:=M;
q0 : = 0; c h a n g e : i f I < J t h e n b e g i n e x c h a n g e (A[IL A[J]); Tony Hoare
Q[O] : = t ; I := I+ 1;J:= J - 1;
for i : = 1 step 1 until m do g o to u p 1980 Turing Award
begin s := (x+x)X(i-1)Xt end 2: b e g i n
4
×Q[0]+ (3i-i×i-2)×q0; else i f [ < F t h e n b e g i n e x c h a n g e (A[IL A[F]) i i f c >= 0 t h e n
q0 : = Q[0]; I:=I+l 3:begin
Q[0] : = s e n d e n d ; end e:= aXe;f := bXd;goto8
if x = 1 then else i f F < J t l l e n b e g i n e x c h a n g e (A[F], A[J]) ; end 3 ;
Q[0] : = 9.9 I" 45; J:=J-1 e:=bXc;
R [ n + 1] : = x - s q r t ( x X x - 1); end ; ifd ~ 0then
for i : = n s t e p --1 u n t i l 1 d o end partition 4: begin
R[i] : = (i + m ) / ( ( i + i + 1) X x f:=bXd; goto8
+(m-i- 1) X R [ i + l ] ) ; e n d 4;
ALGORITHM 61
g o to t h e e n d ; f:=aXd; goto8
if m = 0 then ALGORITHM
PROCEDURES 64F O R R A N G E ARITHMETIC 5: e n d 2;
b e g i n i f x < 0.5 t b e n QUICKSORT
ALLAN GIBB* ifb > 0 then
Q[0] : = a r c t a n ( x ) - 1.5707963 e l s e U n iA.
C. i t y HOARE
v e r sR. of A l b e r t a , C a l g a r y , A l b e r t a , C a n a d a 6: b e g i n
Q[0] : = - a r e t a n ( 1 / x ) e n d e l s e if d > 0 then
Elliott
b e g i n Brothers Ltd., Borehamwood, Hertfordshire, Eng.
begin t : = 1 / s q r t ( x X x + 1); begin
p r o c e d u r e R A N G E S U M (a, b, c, d, e, f);
q0 : = 0; procedure quicksort (A,M,N); value M,N; e : = M I N ( a X d, b X c);
real a , b , c , d , e , f ;
q[0] := t; a r r a y A; i n t e g e r M , N ; f : = M A X ( a X c , b X d); go t o 8
c o m m e n t The term "range n u m b e r " was used by P. S. Dwyer,
for i : = 2 step 1 until m do comment Q u i c k s o r t is a v e r y f a s t a n d c o n v e n i e n t m e t h o d of e n d 6;
b e g i n s : = (x + x) X (i -- 1) X t X Q[0I
Linear Computations (Wiley, 1951). Machine procedures for
s o r t i n g a n a r r a y in t h e r a n d o m - a c c e s s s t o r e of a c o m p u t e r . T h e e : = b X c; f : = a X c; go t o 8
range arithmetic were developed about 1958 by Ramon Moore,
+(3i+iX i -- 2) × q0; e n t i r e c o n t e n t s of t h e s t o r e m a y be s o r t e d , since no e x t r a s p a c e is e n d 5;
"Automatic Error Analysis in Digital C o m p u t a t i o n , " LMSD
qO : = Q[0]; r e q u i r e d . T h e a v e r a g e n u m b e r of c o m p a r i s o n s m a d e is 2 ( M - - N ) In f:=aXc;
Report 48421, 28 Jan. 1959, Lockheed Missiles and Space Divi-
Q[0] := s e n d e n d ; ( N - - M ) , a n d t h e a v e r a g e n m n b e r of e x c h a n g e s is one s i x t h t h i s i f d _-<O t h e n
sion, Palo Alto, California, 59 pp. If a _< x -< b and c ~ y ~ d,
R[n + 1] : = x - s q r t ( x × x + 1); a m o u n t . S u i t a b l e r e f i n e m e n t s of t h i s m e t h o d will be d e s i r a b l e for 7: b e g i n
then R A N G E S U M yields an interval [e, f] such t h a t e =< (x + y)
for i := n step - 1 until 1 do its i m p l e m e n t a t i o n on a n y a c t u a l c o m p u t e r ; e:=bXd; goto8
f. Because of machine operation (truncation or rounding) the
R[i] : = (i + m ) / ( ( i -- m + 1) × R[i + 1] begin i n t e g e r 1,J ; end 7 ;
machine sums a -4- c and b -4- d may not provide safe end-points
--(i+i+ 1) X x); if M < N then begin partition (A,M,N,I,J); e:=aXd;
of the output interval. Thus R A N G E S U M requires a non-local
for i : = 1 step 2 until nmax do quicksort (A,M,J) ; 8: e : = A D J U S T P R O D (e, - 1 ) ;
real procedure A D J U S T S U M which will compensate for the
Rill : = -- Rill; q u i c k s o r t (A, I, N ) f := A D J U S T P R O D (f, 1)
machine arithmetic. The body of A D J U S T S U M will be de-
the: for i : = 1 step 1 until nnmx do end end RANGEMPY;
pendent upon the type of machine for which it is written and so
Q[i] : = Q[i - 1] X R[i] end quieksort p r o c e d u r e R A N G E D V D (a, b, c, d, e, f) ;
is not given here. (An example, however, appears below.) I t
end QLEG; real a, b, c, d, e, f;
is assumed t h a t A D J U S T S U M has as parameters real v and w,
c o m m e n t If the range divisor includes zero the program
* T h i s p r o c e d u r e w a s d e v e l o p e d in p a r t u n d e r t h e s p o n s o r s h i p and integer i, and is accompanied by a non-local real procedure
exists to a non-local label " z e r o d v s r " . R A N G E D V D assumes a
of t h e A i r F o r c e C a m b r i d g e R e s e a r c h C e n t e r .
Communications of the ACM (July 1961)
C O R R E C T I O N 65
ALGORITHM which gives an upper bound to the magnitude
of the error involved in the machine representation of a number.
non-local real procedure A D J U S T Q U O T which is analogous
FINDThe output A D J U S T S U M provides the left end-point of the
(possibly identical) to A D J U S T P R O D ; 38
C.output A. R.intervalHOARE begin
of R A N G E S U M when A D J U S T S U M is called
NUTS AND BOLTS
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
SECTION 9.3
Median and selection problems
Selection. Given n elements from a totally ordered universe, find kth smallest.
Minimum: k = 1; maximum: k = n.
Median: k = ⎣(n + 1) / 2⎦.
O(n) compares for min or max.
O(n log n) compares by sorting.
O(n log k) compares with a binary heap.
max heap with k smallest
43
Randomized quickselect
QUICK-SELECT(A, k)
_____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
44
Randomized quickselect analysis
Intuition. Split candy bar uniformly ⇒ expected size of larger piece is ¾.
T(n) ≤ T(3 n / 4) + n ⇒ T(n) ≤ 4 n
not rigorous: can’t assume
E[T(i)] ≤ T(E[i])
Proposition. T(n) ≤ 4 n.
Pf. [ by strong induction on n ] can assume we always recur of
larger of two subarrays since T(n)
Assume true for 1, 2, …, n – 1. is monotone non-decreasing
T(n) satisfies the following recurrence:
Goal. Find pivot element p that divides list of n elements into two pieces so
that each piece is guaranteed to have ≤ 7/10 n elements.
Q. How to find approximate median in linear time?
A. Recursively compute median of sample of ≤ 2/10 n elements.
Θ(1) if n = 1
T(n) =
T (7/10 n) + T (2/10 n) + Θ(n) otherwise
two subproblems
of different sizes!
⇒ T(n) = Θ(n)
46
Choosing the pivot element
29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 47
Choosing the pivot element
medians
29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 48
Choosing the pivot element
median of
medians
29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 49
Median-of-medians selection algorithm
MOM-SELECT(A, k)
_____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
n ← | A |.
IF (n < 50)
RETURN kth smallest of element of A via mergesort.
Group A into ⎣n / 5⎦ groups of 5 elements each (ignore leftovers).
B ← median of each group of 5.
p ← MOM-SELECT(B, ⎣n / 10⎦) median of medians
(L, M, R) ← PARTITION-3-WAY(A, p).
IF (k ≤ | L |) RETURN MOM-SELECT(L, k).
ELSE IF (k > | L | + | M |) RETURN MOM-SELECT(R, k – | L | – | M |)
ELSE RETURN p.
_____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
50
Analysis of median-of-medians selection algorithm
medians
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 51
Analysis of median-of-medians selection algorithm
medians ≤ p
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 52
Analysis of median-of-medians selection algorithm
medians ≤ p
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 53
Analysis of median-of-medians selection algorithm
medians
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 54
Analysis of median-of-medians selection algorithm
medians ≥ p
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 55
Analysis of median-of-medians selection algorithm
medians ≥ p
median of
medians p 29 10 38 37 2 55 18 24 34 35 36
22 44 52 11 53 12 13 43 20 4 27
28 23 6 26 40 19 1 46 31 49 8
14 9 5 3 54 30 48 47 32 51 21
45 39 50 15 25 16 41 17 22 7
n = 54 56
Median-of-medians selection algorithm recurrence
58
Divide-and-conquer: quiz 4
A. Yes, obviously.
D. No.
C(19) = 165
C(20) = 134.2
59
Median-of-medians selection algorithm recurrence
= 44 n. ▪ 60
Divide-and-conquer: quiz 5
C. Both A and B.
D. Neither A nor B.
61
Linear-time selection retrospective
RONALD L . RIVEST, AND ROBERT E. TARJAN
The number of comparisons required to select the i-th smallest of n numbers is shown
to be at most a linear function of n by analysis of a new selection algorithm--PICK.
Specifically, no more than 5.4305 n comparisons are ever required. This bound is
improved for extreme values of i, and a new lower bound on the requisite number
1. INTRODUCTION
Theory. In this paper we present a new selection algorithm, PICK, and derive by an analysis
of its efficiency the (surprising) result that the cost of selection is at most a linear
1883, Lewis Carroll published an article [1] denouncing the unfair method by which
the second-best player is usually determined in a "knockout tournament" -- the loser
of the final match is often not the second-best! (Any of the players who lost only
Practice. Constants too large to be useful.
to the best player may be second-best.) Around 1930, Hugo Steinhaus brought the
62
problem into the realm of algorithmic complexity by asking for the minimum number
of matches required to (correctly) select both the first- and second-best players
5. D IVIDE AND C ONQUER
‣ mergesort
‣ counting inversions
‣ randomized quicksort
‣ median and selection
‣ closest pair of points
SECTION 5.4
Closest pair of points
Closest pair problem. Given n points in the plane, find a pair of points
with the smallest Euclidean distance between them.
64
Closest pair of points
Closest pair problem. Given n points in the plane, find a pair of points
with the smallest Euclidean distance between them.
Brute force. Check all pairs with Θ(n2) distance calculations.
65
Closest pair of points: first attempt
Sorting solution.
Sort by x-coordinate and consider nearby points.
Sort by y-coordinate and consider nearby points.
66
Closest pair of points: first attempt
Sorting solution.
Sort by x-coordinate and consider nearby points.
Sort by y-coordinate and consider nearby points.
67
Closest pair of points: second attempt
68
Closest pair of points: second attempt
69
Closest pair of points: divide-and-conquer algorithm
21
8
12
70
How to find closest pair with one point in each side?
Find closest pair with one point in each side, assuming that distance < δ.
Observation: suffices to consider only those points within δ of line L.
21
12 δ = min(12, 21)
71
δ
How to find closest pair with one point in each side?
Find closest pair with one point in each side, assuming that distance < δ.
Observation: suffices to consider only those points within δ of line L.
Sort points in 2 δ-strip by their y-coordinate.
Check distances of only those points within 7 positions in sorted list!
why?
5
4
21
12 δ = min(12, 21)
3
72
δ
How to find closest pair with one point in each side?
Def. Let si be the point in the 2 δ-strip, with the ith smallest y-coordinate.
Claim. If ⎜j – i⎟ > 7, then the distance between
si and sj is at least δ. L
Pf.
Consider the 2δ-by-δ rectangle R in strip
sj
½δ
At most 1 point per square.
si
At most 7 other points can be in R. ▪
2δ
73
Closest pair of points: divide-and-conquer algorithm
δ ← min { δ1 , δ2 }.
Delete all points further than δ from line L. O(n)
74
Divide-and-conquer: quiz 6
A. T(n) = Θ(n).
D. T(n) = Θ(n2).
75
Refined version of closest-pair algorithm
76
Divide-and-conquer: quiz 7
A. Θ(n).
77
Computational complexity of closest-pair problem
Theorem. [Ben-Or 1983, Yao 1989] In quadratic decision tree model, any
algorithm for closest pair (even in 1D) requires Ω(n log n) quadratic tests.
(x1 - x2)2 + (y1 - y2)2
Theorem. [Rabin 1976] There exists an algorithm to find the closest pair of
points in the plane whose expected running time is O(n).
Volume 8, number 1 IhFCRMATION PROCESSING LE,TTERS January 1979
A NOTE ON RABIN’S NEAREST-NEIGHBOR ALGORITHM * not subject to Ω(n log n) lower bound
because it uses the floor function
Steve FORTUNE and John HOPCROFT
Department of Computer Science, Cornell University,Ithaca, NY, U.S.A.
78
Digression: computational geometry
79
Convex hull
Equivalent definitions.
Smallest area convex polygon enclosing the points.
Intersection of all convex set containing all the points.
80
Farthest pair
Given n points in the plane, find a pair of points with the largest Euclidean
distance between them.
81
Delaunay triangulation
Applications.
Robotics.
VLSI design.
Data mining.
Medical imaging.
Computer vision.
Scientific computing.
airflow around an aircraft wing
Finite-element meshing.
Astronomical simulation.
Models of physical world.
Geographic information systems.
Computer graphics (movies, games, virtual reality).
http://www.ics.uci.edu/~eppstein/geom.html
84