0% found this document useful (0 votes)

120 views5 pages

L11 PDF

The document discusses techniques for dynamic hashing including cuckoo hashing and FKS hashing. Cuckoo hashing maintains a hash table with two hash functions and handles insertions by evicting collided items and finding empty slots for them in O(1) expected time. FKS hashing constructs a static hash table with no collisions in O(n) expected time using multiple levels of hashing with exponentially increasing table sizes. The document also discusses universal hashing families that provide probabilistic guarantees for hash collisions.

Uploaded by

bala_thegame

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

120 views5 pages

L11 PDF

Uploaded by

bala_thegame

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 5

Lecture 11

Hashing: static perfect hashing via FKS, dynamic cuckoo hashing

The Problem: Membership/Dictionary: maintain a set S of n items from a universe U under:

- query(x): x ∈ S? (+ information associated with x)
- insert(x) (dynamic)
- delete(x) (dynamic)

The Solution: A hash function h : U → [m] for some positive integer m < |U |.
– maintain a table T [1 . . . m] of linked lists (chains)
– insert(x): add x to T [h(x)].
– query(x): scan T [h(x)].
– ∀h there exist x 6= y s.t h(x) = h(y) ⇒ our goal is short chains.

Theorem 1. If m > n and h is selected uniformly from all hash functions then insert/delete/query
take O(1) expected time.

However, a random hash function requires |U | lg m bits to represent ⇒ infeasible.

Universal Hashing:

weak universal hashing is enough to obtain O(1) expected time per operation.

Definition 1. A set H of hash functions is a weak universal family if for all x, y ∈ U , x 6= y,

O(1)
Pr [h(x) = h(y)] = .
h←H m
d
- Sometimes called d-universal for probability= m .
- Why is weak universal enough?
n
Pick m so that m = O(1), and randomly pick h ∈ H. Let Iy = 1 iff h(x) = h(y).
 
X X X O(1)
E[chain length] = E  Iy  = E[Iy ] = 1 + Pr[h(x) = h(y)] ≤ 1 + n · = O(1)
m
y∈S y∈S y6=x

- Dictionary construction: randomly choose h ∈ H and hash all elements. If there’s a chain that is
“too long”, pick a new h and rehash (rebuild the table from scratch, we will use this idea a lot).

- An example weak universal family: Hp,m = {ha,b | a ∈ {1, 2, . . . , p}, b ∈ {0, 1, 2, . . . , p}}, for some
prime p > |U |, where ha,b (x) = ((ax + b) mod p) mod m. Proof in CLRS.

Definition 2. H is a strong universal family if for all distinct x, y ∈ U , and for all a, b ∈ [m],
O(1)
Pr [h(x) = a ∧ h(y) = b] = .
h←H m2
Definition 3. H is k-independent if for all k distinct items x1 , . . . , xk ∈ U , and for all a1 , . . . , ak ∈
[m],
O(1)
Pr [h(x1 ) = a1 ∧ h(x2 ) = a2 ∧ . . . ∧ h(xk ) = ak ] = .
h←H mk
- An example k-independent family: again, pick some prime p > |U |.

H = {h | h(x) = (c0 + c1 x + · · · + ck−1 xk−1 ) mod m, for some c0 , c1 , . . . , ck−1 ∈ [p]}.

Theorem 2 (Siegel, 1989). ∀ε > 0, ∃ a nΩ(1) -independent family of hash functions, each repre-
sented in nε space, and evaluated in O(1) time.

Theorem 3 (Pagh, Ostlin, 2003). ∃ a n-independent family of hash functions, each represented
in O(n) words, and evaluated in O(1) time.

Worst-case Guarantees in Static Hashing:

-Universal hashing gives good performance only in expectation ⇒ vulnerable to an adversary.

Theorem 4 (Gonnet, 1981).³Let H´be an n-independent family of hash functions. The expected
length of the longest chain is Θ lglglgnn .
³ ´
lg n
⇒ We can construct a static hash table with Θ lg lg n worst-case query time:

– pick a random h ∈ H, hash every x ∈ S (in O(n) time).

– if longest-chain ≤ 2·expected-length then stop.
– otherwise, pick a new h and start over.
1
Pr(bad hash function) ≤ 2 ⇒ O(1) trials, O(n) expected construction time.

- Mitzenmacher 1996 [3]: By using two hash functions (insert to the shorter list, search is in both
lists) we can get Θ(lg lg n) worst-case query time.

FKS - Static Hashing (Fredman, Komlós, Szemerédi [1])

- Construct static hash table with no collisions in expected O(n) time, O(n) worst-case space, and
O(1) worst-case query time.
- Requires only a weak universal family H
- Easy to implement.

First attempt: If m = Ω(n2 ) and we randomly pick h ∈ H then

X µ ¶
n c 1
E[number of collisions] = Pr[h(x) = h(y)] = · ≤
2 m 2
x,y∈S, x6=y

⇒ After expected O(1) trials, we get a collision-free hash function (total time is O(m) = O(n2 )).

Second attempt: If m = n, the same calculation yields

µ ¶
n c
E[number of collisions] = · = O(n)
2 n

⇒ After expected O(1) trials, we find a function h0 that produces O(n) collisions (total time is
O(n)).
FKS: Use h0 to hash into n buckets, then use hi ’s to hash a bucket of size ni to n2i locations.

1
U 2
h’ n22
1 n1 = 0 - 3
2 n2 = 2 h2 4 1
n3 =1 - 2
3
nm = 3 hm 4
5 nm2
6
7
8
9

Let ni = |{x ∈ S | h0 (x) = i}|.

P ¡ni ¢
(I) The number of collisions is i∈[m] 2 = O(n) because we choose h0 so. Thus,
 
X X µni ¶
n2i = O   = O(n).
2
i∈[m] i∈[m]

(II) We can hash ni elements into a table of size n2i without any collisions in expected O(n2i ) time.
⇒
- The construction takes O(n) + O(n21 ) + . . . + O(n2m ) = O(n) time in expectation
- Worst-case O(n) space.
- Worst-case O(1) query time (two hashes).

Cuckoo - Dynamic Hashing (Pagh and Rodler 2001 [5])

On the nesting habits of the Cuckoo bird...

- O(1) expected time for insert

- O(1) worst-case time for queries/deletes.
- Requires two O(lg n)-independent hash functions, h1 and h2 . (OPEN: same bound using only
O(1)-independent hash family)
- m > 2n (we will use m = 4n).
- Invariant: x is either at T [h1 (x)] or at T [h2 (x)] ⇒ query/delete takes worst-case two probes.

Insertion:

1. Compute h1 (x),
2. If T [h1 (x)] is empty, we put x there, and we are done.
Otherwise, if y ∈ T [h1 (x)], we evict y and put x in T [h1 (x)].
3. We find a new spot for y by looking at T [h1 (y)] or T [h2 (y)] (the one that is not occupy by x).
4. Repeat this process. After 6 lg n steps stop and rehash.
Let x1 , x2 , . . . , xt be the items that are evicted during the process.
- Cuckoo graph G = (V, E), where V = [m] and (h1 (x), h2 (x)) ∈ E for all x ∈ U . Insertion is one of
three possible walks on G:

x7
•o •O
x1 x1
x8 x6
¹ x2 x3 x4 x5 ¸ ²
• /• /• /• /• •^
x2
/• x3
/• /• x4 x5
/•
I_ u ^I_ u ^I_ u
x2 x3 x4

¼ x9 x10 x11 x12

• /• /• /• /•

x7
•o •O
x1
x8 x6
¸ x2 x3 x4 ² x5
•^ /• /• /• /•
I_ u ^I_ u ^I_ u
x2 x3 x4

¼ x9 x10
• /• /•
O
x13 x11
²
•o x12
•

Key observation: our functions are O(lg n)-independent so we can treat them as truly random
functions.

– No cycle: Pr[1st eviction] = Pr[T [h1 (x1 )] is occupied ] ≤

X 1 2n 1
(Pr[h1 (x) = h1 (x1 )] + Pr[h2 (x) = h1 (x1 )]) < 2n = = .
m 4n 2
x∈S,x6=x1

By same reasoning, Pr[2nd ≤ 2−2 , and Pr[tth eviction] ≤ 2−t ⇒ the expected running
P∞ eviction]
time of this case is ≤ t=1 t · 2−t = O(1).

1
Also, Pr[rehash] ≤ 2−6 lg n ≤ n2
(∗)

– One cycle: One of the path parts (solid, dashed

P or dotted) is at least t/3 long.
⇒ the expected running time of this case is ≤ ∞ t=1 t · 2−t/3 = O(1).
1
Also, Pr[rehash] ≤ 2−(6 lg n)/3 = n2
(∗)

– Two cycles: Counting argument. How many two-cycle configurations are there?
• The first item in the sequence is x1 .
• At most nt−1 choices of other items in the sequence.
• At most t choices for where the first loop occurs, t choices for where this loop returns, and
t choices for when the second loop occurs.
• We also have to pick t − 1 hash values to associate with the items.
⇒ At most t3 nt−1 (4n)t−1 configurations.

The probability that a specific configuration occurs is 2t (4n)−2t . Why?

⇒ The probability that some two-cycle configuration occurs is at most

t3 nt−1 (4n)t−1 2t t3
= .
(4n)2t 4n2 2t

⇒ The probability that a two-cycle occurs at all is at most

∞
X ∞ µ ¶
t3 1 X t3 1 1
= = · O(1) = O .(∗)
4n2 2t 4n2 2t 2n2 n2
t=2 t=2

By (∗)’s, Pr[insertion causes rehash] ≤ O(1/n2 ).

⇒ Pr[n insertions cause rehash] ≤ O(1/n).
⇒ Rehashing (n insertions) succeeds with prob. 1 − O(1/n), so after constant number of trials.
- A trial takes n · O(1) + O(lg n) = O(n) time in expectation.
⇒ Rehashing takes O(n) time in expectation.
⇒ The expected running time of an insertion is O(1) + O(1/n2 ) · O(n) = O(1) + O(1/n) = O(1).

References
1. M. Fredman, J. Komlós, E. Szemerédi, Storing a Sparse Table with O(1) Worst Case Access Time, Journal of the
ACM, 31(3):538-544, 1984.
2. G. Gonnet, Expected Length of the Longest Probe Sequence in Hash Code Searching, Journal of the ACM, 28(2):289-
304, 1981.
3. M. Mitzenmacher, The Power of Two Choices in Randomized Load Balancing, Ph.D. Thesis 1996.
4. A. Ostlin, R. Pagh, Uniform hashing in constant time and linear space, 35th STOC, p. 622-628, 2003.
5. R. Pagh, F. Rodler, Cuckoo Hashing, Journal of Algorithms, 51(2004), p. 122-144.
6. A. Siegel, On universal classes of fast hash functions, their time-space tradeoff, and their applications, 30th FOCS,
p. 20-25, Oct. 1989.

10 Dictionaries
No ratings yet
10 Dictionaries
11 pages
03 Hashing
No ratings yet
03 Hashing
21 pages
Cuckoo Hashing For Undergraduates: Rasmus Pagh IT University of Copenhagen March 27, 2006
No ratings yet
Cuckoo Hashing For Undergraduates: Rasmus Pagh IT University of Copenhagen March 27, 2006
6 pages
NITR 2 CuckooHashing
No ratings yet
NITR 2 CuckooHashing
16 pages
History-Independent Cuckoo Hashing
No ratings yet
History-Independent Cuckoo Hashing
20 pages
Hash (4) Cuckoo Hashing
No ratings yet
Hash (4) Cuckoo Hashing
15 pages
Cuckoo Hashing and Universal Hashing
No ratings yet
Cuckoo Hashing and Universal Hashing
31 pages
Lec 31 Handout
No ratings yet
Lec 31 Handout
18 pages
1 Hashing: 1.1 Desired Properties
No ratings yet
1 Hashing: 1.1 Desired Properties
8 pages
Red Black Trees2
No ratings yet
Red Black Trees2
4 pages
Lecture 6
No ratings yet
Lecture 6
57 pages
1 Overview: Lecture 2 - February 3, 2005
No ratings yet
1 Overview: Lecture 2 - February 3, 2005
6 pages
Hash Table 2010
No ratings yet
Hash Table 2010
43 pages
Efficient Hashing for Integers & Strings
No ratings yet
Efficient Hashing for Integers & Strings
17 pages
Brics: Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions
No ratings yet
Brics: Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions
14 pages
Lect1004 PDF
No ratings yet
Lect1004 PDF
7 pages
CS174: Note07
No ratings yet
CS174: Note07
5 pages
Problem Idea of Universal Hashing
No ratings yet
Problem Idea of Universal Hashing
14 pages
12 Hashing
No ratings yet
12 Hashing
9 pages
Lecture 23 Hash Code Map
No ratings yet
Lecture 23 Hash Code Map
41 pages
Hashing in Data Structure
No ratings yet
Hashing in Data Structure
25 pages
Implementing Dictionaries with Hashing
No ratings yet
Implementing Dictionaries with Hashing
25 pages
hw2 15211
No ratings yet
hw2 15211
8 pages
Chapter 8 - Hashing
No ratings yet
Chapter 8 - Hashing
78 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
78 pages
Universal Hashing Explained
No ratings yet
Universal Hashing Explained
4 pages
Lecture05 Hash Table
No ratings yet
Lecture05 Hash Table
65 pages
Hash Tables - 1: Comp 122, Spring 2004
No ratings yet
Hash Tables - 1: Comp 122, Spring 2004
24 pages
Dsa 4
No ratings yet
Dsa 4
55 pages
Dsa Merged
No ratings yet
Dsa Merged
339 pages
Hashing
No ratings yet
Hashing
13 pages
Ch-2: Abstract Data Structures
No ratings yet
Ch-2: Abstract Data Structures
8 pages
Hashing
No ratings yet
Hashing
5 pages
Chapter 8 - Hashing
No ratings yet
Chapter 8 - Hashing
78 pages
DS Cheatsheet
No ratings yet
DS Cheatsheet
2 pages
AA Exam 2021 Answers
No ratings yet
AA Exam 2021 Answers
6 pages
Hash Tables
No ratings yet
Hash Tables
35 pages
CS5800 Assignment 6
No ratings yet
CS5800 Assignment 6
10 pages
Data Structures Cheat Sheet
71% (14)
Data Structures Cheat Sheet
2 pages
Handout Lec10
No ratings yet
Handout Lec10
12 pages
6 - Hashing
No ratings yet
6 - Hashing
52 pages
Dictionaries: Collection of Pairs
No ratings yet
Dictionaries: Collection of Pairs
28 pages
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
No ratings yet
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
53 pages
Some Open Questions Related To Cuckoo Hashing
No ratings yet
Some Open Questions Related To Cuckoo Hashing
10 pages
Hashing With Chaining
No ratings yet
Hashing With Chaining
5 pages
Lecture 8 Hashing
No ratings yet
Lecture 8 Hashing
47 pages
AA Exam 2022 Answers
No ratings yet
AA Exam 2022 Answers
5 pages
Data Structures - Cheat Sheet
100% (1)
Data Structures - Cheat Sheet
2 pages
AA Resit 2020 Answers
No ratings yet
AA Resit 2020 Answers
5 pages
DS 8
No ratings yet
DS 8
30 pages
Authorrithm
No ratings yet
Authorrithm
86 pages
Lecture 13 - Hash Tables
No ratings yet
Lecture 13 - Hash Tables
51 pages
DSBD Unit-II 3
No ratings yet
DSBD Unit-II 3
28 pages
1 Hashing: 1.1 Maintaining A Dictionary
No ratings yet
1 Hashing: 1.1 Maintaining A Dictionary
17 pages
06 Hashing
No ratings yet
06 Hashing
6 pages
Thesis Rectangle Packing PDF
No ratings yet
Thesis Rectangle Packing PDF
168 pages
Algorithms For The Two Dimensional Bin Packing Problem With Partial Conflicts
No ratings yet
Algorithms For The Two Dimensional Bin Packing Problem With Partial Conflicts
22 pages
CS513 Spring 2020 Design and Analysis of Data Structures and Algorithms
No ratings yet
CS513 Spring 2020 Design and Analysis of Data Structures and Algorithms
2 pages
CS513 Spring 2020 Design and Analysis of Data Structures and Algorithms
No ratings yet
CS513 Spring 2020 Design and Analysis of Data Structures and Algorithms
2 pages
Introduction to Clos Networks
No ratings yet
Introduction to Clos Networks
1 page
SDN Overview: Concepts and Applications
No ratings yet
SDN Overview: Concepts and Applications
28 pages
Subordinator.: WRAC Handout #264 Independent and Dependent Clauses
No ratings yet
Subordinator.: WRAC Handout #264 Independent and Dependent Clauses
3 pages
Object Oriented Software Design II: C++ Intro
No ratings yet
Object Oriented Software Design II: C++ Intro
47 pages
PHD Cse Program Requirements
No ratings yet
PHD Cse Program Requirements
1 page
Bina Gupta - Cit Consciousness PDF
0% (1)
Bina Gupta - Cit Consciousness PDF
213 pages
Nenevadanu-WHO AM I (Ramana Maharshi)
100% (2)
Nenevadanu-WHO AM I (Ramana Maharshi)
28 pages
The Ultimate Guide To Destroy GRE Reading Comprehension by CrunchPrep
100% (1)
The Ultimate Guide To Destroy GRE Reading Comprehension by CrunchPrep
23 pages
Employee Tax Declaration Form
No ratings yet
Employee Tax Declaration Form
1 page
Manish Chaurasiya
No ratings yet
Manish Chaurasiya
2 pages
Readme EA 5.0 PDF
No ratings yet
Readme EA 5.0 PDF
28 pages
Summary of STRING METHODS
No ratings yet
Summary of STRING METHODS
1 page
The Basic Writings of Sigmund Freud Complete Book Download
100% (14)
The Basic Writings of Sigmund Freud Complete Book Download
16 pages
A Supplication Moulana Naeem Motala SB
No ratings yet
A Supplication Moulana Naeem Motala SB
9 pages
Mohr's Theorems 1011
No ratings yet
Mohr's Theorems 1011
92 pages
Technology Integration Lesson Plan
No ratings yet
Technology Integration Lesson Plan
2 pages
Satyanarayana Puja Stories
No ratings yet
Satyanarayana Puja Stories
33 pages
Suniti Kumar Chatterji: Linguist's Legacy
No ratings yet
Suniti Kumar Chatterji: Linguist's Legacy
5 pages
Grade 4. English
No ratings yet
Grade 4. English
12 pages
The Writing Process: Lecturer:Sahrish Saif
No ratings yet
The Writing Process: Lecturer:Sahrish Saif
9 pages
Nortel CS1000 Overview
No ratings yet
Nortel CS1000 Overview
45 pages
A Grain of Wheat
No ratings yet
A Grain of Wheat
4 pages
The Basic Daily Devotional Practice of The NUR-Fātimīya Ufī Order and The Fātimīya Alāt
100% (2)
The Basic Daily Devotional Practice of The NUR-Fātimīya Ufī Order and The Fātimīya Alāt
80 pages
Answer Key For AP Calculus BC: Practice Exam, Section I
No ratings yet
Answer Key For AP Calculus BC: Practice Exam, Section I
48 pages
The Q
No ratings yet
The Q
4 pages
Tuyển Chọn Đề Thi Học Sinh Giỏi Lớp 8 5. Đề Số 5
No ratings yet
Tuyển Chọn Đề Thi Học Sinh Giỏi Lớp 8 5. Đề Số 5
12 pages
Secure Document Handling Guide
No ratings yet
Secure Document Handling Guide
1,187 pages
The Things That God Cannot Do
No ratings yet
The Things That God Cannot Do
7 pages
10th Maths Centum Question Paper - 1 TM & EM
No ratings yet
10th Maths Centum Question Paper - 1 TM & EM
7 pages
Crab Molting Process Explained
No ratings yet
Crab Molting Process Explained
3 pages
HPR1XX Series DC/DC Converter Guide
No ratings yet
HPR1XX Series DC/DC Converter Guide
5 pages
Activity Codes
No ratings yet
Activity Codes
10 pages
ShareX Log: Config and Hotkey Setup
No ratings yet
ShareX Log: Config and Hotkey Setup
23 pages
Web Development Exam Guide
No ratings yet
Web Development Exam Guide
1 page
Grade 7 - English
100% (1)
Grade 7 - English
8 pages
LAB Set Questions Rdbms
No ratings yet
LAB Set Questions Rdbms
18 pages
Para sa Mga Batang Mag-aaral
No ratings yet
Para sa Mga Batang Mag-aaral
5 pages
0.190 in 32 Unjf 3a 3B - Asmeb1.2
No ratings yet
0.190 in 32 Unjf 3a 3B - Asmeb1.2
4 pages
Mccloskey Donald Now Deirdre The Rhetoric of Economics
No ratings yet
Mccloskey Donald Now Deirdre The Rhetoric of Economics
122 pages

L11 PDF

Uploaded by

L11 PDF

Uploaded by

Lecture 11

Hashing: static perfect hashing via FKS, dynamic cuckoo hashing

The Problem: Membership/Dictionary: maintain a set S of n items from a universe U under:

However, a random hash function requires |U | lg m bits to represent ⇒ infeasible.

Definition 1. A set H of hash functions is a weak universal family if for all x, y ∈ U , x 6= y,

H = {h | h(x) = (c0 + c1 x + · · · + ck−1 xk−1 ) mod m, for some c0 , c1 , . . . , ck−1 ∈ [p]}.

Worst-case Guarantees in Static Hashing:

-Universal hashing gives good performance only in expectation ⇒ vulnerable to an adversary.

– pick a random h ∈ H, hash every x ∈ S (in O(n) time).

FKS - Static Hashing (Fredman, Komlós, Szemerédi [1])

First attempt: If m = Ω(n2 ) and we randomly pick h ∈ H then

Second attempt: If m = n, the same calculation yields

Let ni = |{x ∈ S | h0 (x) = i}|.

Cuckoo - Dynamic Hashing (Pagh and Rodler 2001 [5])

On the nesting habits of the Cuckoo bird...

- O(1) expected time for insert

¼ x9 x10 x11 x12

– No cycle: Pr[1st eviction] = Pr[T [h1 (x1 )] is occupied ] ≤

– One cycle: One of the path parts (solid, dashed

The probability that a specific configuration occurs is 2t (4n)−2t . Why?

⇒ The probability that a two-cycle occurs at all is at most

By (∗)’s, Pr[insertion causes rehash] ≤ O(1/n2 ).

You might also like