CSCI2520 Tutorial 3 (Week 4) : Hashtable: Gjchen@cse - Cuhk.edu - HK
CSCI2520 Tutorial 3 (Week 4) : Hashtable: Gjchen@cse - Cuhk.edu - HK
Hashtable
CHEN Gengjie
gjchen@cse.cuhk.edu.hk
Feb 4, 2016
Outline
Hashable
Hash function
Open addressing
Chaining
CSCI2520 Tutorial 3
Hashtable
Hashtable is a data structure for implementing
dictionary operations: insert, search, and delete.
A hashtable is a symbol table that applies a hash
function on a given key to look up the
corresponding value in the table
Fundamental operations include:
Enter: insert a particular value for a specified key
Look up: retrieve the corresponding value for a
specified key
CSCI2520 Tutorial 3
Hash Function
The hash function transforms the key into the
index (hash values, hash codes) of an element
where the corresponding value is to be sought
Properties of good hash function:
correct (produce a number within the range)
reduce the number of collisions
evenly distributes the keys into buckets
quick to compute
CSCI2520 Tutorial 3
Exercise 1
Discuss the pros and cons of the hash functions
int hash1(char* key, int H_SIZE) {
int hash_val = 0;
for (int i = 0; i < strlen(key); i++)
hash_val += key[i];
return(hash_val % H_SIZE);
}
int hash2(char* key, int H_SIZE) {
return ((key[0] + 27 * key[1] +
729 * key[2]) % H_SIZE);
}
int hash3(char* key, int H_SIZE) {
int hash_val = 0;
for (int i = 0; i < strlen(key); i++)
hash_val = (hash_val << 5) + key[i];
return(hash_val % H_SIZE);
}
CSCI2520 Tutorial 3
Collision in Hashing
Collision: two keys may hash to the same slot
Handle Collision in Hashing
Open Addressing
Chaining
CSCI2520 Tutorial 3
Open Addressing
If a collision occurs, alternate cells are tried
until an empty cell is found
Let h0 = Hash(key, Nbuckets)
If h0 collides,
probe h1 = (h0 + F(1)) % Nbuckets.
If h1 collides,
probe h2 = (h0 + F(2)) % Nbuckets.
If h2 collides,
probe h3 = (h0 + F(3)) % Nbuckets.
CSCI2520 Tutorial 3
Linear Probing
Linear probing is a collision resolution method
using F(i)=i (sequential search).
CSCI2520 Tutorial 3
Exercise 2
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = h(i) % 10. A
list of entries, {89, 18, 49, 58, 69}, enter the
table. Using linear probing to handle collision,
what is the result of the hash table?
CSCI2520 Tutorial 3
Exercise 2 Solution
Bucket
0
Value
49
1
2
3
58
69
4
5
6
7
8
9
18
89
CSCI2520 Tutorial 3
Quadratic Probing
Quadratic probing is a collision resolution method
that eliminates the primary clustering problem of
linear probing. The popular choice is F(i)=i2.
CSCI2520 Tutorial 3
Exercise 3
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = h(i) % 10. A
list of entries, {89, 18, 49, 58, 69}, enter the
table. Using quadratic probing to handle
collision, what is the result of the hash table?
CSCI2520 Tutorial 3
Exercise 3 Solution
Bucket
0
Value
49
1
2
3
58
69
4
5
6
7
8
9
18
89
CSCI2520 Tutorial 3
Double Hashing
Double hashing uses another hash function Hash2
to handle collision
F(i) = i Hash2(key, Nbuckets)
h0 = Hash(key, Nbuckets)
h1 = (h0 + 1 Hash2(key, Nbuckets)) % Nbuckets
h2 = (h0 + 2 Hash2(key, Nbuckets)) % Nbuckets
h3 = (h0 + 3 Hash2(key, Nbuckets)) % Nbuckets
Exercise 4
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = i % 10. A list
of entries, {89, 18, 49, 58, 69}, enter the table.
Using double hashing with hash function h(i)
= 7 (i % 7), to handle collision, what is the
result of the hash table?
What would happen if 23 is added into the
result hash table?
CSCI2520 Tutorial 3
Exercise 4 Solution
Bucket
0
Value
69
1
2
3
58
4
5
6
7
8
9
49
18
89
CSCI2520 Tutorial 3
Chaining
In chaining, we put all elements that hash to
the same slot in a linked list
Slot j contains a header node of the list that
stores all elements that are hashed to j.
buckets
K
CSCI2520 Tutorial 3
Exercise 5
Insert 36, 18, 72, 43, 6, 10, 5, 15 one by one
table size: 8
hash function: key % table size
collision: chaining
CSCI2520 Tutorial 3
Exercise 5 Solution
CSCI2520 Tutorial 3
Exercise 6
How to calculate the intersection of two sets
of integers?
Input: two sets A1, A2
A1 = {4, 5, 6, 7, 1, 2, 300, 8, 9, 10}
A2 = {10, 11, 12, 8, 600}
CSCI2520 Tutorial 3
Exercise 6 Solution
Method 1
sorting + binary search
comparison operations: (n1 log n1 + n2 log n2 + n1
log n2) or (n1 log n1 + n2 log n2 + n2 log n1)
Method 2
sorting + scanning two sets (similar to merge sort)
comparison operations: n1 log n1 + n2 log n2 + n1 +
n2
CSCI2520 Tutorial 3
Exercise 6 Solution
Method 3
build a hashtable H for A1
look up every element of A2 in H
expected comparison operations: n1 + n2 (assuming
uniform hashing)
Summary
Hashtable
Open addressing
Chaining
CSCI2520 Tutorial 3