[go: up one dir, main page]

0% found this document useful (0 votes)
137 views23 pages

CSCI2520 Tutorial 3 (Week 4) : Hashtable: Gjchen@cse - Cuhk.edu - HK

This document summarizes a tutorial on hashtables. It discusses key hashtable concepts like hash functions, collision handling techniques like open addressing and chaining, and examples of hash functions and their pros and cons. It also provides exercises with solutions on inserting elements into hash tables using techniques like linear probing, quadratic probing and double hashing to handle collisions.

Uploaded by

Leung Shing Hei
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
137 views23 pages

CSCI2520 Tutorial 3 (Week 4) : Hashtable: Gjchen@cse - Cuhk.edu - HK

This document summarizes a tutorial on hashtables. It discusses key hashtable concepts like hash functions, collision handling techniques like open addressing and chaining, and examples of hash functions and their pros and cons. It also provides exercises with solutions on inserting elements into hash tables using techniques like linear probing, quadratic probing and double hashing to handle collisions.

Uploaded by

Leung Shing Hei
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 23

CSCI2520 Tutorial 3 (Week 4):

Hashtable
CHEN Gengjie
gjchen@cse.cuhk.edu.hk
Feb 4, 2016

Outline

Hashable
Hash function
Open addressing
Chaining

CSCI2520 Tutorial 3

Hashtable
Hashtable is a data structure for implementing
dictionary operations: insert, search, and delete.
A hashtable is a symbol table that applies a hash
function on a given key to look up the
corresponding value in the table
Fundamental operations include:
Enter: insert a particular value for a specified key
Look up: retrieve the corresponding value for a
specified key
CSCI2520 Tutorial 3

Hash Function
The hash function transforms the key into the
index (hash values, hash codes) of an element
where the corresponding value is to be sought
Properties of good hash function:
correct (produce a number within the range)
reduce the number of collisions
evenly distributes the keys into buckets
quick to compute
CSCI2520 Tutorial 3

Exercise 1
Discuss the pros and cons of the hash functions
int hash1(char* key, int H_SIZE) {
int hash_val = 0;
for (int i = 0; i < strlen(key); i++)
hash_val += key[i];
return(hash_val % H_SIZE);
}
int hash2(char* key, int H_SIZE) {
return ((key[0] + 27 * key[1] +
729 * key[2]) % H_SIZE);
}
int hash3(char* key, int H_SIZE) {
int hash_val = 0;
for (int i = 0; i < strlen(key); i++)
hash_val = (hash_val << 5) + key[i];
return(hash_val % H_SIZE);
}
CSCI2520 Tutorial 3

Collision in Hashing
Collision: two keys may hash to the same slot
Handle Collision in Hashing
Open Addressing
Chaining

CSCI2520 Tutorial 3

Open Addressing
If a collision occurs, alternate cells are tried
until an empty cell is found
Let h0 = Hash(key, Nbuckets)
If h0 collides,
probe h1 = (h0 + F(1)) % Nbuckets.

If h1 collides,
probe h2 = (h0 + F(2)) % Nbuckets.

If h2 collides,
probe h3 = (h0 + F(3)) % Nbuckets.

CSCI2520 Tutorial 3

Linear Probing
Linear probing is a collision resolution method
using F(i)=i (sequential search).

Probe h0 = Hash(key, Nbuckets).


Probe h1 = (h0 + 1) % Nbuckets.
Probe h2 = (h0 + 2) % Nbuckets.
Probe h3 = (h0 + 3) % Nbuckets.

Until an empty bucket is found.


Insert the entry at the empty bucket.

CSCI2520 Tutorial 3

Exercise 2
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = h(i) % 10. A
list of entries, {89, 18, 49, 58, 69}, enter the
table. Using linear probing to handle collision,
what is the result of the hash table?

CSCI2520 Tutorial 3

Exercise 2 Solution
Bucket
0

Value
49

1
2
3

58
69

4
5
6
7
8
9

18
89
CSCI2520 Tutorial 3

Quadratic Probing
Quadratic probing is a collision resolution method
that eliminates the primary clustering problem of
linear probing. The popular choice is F(i)=i2.

Probe h0 = Hash(key, Nbuckets).


Probe h1 = (h0 + 1) % Nbuckets.
Probe h2 = (h0 + 4) % Nbuckets.
Probe h3 = (h0 + 9) % Nbuckets.
Probe h4 = (h0 + 16) % Nbuckets.

CSCI2520 Tutorial 3

Exercise 3
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = h(i) % 10. A
list of entries, {89, 18, 49, 58, 69}, enter the
table. Using quadratic probing to handle
collision, what is the result of the hash table?

CSCI2520 Tutorial 3

Exercise 3 Solution
Bucket
0

Value
49

1
2
3

58
69

4
5
6
7
8
9

18
89
CSCI2520 Tutorial 3

Double Hashing
Double hashing uses another hash function Hash2
to handle collision
F(i) = i Hash2(key, Nbuckets)

h0 = Hash(key, Nbuckets)
h1 = (h0 + 1 Hash2(key, Nbuckets)) % Nbuckets
h2 = (h0 + 2 Hash2(key, Nbuckets)) % Nbuckets
h3 = (h0 + 3 Hash2(key, Nbuckets)) % Nbuckets

Performance is better than linear and quadratic


probing in general, but it requires 2 hash
functions.
CSCI2520 Tutorial 3

Exercise 4
Given a hash table of size 10, indexed with 0,
1, , 9. The hash function is h(i) = i % 10. A list
of entries, {89, 18, 49, 58, 69}, enter the table.
Using double hashing with hash function h(i)
= 7 (i % 7), to handle collision, what is the
result of the hash table?
What would happen if 23 is added into the
result hash table?
CSCI2520 Tutorial 3

Exercise 4 Solution
Bucket
0

Value
69

1
2
3

58

4
5
6
7
8
9

49
18
89
CSCI2520 Tutorial 3

Chaining
In chaining, we put all elements that hash to
the same slot in a linked list
Slot j contains a header node of the list that
stores all elements that are hashed to j.
buckets

K
CSCI2520 Tutorial 3

Exercise 5
Insert 36, 18, 72, 43, 6, 10, 5, 15 one by one
table size: 8
hash function: key % table size
collision: chaining

CSCI2520 Tutorial 3

Exercise 5 Solution

CSCI2520 Tutorial 3

Exercise 6
How to calculate the intersection of two sets
of integers?
Input: two sets A1, A2
A1 = {4, 5, 6, 7, 1, 2, 300, 8, 9, 10}
A2 = {10, 11, 12, 8, 600}

Output: {6, 8, 10}

CSCI2520 Tutorial 3

Exercise 6 Solution
Method 1
sorting + binary search
comparison operations: (n1 log n1 + n2 log n2 + n1
log n2) or (n1 log n1 + n2 log n2 + n2 log n1)

Method 2
sorting + scanning two sets (similar to merge sort)
comparison operations: n1 log n1 + n2 log n2 + n1 +
n2
CSCI2520 Tutorial 3

Exercise 6 Solution
Method 3
build a hashtable H for A1
look up every element of A2 in H
expected comparison operations: n1 + n2 (assuming
uniform hashing)

What if integers in A1 and A2 are small? Any


better method?
A1 = {4, 5, 6, 7, 1, 2, 300, 8, 9, 10}
A2 = {10, 11, 12, 8, 600}

Possible method 4: direct-address table


CSCI2520 Tutorial 3

Summary
Hashtable
Open addressing
Chaining

CSCI2520 Tutorial 3

You might also like