
Hash Tables

ESO207

Indian Institute of Technology, Kanpur


Introduction

• Consider a dynamic set that supports the dictionary operations INSERT, SEARCH and DELETE.
• The previously seen LIST (and even STACK and QUEUE) data structures can support these operations, but not efficiently.
• A hash table is an effective data structure for implementing dictionaries.
• Searching for an element in a hash table can take as long as in a linked list, Θ(n) in the worst case, but
• in practice hashing performs extremely well: under reasonable assumptions, the expected time to search for an element in a hash table is O(1).
Introduction (contd.)

• A hash table generalizes an array’s ability to be directly addressed by an index in O(1) time.
• Notation: Let U be the universe of possible key values.
• The dynamic set S at any time is a subset of U.
• For simplicity (although this is not fully general), assume that U = {0, 1, . . . , m − 1} is a set of integer keys.
• No two elements of the set have the same key.
Direct Addressing
Direct Address Tables

• Direct addressing works well when the universe is small.


• An array called the direct-address table is used and denoted by T[0, . . . , m − 1].
• Each position or slot of the array corresponds to a key in the universe U.
• If the dynamic set is S, then for each k ∈ S, T[k] (that is, slot k) points to the element of the set with key k.
• For each k ∈ U − S, T[k] is NIL.
Operations: Direct Address Tables

DIRECT-ADDRESS-SEARCH(T, k)

1. return T[k]

DIRECT-ADDRESS-INSERT(T, x)

1. T[x.key] = x

DIRECT-ADDRESS-DELETE(T, x)

1. T[x.key] = NIL
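The three operations above can be sketched in Python (an illustrative implementation, not from the slides; elements are represented as small dictionaries with a "key" field):

```python
# A minimal sketch of a direct-address table over keys 0..m-1.
# The class name and element representation are illustrative.

class DirectAddressTable:
    def __init__(self, m):
        self.T = [None] * m           # slot k holds the element with key k, or None (NIL)

    def search(self, k):
        return self.T[k]              # DIRECT-ADDRESS-SEARCH: return T[k]

    def insert(self, x):
        self.T[x["key"]] = x          # DIRECT-ADDRESS-INSERT: T[x.key] = x

    def delete(self, x):
        self.T[x["key"]] = None       # DIRECT-ADDRESS-DELETE: T[x.key] = NIL
```

Each operation is a single array access, which is where the worst-case O(1) bound comes from.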
Direct Address Tables

• For some applications, the direct-address table can hold the elements directly.
• That is, rather than storing the element’s key and satellite data in an object external to the table, with a pointer from the slot, we can store the object in the slot itself.
• For this, empty slots need to be indicated by a special key value.
Hash Tables

• Downsides of direct addressing:

1. If the universe U is large, storing a table T of size |U| may be impractical or even impossible.
2. Further, the set of keys K actually stored in the table may be very small, so most of the allocated space would be wasted.
3. Hash tables, in contrast, typically need space O(|K|), while still supporting search in expected O(1) time. [For direct addressing, this was worst-case O(1) time.]
Hash Tables
• Keep a table T consisting of m slots, numbered T[0, . . . , m − 1].
• A hash function h maps each key value of the universe U to some slot in T, that is,

h : U → {0, 1, . . . , m − 1}

• We say that an element with key k hashes to slot h(k).

• In general, m ≪ |U| (and that is where the benefit in the storage requirement of hash tables arises).
• Hence, two keys from U may hash to the same value, that is,

Collision: h(k1) = h(k2), k1, k2 ∈ U, k1 ≠ k2.

In such a case we say that the keys k1 and k2 collide under the hash function h.
Collisions

• Collisions cannot be avoided, since |U| > m.


• One of the goals of the design of hash functions is to
make h appear “random”.
• The very term “to hash” evokes images of random mixing and chopping.
• We will explain the notion of randomness of hash functions later. Note that a hash function h is deterministic: given the same input key k, it always produces the same value h(k).
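A tiny illustration of both points, using the division-method hash h(k) = k mod m introduced later in these slides (the value m = 7 and the sample keys are arbitrary):

```python
# Illustrative: a deterministic hash function, and a collision between
# two distinct keys (collisions are unavoidable whenever |U| > m).

m = 7

def h(k):
    return k % m              # deterministic: the same k always maps to the same slot

same = (h(12) == h(12))       # determinism: repeated evaluation agrees
collide = (h(12) == h(19))    # 12 and 19 are distinct keys that both hash to slot 5
```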
Collision resolution: by chaining

• The simplest way to resolve collisions is to keep a linked list of the elements that hash to each slot.
• Each slot j contains a pointer to the head of the list of all the stored elements that hash to slot j.
• If there are no such elements, slot j contains NIL.
Code for Insert, Search and Delete

CHAINED-HASH-INSERT(T, x)

1. Insert x at the head of the list T[h(x.key)]

CHAINED-HASH-SEARCH(T, k)

1. Search for an element with key k in list T[h(k)]

CHAINED-HASH-DELETE(T, x)

1. Delete x from the list T[h(x.key)]

• Worst-case time for insertion: O(1). This assumes the element being inserted is not already present.
• Otherwise, we can check this assumption by searching for an element whose key is x.key before inserting.
Search, Delete

CHAINED-HASH-INSERT(T, x)

1. Insert x at the head of the list T[h(x.key)]

CHAINED-HASH-SEARCH(T, k)

1. Search for an element with key k in list T[h(k)]

CHAINED-HASH-DELETE(T, x)

1. Delete x from the list T[h(x.key)]

• Worst-case time for Search: proportional to the length of the linked list.
• Worst-case time for Delete: O(1). Note that deletion takes a pointer x to the element to be deleted.
• Since the chains are doubly linked lists, deletion is fast.
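The chained operations above can be sketched as follows (an illustrative Python version; for brevity each chain is a Python list rather than a doubly linked list, so DELETE here scans the chain instead of running in O(1)):

```python
# A minimal chained hash table following the CHAINED-HASH-* operations.
# Chains are Python lists for brevity (a doubly linked list would give O(1) delete).

class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.T = [[] for _ in range(m)]    # m slots, each an initially empty chain

    def _h(self, key):
        return key % self.m                # division-method hash (illustrative choice)

    def insert(self, x):
        self.T[self._h(x["key"])].insert(0, x)   # insert at the head of the chain

    def search(self, key):
        for x in self.T[self._h(key)]:           # scan the chain at slot h(key)
            if x["key"] == key:
                return x
        return None                              # NIL

    def delete(self, x):
        self.T[self._h(x["key"])].remove(x)      # remove the given element from its chain
```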
Comment

Hash tables with chaining (separate chaining) are a very popular data structure in industry.
Analysis of Chaining: introduction

• Load factor: Given a hash table T with m slots that stores n elements, the load factor α for T is n/m, that is, the average number of elements stored in a chain.
• In practice, a well-designed hash table attempts to keep α close to 1.
• Worst-case performance. In the worst case, the input may consist of n keys, all of which hash to the same slot. This gives a chain of length n at that slot and chains of length 0 at all other slots, so the worst-case time for searching is Θ(n).
• Hash tables are not used for their worst-case performance.
Analysis: introduction

• Hash tables are immensely popular because of their average-case performance.
• The average-case performance of hashing depends on how well the hash function h distributes the set of keys to be stored among the m slots {0, 1, . . . , m − 1}, on average.
• Idealized assumption about hash functions for the analysis:
• Simple uniform hashing. Any given key is equally likely to hash into any of the m slots, independently of where any other key hashes to.
Analysis
• Corresponding to slot j ∈ {0, 1, . . . , m − 1}, denote the length of the chain at slot j by n_j. Then

n = n_0 + n_1 + . . . + n_{m−1}

• The average chain length is the sum of all chain lengths divided by the number of slots, that is,

E[n_j] = (n_0 + n_1 + . . . + n_{m−1})/m = n/m = α,

which is the load factor of the hash table.
• For the analysis, we will make an important assumption, namely that

the hash value h(k) is computed in time O(1).
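A quick sanity check of the identity E[n_j] = n/m: whatever the hash function does, the chain lengths n_j always sum to n, so their average over the m slots is exactly α (the slot count, key count, and key range below are arbitrary):

```python
# Simulate hashing n random keys into m slots and check that the
# average chain length equals the load factor alpha = n/m.

import random

m, n = 8, 100
counts = [0] * m
for _ in range(n):
    k = random.randrange(10**6)      # a random key from a large universe
    counts[k % m] += 1               # n_j = number of keys hashing to slot j

alpha = n / m
total = sum(counts)                  # n = n_0 + n_1 + ... + n_{m-1}, by construction
average = total / m                  # average chain length: always equals alpha
```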


Analysis

• Expected time required to search for a key k in a hash table:
• Step 1: compute h(k) in O(1) time. Then search through the list at slot number h(k).
• The length of this list is n_{h(k)}.
• Hence, an unsuccessful search goes over all the keys in this list, requiring time O(n_{h(k)}).
Expected time for searching

• By the simple uniform hashing assumption, h(k) = j with equal probability 1/m for each slot j.
• The average time for search is therefore

E[n_{h(k)}] = (1/m) · (n_0 + n_1 + . . . + n_{m−1}) = n/m = α

as calculated before.
• The average-case time complexity of search is O(1 + α), where the O(1) term is the time required to compute the hash value h(k).
Expectation Analysis: contd.

• One would expect a successful search to take slightly less time on average, since the entire chain need not be scanned; the search halts once the key is found.
• A slightly more detailed analysis shows that the average time for a successful search is 1 + α/2 − α/(2m) = Θ(1 + α).
• All this analysis shows that if α = n/m is O(1), then the search operation requires O(1) time on average.
• Using doubly linked lists to store the chains, deletion takes O(1) time.
• Insertion (without searching for duplicates) takes O(1) time. Hence, all the hash table operations can be performed in O(1) time on average if α = O(1).
Good Hash Functions

• A good hash function should come close to realizing the


idealized assumption made by simple uniform hashing.
• Many hash functions assume that the universe of keys is
the set of natural numbers N = {0, 1, 2, . . .}.
• Under this assumption, either the keys themselves are
natural numbers, or they are transformed into natural
numbers (e.g., strings of characters may be viewed as
long numbers over a larger alphabet).
Good and bad hash functions

• The simplest hash function is of the form

h(k) = k mod m

• This hash function is very efficient to compute, although it does not have good uniformity properties.
• Using m = 2^r for some value of r is often not recommended, since k mod 2^r is just the low-order r bits of k.
• Unless we know that all patterns of the low-order r bits are equally likely, we are better off designing a hash function whose value depends on all the bits of the key.
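The low-order-bits problem can be seen directly: with m = 2^r, keys that differ only in their high-order bits all collide (the bit patterns below are illustrative):

```python
# Why m = 2^r can be a poor choice: k mod 2^r keeps only the low-order
# r bits of k, so keys differing only in higher bits land in the same slot.

r = 4
m = 2 ** r
keys = [0b0000_0101, 0b0011_0101, 0b1111_0101]   # same low 4 bits, different high bits
slots = {k % m for k in keys}                    # all three map to slot 0b0101 = 5

# mod by a power of two is the same as masking off the low r bits
masked_equal = all(k % m == k & (m - 1) for k in keys)
```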
Example hash function: key modulo prime

• Choosing m to be a prime p that is not too close to a


power of 2 is often a good choice.
• For example, suppose there are n = 2000 keys to be
hashed. Let p = 701, which would give a load factor of
2000/701 ≈ 3 and h(k ) = k mod 701.
Multiplicative method for creating hash functions

• The multiplication method for creating hash functions works in two steps:

1. Choose a constant A in the range 0 < A < 1, multiply the key k by A, and take the fractional part of kA, that is, kA − ⌊kA⌋. This is also written as

kA mod 1 = kA − ⌊kA⌋

2. Now multiply this by the number of slots m and take the floor, that is,

h(k) = ⌊m (kA mod 1)⌋
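A direct sketch of h(k) = ⌊m(kA mod 1)⌋ in Python, using floating point for simplicity and the constant A = (√5 − 1)/2 that Knuth suggests later in these slides:

```python
# Multiplication method, floating-point version (illustrative; a real
# implementation would use the fixed-point word trick shown later).

import math

def mult_hash(k, m, A=(math.sqrt(5) - 1) / 2):
    frac = (k * A) % 1.0            # kA mod 1 = kA - floor(kA), the fractional part
    return math.floor(m * frac)     # scale the fraction into {0, ..., m-1}
```

Because only the fractional part of kA is scaled, the result always lies in {0, . . . , m − 1} for any choice of m; this is why the value of m is not critical for this method.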
Multiplication method

• Advantage: the value of m is not critical.
• An efficient way of implementing the multiplication method:

1. Choose m = 2^p, a power of 2.
2. Suppose the word size of the machine is w bits and a key value fits into a word.
3. Choose A to be of the form s/2^w, so that A · 2^w = s.
4. Since k and s are both w-bit integers, ks is a 2w-bit integer that we write as

ks = r1 · 2^w + r0

where r1 is the high-order w bits of ks and r0 is the low-order w bits.
Multiplication method

1. Then,

kA = ks/2^w = r1 + r0/2^w

2. Hence the fractional part of kA is r0/2^w (since 0 ≤ r0 ≤ 2^w − 1).
3. So, kA mod 1 = r0/2^w.
4. Hence,

h(k) = ⌊m (kA mod 1)⌋ = ⌊2^p · r0/2^w⌋

5. Since r0 is a w-bit number, multiplying r0/2^w by 2^p gives the top p bits of r0.
• Why? Write r0 as a w-bit binary number:

r0 = b_1 + b_2 · 2 + b_3 · 2^2 + . . . + b_w · 2^{w−1}

• Then,

r0 · 2^p / 2^w = (1/2^w) (b_1 · 2^p + b_2 · 2^{p+1} + . . . + b_w · 2^{w+p−1})
             = (1/2^{w−p}) (b_1 + b_2 · 2 + . . . + b_{w−p} · 2^{w−p−1})
               + (b_{w−p+1} + b_{w−p+2} · 2 + . . . + b_w · 2^{p−1})

• Thus,

⌊r0 · 2^p / 2^w⌋ = b_{w−p+1} + b_{w−p+2} · 2 + . . . + b_w · 2^{p−1}

which is the number corresponding to the top p bits of r0.
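The word-level recipe above can be sketched as follows (the concrete values of w and p are illustrative; s = ⌊A · 2^w⌋ for Knuth's constant A):

```python
# Fixed-point multiplication method for a w-bit word:
# h(k) = top p bits of the low word of k*s, i.e. ((k*s) mod 2^w) >> (w - p).

import math

w = 32                                            # machine word size (illustrative)
p = 14                                            # m = 2^p slots
s = math.floor(((math.sqrt(5) - 1) / 2) * 2**w)   # A = s / 2^w

def mult_hash_fixed(k):
    r0 = (k * s) & (2**w - 1)     # low-order w bits of the 2w-bit product k*s
    return r0 >> (w - p)          # top p bits of r0, as derived above
```

With these parameters s = 2654435769, and the hash of any w-bit key is computed with one multiplication and one shift, avoiding floating point entirely.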


Multiplication method

• Although this method works with any value of the constant A, experiments show that it works better with some values than others.
• Knuth suggests that

A ≈ (√5 − 1)/2 = 0.6180339887 . . .

is likely to work reasonably well.
