Data Structure & Algorithm
Topic: Hashing
by:
Dr. Shweta Saraswat
Definition of Hashing
• Hashing is a technique or process of mapping
keys, and values into the hash table by using a
hash function. It is done for faster access to
elements. The efficiency of mapping depends on
the efficiency of the hash function used.
example
Let a hash function H(x) maps the value x at the index x%10 in an Array. For
example if the list of values is [11,12,13,14,15] it will be stored at positions
{1,2,3,4,5} in the array or Hash table respectively.
Hash Function
• a hash function is a function that takes an input
(or "key") and maps it to a fixed-size value or
index, typically within an array or a data structure
called a hash table.
• The primary purpose of a hash function is to
quickly determine the location (index) where the
associated data should be stored or retrieved.
• Hash functions are crucial for efficient data
retrieval in data structures like hash tables.
Types of hash function
• Here are some common types of hash functions
used in data structures:
• Division Method:
– In this method, the key is divided by a prime number,
and the remainder is used as the index into the hash
table.
– The formula for this method is typically: hash(key) =
key % table_size. Or k mod n
– example, if the table size is 10 and the key is 23, the
hash value would be 3 (23 % 10 = 3).
Types of hash function
• Multiplication Method:
– In this method, the key is multiplied by a constant
value between 0 and 1, and the fractional part of the
result is used as the index.
– The formula for this method is usually: hash(key) =
floor(table_size * (key * A - floor(key * A))), where A is
a constant between 0 and 1.
Types of hash function
• Folding Method:
– This method involves dividing the key into smaller
parts, summing or concatenating those parts, and
then taking the modulus of the result to find the
index.
– For example, if the key is 12345, you might split it into
12 and 345, sum them (12 + 345 = 357), and take the
modulus of the sum to get the index.
10/8/2023
Types of hash function
• Mid-Square Method:
– In this method, the key is squared, and then a portion
of the squared value is used as the index.
– For example, if the key is 42, squaring it gives 1764,
and if you take the middle digits, such as 76, as the
index.
The hash function must satisfy
the following requirements:
• A good hash function is easy to compute.
• A good hash function never gets stuck in clustering and
distributes keys evenly across the hash table.
• A good hash function avoids collision when two
elements or items get assigned to the same hash value.
• One of the hashing techniques of using a hash function
is used for data integrity. If using a hash function one
change in a message will create a different hash.
Hash Table
• Hashing in data structure uses hash tables to
store the key-value pairs. The hash table then
uses the hash function to generate an index.
Hashing uses this unique index to perform insert,
update, and search operations.
• It can be defined as a bucket where the data are
stored in an array format. These data have their
own index value. If the index values are known
then the process of accessing the data is quicker.
How does Hashing in Data
Structure Works?
• In hashing, the hashing function maps strings or
numbers to a small integer value. Hash tables retrieve
the item from the list using a hashing function. The
objective of hashing technique is to distribute the data
evenly across an array. Hashing assigns all the elements a
unique key. The hash table uses this key to access the
data in the list.
• Hash table stores the data in a key-value pair. The key
acts as an input to the hashing function. Hashing
function then generates a unique index number for each
value stored. The index number keeps the value that
corresponds to that key. The hash function returns a
small integer value as an output. The output of the
hashing function is called the hash value.
Hashing Data Structure
example
• you need to store some items (arranged in a
key-value pair) inside a hash table with 30 cells.
• The values are: (3,21) (1,72) (40,36) (5,30) (11,44)
(15,33) (18,12) (16,80) (38,99)
The hash table will look like
the following:
explanation
• The process of taking any size of data and then
converting that into smaller data value which can
be named as hash value. This hash alue can be
used in an index accessible in hash table. This
process define hashing in data structure.
Collision Resolution
Techniques
Collision
• When the two different values have the same
value, then the problem occurs between the two
values, known as a collision. In the above
example, the value is stored at index 6. If the key
value is 26, then the index would be:
• h(26) = 26%10 = 6
Named as
• Therefore, two values are stored at the same
index, i.e., 6, and this leads to the collision
problem. To resolve these collisions, we have
some techniques known as collision techniques.
• The following are the collision techniques:
• Chaining/Open Hashing: It is also known as
closed addressing.
• Closed Hashing: It is also known as open
addressing.
Collision Resolution
Techniques
• Hashing in data structure falls into a collision if
two keys are assigned the same index number in
the hash table. The collision creates a problem
because each index in a hash table is supposed
to store only one value. Hashing in data
structure uses several collision resolution
techniques to manage the performance of a
hash table.
Chaining/Open Hashing
• In this Technique , each hash table slot contains
a linked list of all the values that have the same
value.
• This is simple and easy to implement but it can
lead to poor performance when the linked list
become too long.
Open hashing
Suppose we have a list of key
values
Suppose we have a list of key
values
So finally
Closed Hashing
• In this technique when a collision occur the
algorithm searches for an empty slot in the hash
table by probing successive slots until an empty
slot is found.
• This technique can be more efficient then
chaining when the load factor is low.
Closed Hashing
• In Closed hashing, three techniques are used to
resolve the collision:
• Linear probing
• Quadratic probing
• Double Hashing technique
Linear Probing
• Hashing in data structure results in an array
index that is already occupied to store a value. In
such a case, hashing performs a search
operation and probes linearly for the next empty
cell.
• Linear probing in hash techniques is known to be
the easiest way to resolve any collisions in hash
tables. A sequential search can be performed to
find any collision that occurred.
Linear Probing Example
• Imagine you have been asked to store some
items inside a hash table of size 30. The items
are already sorted in a key-value pair format. The
values given are: (3,21) (1,72) (63,36) (5,30)
(11,44) (15,33) (18,12) (16,80) (46,99).
• The hash(n) is the index computed using a hash
function and T is the table size. If slot index =
( hash(n) % T) is full, then we look for the next
slot index by adding 1 ((hash(n) + 1) % T). If
(hash(n) + 1) % T is also full, then we try (hash(n)
+ 2) % T. If (hash(n) + 2) % T is also full, then we
try (hash(n) + 3) % T.
Hash table
Quadratic Probing
• In case of linear probing, searching is performed
linearly. In contrast, quadratic probing is an open
addressing technique that uses quadratic
polynomial for searching until a empty slot is
found.
• It can also be defined as that it allows the
insertion ki at first free location from (u+i2)%m
where i=0 to m-1.
example
• A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and
h(k) = 2k+3
• The key values 3, 2, 9, 6 are stored at the indexes
9, 7, 1, 5, respectively. We do not need to apply
the quadratic probing technique on these key
values as there is no occurrence of the collision.
• The index value of 11 is 5, but this location is
already occupied by the 6. So, we apply the
quadratic probing technique.
solution
solution
Double Hashing
• The double hashing technique uses two hash
functions. The second hash function comes into
use when the first function causes a collision. It
provides an offset index to store the value.
• The formula for the double hashing technique is
as follows:
• (firstHash(key) + i * secondHash(key)) %
sizeOfTable
• Where i is the offset value. This offset value
keeps incremented until it finds an empty slot.
Double Hashing
example
example