[go: up one dir, main page]

0% found this document useful (0 votes)
163 views30 pages

Chapter One - Hashing PDF

The document discusses hashing techniques used to store and retrieve data from data structures like arrays and linked lists. Hashing uses a hash function to map data to hash table indexes, allowing constant-time lookups. Common hash functions include truncation, folding, mid-square, and division methods. Collisions occur when different data maps to the same index. Collision resolution techniques like separate chaining and open addressing are used to handle collisions. Separate chaining uses linked lists at each index while open addressing searches for empty slots using probing sequences like linear or quadratic probing.

Uploaded by

Mebratu Asrat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
163 views30 pages

Chapter One - Hashing PDF

The document discusses hashing techniques used to store and retrieve data from data structures like arrays and linked lists. Hashing uses a hash function to map data to hash table indexes, allowing constant-time lookups. Common hash functions include truncation, folding, mid-square, and division methods. Collisions occur when different data maps to the same index. Collision resolution techniques like separate chaining and open addressing are used to handle collisions. Separate chaining uses linked lists at each index while open addressing searches for empty slots using probing sequences like linear or quadratic probing.

Uploaded by

Mebratu Asrat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

HASHING TECHNIQUES

1
Hashing
Hashing:
• The array & linked list implementation of data sets requires on the
average of O(n) steps to execute a single insert, delete or find operations.
• Hashing is a technique used for performing insertion, deletion & search
operations in constant average time of O(1).
• Hashing is an effective way to store and retrieve the elements in some data
structure.
• Hashing is applied where the array size is large and time taken for
searching an element takes more time.

2 2
Hashing
Hash Function:
• Hash Function is a function which is used to put the data in the hash table.
Same function can be used to retrieve the data.
• A hash table uses a hash function to compute an index into an array of
buckets or slots, from which the correct value can be found.
• The hash function is used to determine the location of any record given its
key value.
• Bucket Array
• Array(A) of size N where each cell of A is a “bucket”
• Bucket=container of (k,e) pairs. (key, value pair)

3 3
Hashing
Hash Table:
• Hash Table is a data structure used for storing and retrieving data very
quickly.
• It is used to implement an associate array (key, value)pair, a structure that
can map key to values.
• Insertion and retrieval in the hash table is based on some key value. Hence
every entry is associated with some key.
• Hash Table is a bucket array together with a hash function.
• The implementation of hash tables are frequently called hashing.

4 4
Hashing
Hash Table:
• Hash Table is a bucket array together with a hash function.
Ex: Phone Book as a hash table
• Key: Name
• Value: Phone #

• Hash Address:
• The value that the hash function returns for a given key is its hash address.

5 5
Types of Hash functions
• Hash Address may be generated using different methods (hash fun) such
as:
Truncation Method
Folding Method
Mid square Method
Division Method
Multiplication Method
Digit Analysis Method

6 6
Types of Hash functions
1. Truncation Method
 Simplest and Easiest method
 The truncation method truncates a part of the given keys depending upon the size of
the hash table.
 If we assume that the size of the hash table is 1000, then the right most (or) left most
three digits are truncated and used as hash table address.
 Ex:
9874567 125978 9876545
Hash table address for the given keys are,
 567 978 545 (right most three digits)

7 7
Types of Hash functions
2. Digit Folding Method
• This method involves two steps:
• Divide the key-value k into a number of parts i.e. k1, k2, k3,….,kn, where each part
has the same number of digits except for the last part that can have lesser digits than
the other parts. m = 1000
• Add the individual parts. The hash value is obtained by ignoring the last carry if any.

8 8
Types of Hash functions
3. Mid Square Method
• In this method, key is multiplied by itself and address is obtained by
selecting appropriate no. of bits or digit from the middle digit.
• Suppose the hash table has 1000 memory locations. So r = 3 because three
digits are required to map the key to the memory location.

9 9
Types of Hash functions
4. Division Method (contd..)
• In this method modulo operation is applied on the key and the remainder of the
operation is taken as hash address.
• The most widely used method of hash function.

10 10
Types of Hash functions
Division Method (contd..)
If the key is a string, add up ASCII values of characters in the string.

A=65, a=97

11 11
Types of Hash functions
5. Multiplication Method
i. Multiply the key ‘k’ by a constant A and extract the fractional part of kA.
ii. Multiply this fractional part by ‘m’ and take floor value.

12 12
Types of Hash functions
6. Digit Analysis Method
Digits from some positions are chosen from the key and extracted digits
are reversed.

13 13
Hashing - collision
The situation in which the hash function returns the same hash address for
more than one record is called as Collision and the two same hash keys
returned for different records is called as Synonym.

14 14
Characteristics of good Hash function
 The hash function should be simple to compute.
 No. of collisions should be less. Ideally no collision should occur. Such a
function is called as Perfect Hash Function.
 Hash keys should be produced that will get distributed uniformly over an
array.
 Hash function should depend upon every bit of the key.

15 15
Collision resolution techniques
 If a collision occurs, then it should be handled by collision resolution
techniques.
1. Separate Chaining (Open Hashing)
2. Open Addressing (Closed Hashing)
i. Linear Probing
ii. Quadratic probing
iii. Double hashing

16 16
1. Separate Chaining
 In Separate Chaining, the strategy is to keep a list of all elements that hash
to the same value.
 The hash table is implemented as an array of linked lists.
 Creating a linked list of elements.
 Inserting / Retrieval / Deletion an item ‘r’ that hashes at index ‘i’ is simply
insertion into the linked list at position ‘i’.

17 17
Separate Chaining - Example
 Hash function for the following Example.
H(key)=key % Table Size

18 18
Separate Chaining (contd..)
Advantages:
• Separate Chaining is simple and efficient.
• Helps to get a uniform & perfect collision resolution hashing
• It allows an unlimited collision to the same hash address
• The elements having the same memory address will be in the same chain and
hence leads to faster searching.
Disadvantages:
• Extra space required for the linked lists.
• Implementation of separate data structure is required.
• For unevenly distributed keys, there may be long lists and many empty spaces
in the table.

19 19
2. Open addressing
• This technique is generally used where storage space is large.
• In Open addressing, if collision occurs, alternative cells are tried until an
empty cell is found.
• Arrays are used here as hash tables.
• All items are stored in the hash table itself
• In addition to the cell data, each cell keeps one of the three states:
EMPTY, OCCUPIED, DELETED
• While inserting, if a collision occurs, alternative cells are tried until an
empty cell is found.
• When key is deleted the cell is marked as DELETED rather than
EMPTY otherwise subsequent searches that hash at the deleted cell will
fail.
20 20
2. Open addressing
Probe Sequence:
• Sequence of array indexes that is followed in searching for an empty cell during
an insertion or in searching for a key during find or delete operations.

21 21
linear probing
• Linear probing resolves hash collisions of values of hash
functions by sequentially searching the hash table for next
free location.
• In Linear Probing Probe Sequence, F(i) = i

22 22
linear probing - example
• Insert 89, 18, 49, 58, 69 into hash table of size 10

Index Status Value O- Occupied


0 E O 49 E- Empty
D-Deleted
1 E O 58
2 E O 69
3 E
4 E
5 E
6 E
7 E
8 E O 18
9 E O 89
Hash0(89) =[Hash(89)+0]%10
= [89+0]%10
=9 23 23
linear probing - example
• Insert 89, 18, 49, 58, 69 into hash table of size 10

Primary Clustering:
• One problem with linear probing is formation of clusters.
• Primary Clustering means that any key that hashes into the cluster will require several
attempts to resolve the collision, and then it will add to the cluster.

24 24
quadratic probing
• Quadratic probing minimize the primary clustering problem of linear
probing.
• It works in the same way as linear probing to avoid collision except for
the change of search sequence which is quadratic. F(i) = i2

Secondary Clustering:
• The entries that collide with an occupied entry use the same probe sequence. This
is called secondary clustering which is a problem with quadratic probing.

25 25
Quadratic probing - example
• Insert 89, 18, 49, 58, 69 into hash table of size 10

Index Status Value


0 E O 49
1 E
2 E O 58
3 E O 69
4 E
5 E
6 E
7 E
8 E O 18
9 E O 89

26
Double Hashing
• Double Hashing eliminates the clustering problem in linear and quadratic
probing.
• It uses a secondary hash function on the keys when collision occurs
Secondary Hash Function
Hash2 (key) = R – (key mod R)
where,
R -> prime number, smaller than table size
• Then jump by Hash2(key) value.

27 27
Double Hashing - example
• Insert 89, 18, 49, 58, 69 into hash table of size 10
Index Status Value
0 E O 69
1 E
2 E
3 E O 58
4 E
5 E
6 E O 49
7 E
8 E O 18
9 E O 89

28 28
Quiz
1. Determine the hash address of the following elements using
multiplication and mid square methods.
 A[100] = {857, 645, 557, 48, 57}
2. Resolve the collision in the array of A[10] = {59, 18, 72, 48, 12, 9, 80,
23} by using:
a. Linear probing
b. Quadratic probing
c. Double hashing

29
1
-
3 30
0

You might also like