Hashing

The document discusses hashing, priority queues, and efficient binary search trees. It explains the process of hashing, types of hash functions, collision resolution techniques, and dynamic hashing methods. Additionally, it defines priority queues, their types, and their implementation using heaps.

Uploaded by

findthevalueofxy

Module 5

Hashing, Priority Queues & Efficient Binary Search Trees

HASHING: Hashing is the process of transforming a key (for example, a number or a string)
into a fixed-size value, usually smaller than the key, using a hash function.
HASH TABLE REPRESENTATION
In hashing the dictionary pairs are stored in a table, ht, called the hash table. The hash table is
partitioned into b buckets, ht[0],…,ht[b-1]. The address or location of a pair is determined by
a hash function, h, which maps keys into buckets.
• A hash table is a data structure used for storing and retrieving data very quickly. Data is
inserted into the hash table according to its key value, so every entry in the hash table
is associated with some key.
• Using the hash key, the required piece of data can be found in the hash table with only a
few key comparisons.
HASH FUNCTION
• A hash function computes the location at which data is stored in the hash table, so the
same hash function is used to retrieve the data from the hash table. The integer
returned by the hash function is called the hash key.
Static Hashing: Refers to the hashing process where the size of the hash table is fixed.

TYPES OF HASH FUNCTIONS:

There are several types of uniform hash functions:
1. Mid-square:
• The middle of square hash function is frequently used in symbol table applications.
The function is computed by squaring the identifier and then using an appropriate
number of bits from the middle of the square to obtain the bucket address.
• Since the middle bits of the square usually depend upon all the characters in an
identifier, there is a high probability that different identifiers will produce different
hash addresses, even when some of the characters are the same. The size of the hash
table should be a power of 2 when this scheme is used.
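
The mid-square computation can be sketched in C as follows. This is a minimal illustration, not the version from the course material: it assumes the identifier has already been converted to a 16-bit number, and the name mid_square_hash and the parameter bits (the table has 2^bits buckets) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mid-square hash (sketch): square a 16-bit key and return `bits` bits
 * taken from the middle of the 32-bit square.
 * Assumption: the table has 2^bits buckets, with 1 <= bits <= 16. */
uint32_t mid_square_hash(uint16_t key, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    uint32_t square = (uint32_t)key * key;   /* fits in 32 bits */
    unsigned shift = (32 - bits) / 2;        /* skip the low-order bits */
    uint32_t mask = (1u << bits) - 1u;       /* keep exactly `bits` bits */
    return (square >> shift) & mask;
}
```

For instance, with a 16-bucket table (bits = 4), mid_square_hash(1234, 4) squares 1234 to 1522756, shifts the square right by 14 bits, and keeps the low 4 bits of the result, which gives bucket 12.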

2. Division:
• In this method, a simple hash function is obtained by using the modulus (%)
operator.
• In this scheme, we divide the identifier x by some number M and use the
remainder as the hash address for x.
• The hash function is:

f(x) = x % M
• This gives bucket addresses that range from 0 to M - 1, where M = the table
size.
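
A direct C rendering of f(x) = x % M is shown below as a small sketch. The function name division_hash is hypothetical, and the identifier is assumed to be an unsigned integer already.

```c
#include <assert.h>

/* Division hash (sketch): f(x) = x % M, where M is the table size.
 * The result is a bucket address in the range 0 .. M-1. */
unsigned division_hash(unsigned x, unsigned M)
{
    assert(M > 0);   /* x % 0 is undefined */
    return x % M;
}
```

With an illustrative table size of M = 13, the key 131 hashes to bucket 1, because 131 = 10 * 13 + 1.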
3. Folding:
• In this method, the identifier X is partitioned into several parts. All parts,
except possibly the last one, have the same length.
• We then add the parts together to obtain the hash address for X.
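
The folding steps can be sketched in C as below. This is one possible reading (often called shift folding): it assumes the identifier is a decimal number, that parts are cut from the left, and that the caller reduces the sum to the table range afterwards. The name fold_hash and the parameter part_len are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Folding hash (sketch): split the decimal digits of x into parts of
 * part_len digits, starting from the left (the last part may be
 * shorter), and return the sum of the parts. */
unsigned long fold_hash(unsigned long long x, int part_len)
{
    char digits[32];
    assert(part_len >= 1 && part_len <= 9);
    int n = snprintf(digits, sizeof digits, "%llu", x);
    unsigned long sum = 0;
    for (int i = 0; i < n; i += part_len) {
        unsigned long part = 0;
        /* build the numeric value of one part, digit by digit */
        for (int j = i; j < n && j < i + part_len; j++)
            part = part * 10 + (unsigned long)(digits[j] - '0');
        sum += part;
    }
    return sum;
}
```

For X = 12320324111220 and parts of three digits, the parts are 123, 203, 241, 112 and 20, and their sum is 699.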

4. Digit Analysis:
• The last method we will examine, digit analysis, is used with static files. A
static file is one in which all the identifiers are known in advance.
• We first transform the identifiers into numbers using some radix, r, and then
examine the digits of each identifier. The digits with the most skewed
distributions are deleted.
• Digits are deleted until the number of remaining digits is small
enough to give an address in the range of the hash table. The remaining digits are
then used to calculate the hash address.
COLLISION
A collision occurs when the hash function returns the same hash key (home bucket) for
more than one record. Two different keys that hash to the same home bucket are
called synonyms.
COLLISION RESOLUTION TECHNIQUES
If a collision occurs, it must be handled by some technique. Such a
technique is called a collision resolution technique.
There are two common methods for handling collisions and overflows in a static hash table,
linear probing and chaining, each using a different data structure to represent the hash
table. A third technique, rehashing, is also described below.
1. Open addressing (linear probing)
2. Chaining
3. Rehashing
1. Linear Probing:
• When linear open addressing is used, the hash table is represented as a one-dimensional
array with indices that range from 0 to the desired table size - 1.
• The component type of the array is a struct that contains at least a key field, and the
hash table ht is declared with one slot per bucket.
• Before inserting any elements into this table, the table must be initialized so that
all slots are marked empty. This allows overflows and collisions to be detected.
• If the slot at the hash address is empty, we simply place the new element into this slot.
However, if the new element is hashed into a full bucket, we must find another bucket
for it. The simplest solution places the new element in the closest unfilled bucket.
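
The declaration, initialization and insertion steps above can be sketched in C as follows. This is a minimal illustration and not the lab program: integer keys, the table size of 13, the EMPTY marker of -1, and the names init_table, insert_key and search_key are all assumptions made for the example. The closest unfilled bucket is found by scanning forward from the home bucket and wrapping around at the end of the array.

```c
#include <assert.h>

#define TABLE_SIZE 13   /* illustrative table size */
#define EMPTY (-1)      /* marks an unused slot; keys must be >= 0 */

typedef struct {
    int key;            /* a real record would carry other fields too */
} element;

element ht[TABLE_SIZE]; /* one slot per bucket */

/* Mark every slot empty before the first insertion. */
void init_table(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        ht[i].key = EMPTY;
}

/* Insert key with linear probing. Returns the slot used,
 * or -1 if the table is full (overflow). */
int insert_key(int key)
{
    assert(key >= 0);
    int home = key % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (home + i) % TABLE_SIZE;   /* wrap around the array */
        if (ht[slot].key == EMPTY || ht[slot].key == key) {
            ht[slot].key = key;
            return slot;
        }
    }
    return -1;
}

/* Search for key. Returns its slot, or -1 if the key is absent. */
int search_key(int key)
{
    assert(key >= 0);
    int home = key % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (home + i) % TABLE_SIZE;
        if (ht[slot].key == EMPTY)
            return -1;                        /* reached a free slot */
        if (ht[slot].key == key)
            return slot;
    }
    return -1;
}
```

After init_table(), inserting 1, 14 and 27 (all with home bucket 1) places them in slots 1, 2 and 3, which is exactly the kind of contiguous run that linear probing produces.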

Problem with linear probing:

• One major problem with linear probing is primary clustering: colliding keys are
placed in adjacent slots, so contiguous blocks of occupied slots form in the hash table
as collisions are resolved.
• These clusters tend to merge as more identifiers are entered into the table, thus leading
to bigger clusters. We can partially curtail the growth of these clusters, and hence reduce
the average number of probes, by using quadratic probing.
(Refer to Lab Program 12 for the implementation.)
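
For comparison with linear probing, quadratic probing can be sketched as below. This is an illustrative sketch rather than the lab program: the table size of 11, the marker QP_EMPTY and the function names are assumptions. The probe sequence examined is home, home + 1, home + 4, home + 9, and so on, all taken modulo the table size.

```c
#include <assert.h>

#define QP_SIZE 11      /* illustrative prime table size */
#define QP_EMPTY (-1)   /* marks an unused slot; keys must be >= 0 */

/* Mark every slot of table[] empty. */
void quadratic_init(int table[])
{
    for (int i = 0; i < QP_SIZE; i++)
        table[i] = QP_EMPTY;
}

/* Insert key by probing home + i*i for i = 0, 1, 2, ...
 * Returns the slot used, or -1 if no free slot was found.
 * Note: this probe sequence does not visit every slot, so the
 * insertion can fail even when the table is not completely full. */
int quadratic_insert(int table[], int key)
{
    assert(key >= 0);
    int home = key % QP_SIZE;
    for (int i = 0; i < QP_SIZE; i++) {
        int slot = (home + i * i) % QP_SIZE;
        if (table[slot] == QP_EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Inserting 0, 11, 22 and 33 (all with home bucket 0) uses slots 0, 1, 4 and 9, so the synonyms are spread out instead of forming one contiguous block.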
2. Chaining:
In the chaining method of collision handling, each entry carries an additional field, the
chain (link) field. When a collision occurs, the colliding keys are kept in a linked list (chain)
at the home bucket.
For example:
Consider the keys 131, 3, 4, 21, 61, 7, 97, 8, 9, which are to be placed in their home buckets
using the hash function H(key) = key % D, where D is the size of the table.
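
Chaining for this example can be sketched in C as follows. The notes do not state the value of D, so D = 10 is an assumption made here purely for illustration, as are the names node, chain, chain_insert and chain_search. New keys are appended at the end of the chain so that each list keeps its keys in arrival order.

```c
#include <assert.h>
#include <stdlib.h>

#define D 10   /* assumed table size; the example leaves D unspecified */

typedef struct node {
    int key;
    struct node *next;   /* the chain (link) field */
} node;

node *chain[D];          /* file-scope array, so every chain starts NULL */

/* Append key to the chain at its home bucket.
 * Returns 0 on success, -1 if memory allocation fails. */
int chain_insert(int key)
{
    assert(key >= 0);
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->next = NULL;
    node **p = &chain[key % D];
    while (*p != NULL)           /* walk to the end of the chain */
        p = &(*p)->next;
    *p = n;
    return 0;
}

/* Return 1 if key is present in the table, 0 otherwise. */
int chain_search(int key)
{
    assert(key >= 0);
    for (node *n = chain[key % D]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

Under the D = 10 assumption, the keys 131, 21 and 61 all have home bucket 1 and form the chain 131, 21, 61, while 7 and 97 share bucket 7. The remaining keys sit alone in their buckets.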
3. Rehashing: The rehashing method uses a series of hash functions h1, h2, …, hm. When a
collision occurs, the functions are tried in order until an empty bucket is found.
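
A small sketch of rehashing with m = 3 is given below. The three hash functions h1, h2 and h3 are made up for illustration only, as are the table size of 11 and the names rh_init and rehash_insert. Keys are assumed to be small non-negative integers.

```c
#include <assert.h>

#define RH_SIZE 11      /* illustrative table size */
#define RH_EMPTY (-1)   /* marks an unused slot; keys must be >= 0 */

/* Three hypothetical hash functions, tried in this order. */
static int h1(int key) { return key % RH_SIZE; }
static int h2(int key) { return (key / RH_SIZE) % RH_SIZE; }
static int h3(int key) { return (7 * key + 1) % RH_SIZE; }

static int (*const hash_fns[])(int) = { h1, h2, h3 };

/* Mark every slot of table[] empty. */
void rh_init(int table[])
{
    for (int i = 0; i < RH_SIZE; i++)
        table[i] = RH_EMPTY;
}

/* Try h1, h2, h3 in turn and store key in the first empty bucket.
 * Returns the slot used, or -1 if every function collided. */
int rehash_insert(int table[], int key)
{
    assert(key >= 0);
    int m = (int)(sizeof hash_fns / sizeof hash_fns[0]);
    for (int i = 0; i < m; i++) {
        int slot = hash_fns[i](key);
        if (table[slot] == RH_EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Starting from an empty table, 5 goes to slot 5 through h1, 16 collides under h1 and goes to slot 1 through h2, and 137 collides under both h1 and h2 and goes to slot 3 through h3. A key such as 258, which collides under all three functions, cannot be placed, so a practical scheme needs a fallback for that case.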

DYNAMIC HASHING:
• Dynamic hashing is used to overcome the problems of static hashing, such as
bucket overflow.
• In this method, data buckets grow or shrink as the number of records increases or
decreases. This method is also known as extendible hashing.
• One of the most important classes of software is the database management system or
DBMS.
• In a DBMS the user enters a query using some language (possibly SQL) and the system
translates it and retrieves the resulting data. Fast access time is essential since a DBMS
is typically used to hold large sets of information.
• Another key characteristic of a DBMS is that the amount of information can vary a
great deal over time.
• Dynamic hashing retains the fast retrieval time of conventional hashing, while
extending the technique so that it can accommodate dynamically increasing and
decreasing file size without penalty.
• We assume that a file, F, is a collection of records, R. Each record has a key field, K,
by which it is identified. Records are stored in buckets, or pages as they are called in
dynamic hashing, whose capacity is p.
i. Dynamic Hashing Using Directories
• Consider an example where an identifier consists of two characters and each
character is represented by 3 bits.

• The identifiers are to be placed into a table that has four pages. Each page can hold no
more than two identifiers, and the pages are indexed by the 2-bit sequences 00, 01, 10, 11,
respectively.
• Depth refers to the number of bits used as the index.
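
The role of the depth can be illustrated with a short C sketch that computes a page index from a 6-bit identifier. It assumes the index bits are taken from the low-order end of the key, which is one common convention but is not stated in these notes. The name directory_index is hypothetical.

```c
#include <assert.h>

/* Directory index (sketch): use the low-order `depth` bits of the key
 * to select one of 2^depth pages.
 * Assumption: index bits come from the least significant end. */
unsigned directory_index(unsigned key, unsigned depth)
{
    assert(depth < 32);
    return key & ((1u << depth) - 1u);
}
```

With depth 2, a key whose bit pattern is 100001 maps to page 01 and a key with pattern 101010 maps to page 10. Increasing the depth by one doubles the number of pages that can be addressed.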
ii. Directoryless Dynamic Hashing:
• Assuming that there exists a contiguous address space which is large enough
to hold all the records, we can eliminate the directory.
• In effect, this leaves it to the operating system to break the address space
into pages, and to manage moving them into and out of memory. This
scheme is referred to as directoryless hashing or linear hashing.
Priority Queues
Definition:
• A priority queue is a collection of zero or more elements. Each element has a
priority or value.
• Unlike ordinary queues, which are FIFO structures, the order of deletion from a priority
queue is determined by element priority.
• Elements are removed in either increasing or decreasing order of priority
rather than in the order in which they arrived in the queue.

There are two types of priority queues:

• Min priority queue: A collection of elements in which items can be inserted
in any order, but only the smallest element can be removed.
• Max priority queue: A collection of elements in which items can be inserted in any
order, but only the largest element can be removed.
• In a priority queue, the elements may be stored in any order, but only the
smallest or largest element may be deleted at each step.
• A priority queue can be implemented using arrays or linked lists. The heap data
structure is used to implement a priority queue efficiently.
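
A min priority queue built on an array-based binary min heap can be sketched as follows. The fixed capacity of 64, integer priorities, and the names min_pq, pq_insert and pq_delete_min are assumptions made for the example. A max priority queue is obtained by reversing the comparisons.

```c
#include <assert.h>

#define PQ_CAPACITY 64   /* illustrative fixed capacity */

typedef struct {
    int data[PQ_CAPACITY];   /* heap stored level by level in an array */
    int size;                /* number of elements currently stored */
} min_pq;

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Insert value. Returns 0 on success, -1 if the queue is full. */
int pq_insert(min_pq *pq, int value)
{
    if (pq->size == PQ_CAPACITY)
        return -1;
    int i = pq->size++;
    pq->data[i] = value;
    /* sift up: swap with the parent while the parent is larger */
    while (i > 0 && pq->data[(i - 1) / 2] > pq->data[i]) {
        swap_int(&pq->data[(i - 1) / 2], &pq->data[i]);
        i = (i - 1) / 2;
    }
    return 0;
}

/* Remove and return the smallest element. The queue must be non-empty. */
int pq_delete_min(min_pq *pq)
{
    assert(pq->size > 0);
    int min = pq->data[0];
    pq->data[0] = pq->data[--pq->size];   /* move last element to root */
    int i = 0;
    for (;;) {                            /* sift down */
        int left = 2 * i + 1, right = left + 1, smallest = i;
        if (left < pq->size && pq->data[left] < pq->data[smallest])
            smallest = left;
        if (right < pq->size && pq->data[right] < pq->data[smallest])
            smallest = right;
        if (smallest == i)
            break;
        swap_int(&pq->data[i], &pq->data[smallest]);
        i = smallest;
    }
    return min;
}
```

Inserting 5, 3, 8 and 1 in that order and then deleting four times returns 1, 3, 5 and 8, regardless of the arrival order. Both operations take time proportional to the height of the heap, which is why the heap is preferred over an unsorted array or list.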
Leftist Trees
Optimal BSTs
Notes on the above topics will be shared soon.
