Week 3
Week 3
Hashed List
Searches
Outline
• What is hashing
• What are hashed list searches
• Hashing methods
• Direct
• Subtraction
• Modulo division
• Digit extraction
• Mid square
• Folding
• Rotation
• Pseudorandom
• Collision Resolution
• Collision Resolution Methods
Hashed List Searches
• Ideal case: We know where our data is and we may access it directly
• The key is the address without any algorithmic modification (it starts from 0 or
1 and increments by 1)
• Example:
class Sales {
public:
Sales(const int day=0,
const int amount=0):day(day),amount(amount){}
private:
int day, amount;
};
Sale sale(0,100);
Sale dailySales[10];
dailySales[sale.day] += sale.amount;
Subtraction Method
class Employee {
public:
Employee():id(count++) {}
const int getID() const { return id; }
void setName(const std::string& n) { name = n; }
private:
int id;
string name;
static int count;
};
Employee employees[100];
Employee e;
employees[e.getID()-1000].setName(“Yasir”);
Modulo Division Method
• Divides the key by list/array size and uses the remainder as index
• Works with any size but prime numbers give less collisions
Modulo Division Example
Modulo Division Example
• Extract certain digits and then use the obtained number as address
• Assuming that we have an array size of 1000 (indices from 0 to 999) we
can use three digits
• Example: (extracting 1st , 3rd and 4th digit)
• 145674 => 156
• 344565 => 345
• 132554 => 125
Midsquare Method
• The key or a portion of it is squared and the address is selected from the
mid of the squared number
• Example:
• Assuming we have a three digit key and three digit address
• Apart from direct and subtraction methods, all other hashing methods
suffer from collisions
• Some hashed list terms:
𝑘
• Load factor (): Percentage of the number of elements in list divided by the total list
size ∝= ∗ 100
𝑛
• Clustering: Clustering is a build up of data unevenly across the hashed list due to
collision resolution.
• Primary Clustering: When clustering is at home address
• Secondary Clustering: When clustering is at collision path
Collision Resolution Methods
Open Addressing
• Resolves collision in the prime area (where all home address are stored)
• When collision occurs, prime area is searched for empty cells
• Basic two types
• Linear probe
• Quadratic probe
Linear Probe