0% found this document useful (0 votes)

8 views18 pages

Module 3 (Part 3) Hash

The document discusses string matching algorithms, focusing on Horspool's algorithm, which enhances efficiency by using a shift table for mismatches. It also covers hashing techniques, including open and closed hashing, for efficient data structure implementation. The document outlines the principles of preprocessing and collision handling in hashing, emphasizing their applications in computer science.

Uploaded by

pooravgowda20

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views18 pages

Module 3 (Part 3) Hash

Uploaded by

pooravgowda20

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 18

TIME AND SPACE TRADEOFFS:

Input Enhancement in String Matching: Horspool’s algorithm,

Hashing: Open and Closed hashing.

Prof. Naveen Kulkarni

Assistant Professor
Department of Computer Science and Engineering (Cyber Security)
Dayananda Sagar University, Harohalli, Karnataka.
Email-id: naveenk-cse@dsu.edu.in
Faculty Cabin: A434
Review: String searching by brute force
Pattern: a string of m characters to search for
Text: a (long) string of n characters to search in

Brute force algorithm

Step 1 Align pattern at beginning of text
Step 2 Moving from left to right, compare each character of
pattern to the corresponding character in text until either all characters are found to match
(successful search) or a mismatch is detected
Step 3 While a mismatch is detected and the text is not yet exhausted, realign pattern one position to
the right and repeat Step 2
Worst case efficiency of this algorithm is in the O(nm) class.
Intention is to improve the efficiency.

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 2
String searching by preprocessing
Several string searching algorithms are based on the input enhancement idea of preprocessing the
pattern

• Knuth-Morris-Pratt (KMP) algorithm preprocesses pattern left to right to get useful

information for later searching.

• Boyer -Moore algorithm preprocesses pattern right to left and store information into two tables.

• Horspool’s algorithm simplifies the Boyer-Moore algorithm by using just one table.

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 3
Horspool’s Algorithm
A simplified version of Boyer-Moore algorithm:

• Preprocesses pattern to generate a shift table that determines how much to

shift the pattern when a mismatch occurs.

• Always makes a shift based on the text’s character c aligned with the last
character in the pattern according to the shift table’s entry for c.

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 4
How far to shift?
Look at first (rightmost) character in text that was compared:
• The character is not in the pattern
.....c...................... (c not in pattern)
BAOBAB

• The character is in the pattern (but not the rightmost)

.....O...................... (O occurs once in pattern)
BAOBAB
.....A...................... (A occurs twice in pattern)
BAOBAB

• The rightmost characters do match

.....B......................
BAOBAB
Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 5
Shift table
• Shift sizes can be precomputed by the formula
distance from c’s rightmost occurrence in pattern
among its first m-1 characters to its right end
t(c) = pattern’s length m, otherwise
by scanning pattern before search begins and stored in a table called shift table

• Shift table is indexed by text and pattern alphabet

Eg, for BAOBAB:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 6
Example of Horspool’s alg. application
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _

1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6 6

BARD LOVED BANANAS

BAOBAB
BAOBAB
BAOBAB
BAOBAB (unsuccessful search)

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 7
Algorithm ShiftTavble(P[0..m-1])
// Fills the shift table used by Hospool’s and Boyer-Moore algorithms
//Input: Pattern P[0..m-1] and an alphabet of possible characters
//Output: Table[0..size-1] indexed by the alphabet of possible characters
filled with shift sizes computed by formula.
for i  0 to size – 1 do Table[i]  m
for j  0 to m-2 do Table[P[j]]  m-1-j
return Table

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 8
Horspool’s Algorithm
Step 1 : For a given pattern of length m and the alphabet used in both the pattern and text, construct
the shift table.
Step 2 : Align the pattern against the beginning of the text.
Step 3 : Repeat the following until either a matching substring is found or the pattern reaches beyond
the last character of the text. Starting with the last character in the pattern, compare the
corresponding characters in the pattern and text until either all m characters are matched
(then stop) or a mismatching pair is encountered. In the latter case, retrieve the entry t(c)
from the c’s column of the shift table where c is text’s character currently aligned against the
last character of the pattern, and shift the pattern by t(c) characters to right along the text.

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 9
ALGORITHM HorspoolMatching(P[0..m − 1], T [0..n − 1])
//Implements Horspool’s algorithm for string matching
//Input: Pattern P[0..m − 1] and text T [0..n − 1]
//Output: The index of the left end of the first matching substring or −1 if there are no matches
ShiftTable(P[0..m − 1]) //generate Table of shifts
i←m−1 //position of the pattern’s right end
while i ≤ n − 1 do
k←0 //number of matched characters
while k ≤ m − 1 and P[m − 1 − k] = T [i − k] do
k←k+1
if k = m
return i − m + 1
else i ← i + Table[T [i]]
return −1
Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 10
Efficiency of Horspool’s Algorithm
• A simple example can demonstrate that the worst-case efficiency of
Horspool’s algorithm is in O(nm).
• But for random texts, it is in Θ(n), and, although in the same efficiency
class, Horspool’s algorithm is obviously faster on average than the
brute-force algorithm

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 11
Hashing
• Idea : prestructuring — preprocess the input to make accessing its elements easier.
• Hashing is a very efficient method for implementing a dictionary, i.e., a set
with the operations:
• find
• insert
• delete

• Based on representation-change and space-for-time tradeoff ideas

• Important applications:
• symbol tables
• databases (extendible hashing)
Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 12
Hash tables and hash functions
The idea of hashing is to map keys of a given file of size n into a table of size m,
called the hash table, by using a predefined function, called the hash function.
h: K  location (cell) in the hash table
Example: student records, key = SSN. Hash function:
h(K) = K mod m where m is some integer (typically, prime)
If m = 1000, where is record with SSN= 314159265 stored?
Generally, a hash function should:
• be easy to compute
• distribute keys about evenly throughout the hash table

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 13
Collisions
If h(K1) = h(K2), there is a collision
• Good hash functions result in fewer collisions but some collisions should be expected
(birthday paradox)
• Two principal hashing schemes handle collisions differently:
• Open hashing
– each cell is a header of linked list of all keys hashed to it
• Closed hashing
• one key per cell
• in case of collision, finds another cell by
• linear probing: use next free bucket
• double hashing: use second hash function to compute increment
Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 14
Open hashing (Separate chaining)
Keys are stored in linked lists outside a hash table whose elements serve as the lists’ headers.
Example: A, FOOL, AND, HIS, MONEY, ARE, SOON, PARTED
h(K) = sum of K ‘s letters’ positions in the alphabet MOD 13
Keys A FOOL AND HIS MONEY ARE SOON PARTED

Hash 1 9 6 10 7 11 11 12
addressess

0 1 2 3 4 5 6 7 8 9 10 11 12

A AND MONEY FOOL HIS ARE Parted

Search for KID SOON
Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 15
Open hashing (cont.)
• If hash function distributes keys uniformly, average length of
linked list will be α = n/m. This ratio is called load factor.
• Average number of probes in successful, S, and unsuccessful
searches, U:
S  1+α/2, U=α
• Load α is typically kept small (ideally, about 1)
• Open hashing still works if n > m

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 16
Closed hashing (Open addressing)

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 17
Closed hashing (cont.)
• Does not work if n > m
• Avoids pointers
• Deletions are not straightforward
• Number of probes to find/insert/delete a key depends on load factor α = n/m (hash table density)
and collision resolution strategy. For linear probing:
S = (½) (1+ 1/(1- α)) and U = (½) (1+ 1/(1- α)²)
• As the table gets filled (α approaches 1), number of probes in linear probing increases
dramatically:

Slides prepared by: Prof.Naveen Kulkarni, Assistant Professor, Department of CSE (CS), Dayananda Sagar University, Harohalli, Karnataka. 18

Daa Unit 3
No ratings yet
Daa Unit 3
91 pages
CCCS314 - DAA - 22!23!3rd 05 Space and Time Tradeoffs - Modified
No ratings yet
CCCS314 - DAA - 22!23!3rd 05 Space and Time Tradeoffs - Modified
30 pages
Chapter 7
No ratings yet
Chapter 7
29 pages
U3 - SpaceAndTimeTradeoff
No ratings yet
U3 - SpaceAndTimeTradeoff
30 pages
Chapter 7: Space and Time Tradeoffs: Distribution Counting
No ratings yet
Chapter 7: Space and Time Tradeoffs: Distribution Counting
5 pages
Inf715 11
No ratings yet
Inf715 11
57 pages
CTIS264 - Chapter7 - Levitin 2012
No ratings yet
CTIS264 - Chapter7 - Levitin 2012
20 pages
M4 - Chapter 7 1
No ratings yet
M4 - Chapter 7 1
30 pages
String Matching for Programmers
No ratings yet
String Matching for Programmers
4 pages
Module3 - Space For Time Trade Off
No ratings yet
Module3 - Space For Time Trade Off
23 pages
09 Hashtables
No ratings yet
09 Hashtables
25 pages
Unit - 3
No ratings yet
Unit - 3
45 pages
Lect Hashing
No ratings yet
Lect Hashing
36 pages
Horspool Algorithm PPT A4
No ratings yet
Horspool Algorithm PPT A4
4 pages
Hashing: 15-111 Data Structures Data Structures
No ratings yet
Hashing: 15-111 Data Structures Data Structures
30 pages
U4 Updated
No ratings yet
U4 Updated
155 pages
DSA2 Chapter 5 Hashing
No ratings yet
DSA2 Chapter 5 Hashing
44 pages
Hash Tables for EE Students
No ratings yet
Hash Tables for EE Students
21 pages
Implementing Dictionaries with Hashing
No ratings yet
Implementing Dictionaries with Hashing
25 pages
05 Hashing
No ratings yet
05 Hashing
47 pages
Skip List & Hashing: Cse, Postech
No ratings yet
Skip List & Hashing: Cse, Postech
36 pages
Chapter 5 - Hashing - Part1
No ratings yet
Chapter 5 - Hashing - Part1
28 pages
05 Csai 230 Course 05
No ratings yet
05 Csai 230 Course 05
44 pages
CS2040 Tutorial4 Ans
No ratings yet
CS2040 Tutorial4 Ans
5 pages
Ch-7 Space and Time Tradeoffs - PPSX
No ratings yet
Ch-7 Space and Time Tradeoffs - PPSX
34 pages
11 Hash Tables II
No ratings yet
11 Hash Tables II
18 pages
Chapter10 HashTables
No ratings yet
Chapter10 HashTables
49 pages
CS2040 Tutorial4 Qns
No ratings yet
CS2040 Tutorial4 Qns
3 pages
DAA Unit 4
No ratings yet
DAA Unit 4
34 pages
Ads-Unit I
No ratings yet
Ads-Unit I
16 pages
Hashing and Indexing
No ratings yet
Hashing and Indexing
28 pages
Unit V
No ratings yet
Unit V
5 pages
6 - Hashing
No ratings yet
6 - Hashing
52 pages
Ders7 - Data Structures and Search Algorithms
No ratings yet
Ders7 - Data Structures and Search Algorithms
41 pages
Hash Tables for Computer Science Students
No ratings yet
Hash Tables for Computer Science Students
20 pages
Hashing in Data Structure
No ratings yet
Hashing in Data Structure
25 pages
Vision Cs 2023 Algorithm Chapter 2 Hashing 85
No ratings yet
Vision Cs 2023 Algorithm Chapter 2 Hashing 85
12 pages
Hashing
No ratings yet
Hashing
37 pages
Lecture 23 Hash Code Map
No ratings yet
Lecture 23 Hash Code Map
41 pages
10 Hash Tables
No ratings yet
10 Hash Tables
19 pages
Hashing PDF
No ratings yet
Hashing PDF
61 pages
Cse373 10 Hashing
No ratings yet
Cse373 10 Hashing
36 pages
Horspool Algorithm
No ratings yet
Horspool Algorithm
6 pages
c11 Hashing
No ratings yet
c11 Hashing
9 pages
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
No ratings yet
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
53 pages
Hashing
No ratings yet
Hashing
7 pages
Dsa Merged
No ratings yet
Dsa Merged
339 pages
Lecture 8 Hashing
No ratings yet
Lecture 8 Hashing
47 pages
Dsa Module 6 Ktuassist
No ratings yet
Dsa Module 6 Ktuassist
9 pages
Hashing
50% (2)
Hashing
43 pages
Modue 5
No ratings yet
Modue 5
10 pages
Hashing Linear Probing
No ratings yet
Hashing Linear Probing
91 pages
Hassing Dsa
No ratings yet
Hassing Dsa
28 pages
Lecture03 Hashing
No ratings yet
Lecture03 Hashing
12 pages
4
No ratings yet
4
29 pages
Hash Table PDF
No ratings yet
Hash Table PDF
25 pages
CH 4
No ratings yet
CH 4
58 pages
Dynamic Programming & Algorithms
No ratings yet
Dynamic Programming & Algorithms
68 pages
ADA Module 3 - Full
No ratings yet
ADA Module 3 - Full
40 pages
Design and Analysis of Algorithms Lab
No ratings yet
Design and Analysis of Algorithms Lab
62 pages
15 BoyerMoore
No ratings yet
15 BoyerMoore
16 pages
CS251 Unit4 Slides
No ratings yet
CS251 Unit4 Slides
127 pages
DAA Space and TimeTradeoff
No ratings yet
DAA Space and TimeTradeoff
24 pages
Daa Notes Nep
No ratings yet
Daa Notes Nep
153 pages
DAA - Unit IV - Space and Time Tradeoffs - Lecture Slides
No ratings yet
DAA - Unit IV - Space and Time Tradeoffs - Lecture Slides
41 pages
Ch09 Space and Time Tradeoffs
No ratings yet
Ch09 Space and Time Tradeoffs
41 pages
ADA UNIT 3 Complete Notes
No ratings yet
ADA UNIT 3 Complete Notes
59 pages
Introduction To The Design and Analysis of Algorithms - Lecture Notes, Study Material and Important Questions, Answers
No ratings yet
Introduction To The Design and Analysis of Algorithms - Lecture Notes, Study Material and Important Questions, Answers
16 pages

Module 3 (Part 3) Hash

Uploaded by

Module 3 (Part 3) Hash

Uploaded by

TIME AND SPACE TRADEOFFS:

Input Enhancement in String Matching: Horspool’s algorithm,

Prof. Naveen Kulkarni

Brute force algorithm

• Knuth-Morris-Pratt (KMP) algorithm preprocesses pattern left to right to get useful

• Preprocesses pattern to generate a shift table that determines how much to

• The character is in the pattern (but not the rightmost)

• The rightmost characters do match

• Shift table is indexed by text and pattern alphabet

BARD LOVED BANANAS

• Based on representation-change and space-for-time tradeoff ideas

A AND MONEY FOOL HIS ARE Parted

You might also like