KMP Algorithm
#string #pattern-matching
The Knuth-Morris-Pratt (KMP) algorithm efficiently searches for a pattern (substring) s2
within a text s1 by leveraging the information in the pattern itself to avoid redundant
comparisons. It preprocesses the pattern to create an array called the Longest Prefix
Suffix (LPS), which indicates the longest prefix that is also a suffix.
NOTE: KMP Algorithm is an efficient pattern-matching algorithm for finding single
patterns in a string. Consider using Rabin-Karp Algorithm for multiple patterns in a string.
Key Concepts
1. LPS Array:
The LPS array stores the lengths of the longest proper prefix of a substring which
is also a suffix.
Proper prefix: A prefix that is not the whole string.
Proper suffix: A suffix that is not the whole string.
Purpose:
When a mismatch occurs, the LPS array tells us the next position to compare in
the pattern, skipping unnecessary characters.
2. Algorithm Steps:
Preprocess the pattern to build the LPS array.
Use the LPS array to efficiently match the pattern in the text, minimizing
backtracking in the text.
Complexity
Time Complexity:
Preprocessing (LPS construction): O(m), where m is the length of the pattern.
Pattern searching: O(n), where n is the length of the text.
Total: O(n + m).
Space Complexity:
O(m) , due to the LPS array.
Reference
Watch an in-depth explanation and walkthrough of the KMP algorithm: Inside Code: KMP
Algorithm.