Kim et al., 1992 - Google Patents
An approximate string-matching algorithmKim et al., 1992
View PDF- Document ID
- 4756608920964292474
- Author
- Kim J
- Shawe-Taylor J
- Publication year
- Publication venue
- Theoretical Computer Science
External Links
Snippet
An approximate string-matching algorithm is described based on earlier attribute-matching algorithms. The algorithm involves building a trie from the text string which takes time O (N log 2 N), for a text string of length N. Once this data structure has been built any number of …
- 238000004458 analytical method 0 abstract description 19
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
- G06F17/3033—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30389—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
- G06F17/30303—Improving data quality; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4162711B2 (en) | System and method for portable document indexing using N-gram word decomposition | |
Boytsov | Indexing methods for approximate dictionary searching: Comparative analysis | |
US7996369B2 (en) | Method and apparatus for improving performance of approximate string queries using variable length high-quality grams | |
US7783660B2 (en) | System and method for enhanced text matching | |
US6018735A (en) | Non-literal textual search using fuzzy finite-state linear non-deterministic automata | |
US5319779A (en) | System for searching information using combinatorial signature derived from bits sets of a base signature | |
US8266152B2 (en) | Hashed indexing | |
US6754650B2 (en) | System and method for regular expression matching using index | |
US20100106713A1 (en) | Method for performing efficient similarity search | |
US20080147642A1 (en) | System for discovering data artifacts in an on-line data object | |
Hon et al. | Space-efficient frameworks for top-k string retrieval | |
US20080147578A1 (en) | System for prioritizing search results retrieved in response to a computerized search query | |
EP1999565A2 (en) | Hyperspace index | |
KR101511656B1 (en) | Ascribing actionable attributes to data that describes a personal identity | |
US20080147641A1 (en) | Method for prioritizing search results retrieved in response to a computerized search query | |
US20080147588A1 (en) | Method for discovering data artifacts in an on-line data object | |
US10552398B2 (en) | Database records associated with a tire | |
WO2014047214A1 (en) | Hierarchical ordering of strings | |
Kim et al. | An approximate string-matching algorithm | |
Zhang et al. | Minsearch: An efficient algorithm for similarity search under edit distance | |
Rogers et al. | Searching for historical word forms in text databases using spelling‐correction methods: Reverse error and phonetic coding methods | |
KR100459832B1 (en) | Systems and methods for indexing portable documents using the N-GRAMWORD decomposition principle | |
Ferragina et al. | Compressed cache-oblivious string B-tree | |
Fontoura et al. | Efficiently encoding term co-occurrences in inverted indexes | |
Sherkat et al. | A new approach for multi-pattern string matching in large text corpora |