[go: up one dir, main page]

Kim et al., 1992 - Google Patents

An approximate string-matching algorithm

Kim et al., 1992

View PDF
Document ID
4756608920964292474
Author
Kim J
Shawe-Taylor J
Publication year
Publication venue
Theoretical Computer Science

External Links

Snippet

An approximate string-matching algorithm is described based on earlier attribute-matching algorithms. The algorithm involves building a trie from the text string which takes time O (N log 2 N), for a text string of length N. Once this data structure has been built any number of …
Continue reading at www.sciencedirect.com (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30964Querying
    • G06F17/30979Query processing
    • G06F17/30985Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/30675Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613Indexing
    • G06F17/30619Indexing indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30533Other types of queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30312Storage and indexing structures; Management thereof
    • G06F17/30321Indexing structures
    • G06F17/3033Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30389Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289Database design, administration or maintenance
    • G06F17/30303Improving data quality; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705Clustering or classification
    • G06F17/3071Clustering or classification including class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2795Thesaurus; Synonyms

Similar Documents

Publication Publication Date Title
JP4162711B2 (en) System and method for portable document indexing using N-gram word decomposition
Boytsov Indexing methods for approximate dictionary searching: Comparative analysis
US7996369B2 (en) Method and apparatus for improving performance of approximate string queries using variable length high-quality grams
US7783660B2 (en) System and method for enhanced text matching
US6018735A (en) Non-literal textual search using fuzzy finite-state linear non-deterministic automata
US5319779A (en) System for searching information using combinatorial signature derived from bits sets of a base signature
US8266152B2 (en) Hashed indexing
US6754650B2 (en) System and method for regular expression matching using index
US20100106713A1 (en) Method for performing efficient similarity search
US20080147642A1 (en) System for discovering data artifacts in an on-line data object
Hon et al. Space-efficient frameworks for top-k string retrieval
US20080147578A1 (en) System for prioritizing search results retrieved in response to a computerized search query
EP1999565A2 (en) Hyperspace index
KR101511656B1 (en) Ascribing actionable attributes to data that describes a personal identity
US20080147641A1 (en) Method for prioritizing search results retrieved in response to a computerized search query
US20080147588A1 (en) Method for discovering data artifacts in an on-line data object
US10552398B2 (en) Database records associated with a tire
WO2014047214A1 (en) Hierarchical ordering of strings
Kim et al. An approximate string-matching algorithm
Zhang et al. Minsearch: An efficient algorithm for similarity search under edit distance
Rogers et al. Searching for historical word forms in text databases using spelling‐correction methods: Reverse error and phonetic coding methods
KR100459832B1 (en) Systems and methods for indexing portable documents using the N-GRAMWORD decomposition principle
Ferragina et al. Compressed cache-oblivious string B-tree
Fontoura et al. Efficiently encoding term co-occurrences in inverted indexes
Sherkat et al. A new approach for multi-pattern string matching in large text corpora