Error detection in character recognition using pseudosyllable analysis
Garcia et al., 1995
- Document ID
- 4381726650377680035
- Author
- Garcia R
- Dimitriadis Y
- Pastor F
- Coronado J
- Publication year
- 1995
- Publication venue
- Proceedings of 3rd International Conference on Document Analysis and Recognition
Snippet
In modern document management systems it is difficult to include large vocabularies (more than 150,000 words long) to detect on-line errors. The main drawback lies in the manipulation of the great amounts of data. This difficulty becomes critical if the system …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP3152868B2 (en) | Search device and dictionary / text search method | |
US6738741B2 (en) | Segmentation technique increasing the active vocabulary of speech recognizers | |
US5835888A (en) | Statistical language model for inflected languages | |
WO1995027954A1 (en) | Pattern recognition method and system | |
CN101739143A (en) | Character inputting method and character inputting system | |
CN111444720A (en) | Named entity recognition method for English text | |
Lehal et al. | A shape based post processor for Gurmukhi OCR | |
Adda-Decker | A corpus-based decompounding algorithm for German lexical modeling in LVCSR. | |
Garcia et al. | Error detection in character recognition using pseudosyllable analysis | |
Chaudhuri et al. | OCR error detection and correction of an inflectional indian language script | |
CN101499056A (en) | Backward reference sentence pattern language analysis method | |
CN114881017B (en) | Self-adaptive dynamic word segmentation method | |
Lee | Machine-to-man communication by speech Part 1: Generation of segmental phonemes from text | |
CN113111651A (en) | Chinese word segmentation method and device and search word bank reading method | |
Raza et al. | Saraiki language word prediction and spell correction framework | |
CN1323004A (en) | Automatic conversion method from Chinese braille to Chinese character | |
JP2000105597A (en) | Speech recognition error correction device | |
Hengsanankun et al. | Linguistic Rules-Based Approach for Translating Nyaw Language to the Phonetic Alphabet | |
JP3001334B2 (en) | Language processor for recognition | |
CN1069420C (en) | Method for inputting Chinese characters by using their pronunciations and shapes | |
Chen et al. | PAT-tree-based Language Modeling with Initial Application of Chinese Speech Recognition Output Verification | |
CN101206665A (en) | Multilingual words information searching method | |
JPH0627985A (en) | Speech recognizing method | |
JPH0337764A (en) | Word dictionary retrieving device | |
CN118748009A (en) | Method, device, equipment and medium for processing multiple pronunciation problems in speech recognition |