Saraclar et al., 2004 - Google Patents
Lattice-based search for spoken utterance retrievalSaraclar et al., 2004
View PDF- Document ID
- 8824511377763269970
- Author
- Saraclar M
- Sproat R
- Publication year
- Publication venue
- Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004
External Links
Snippet
Recent work on spoken document retrieval has suggested that it is adequate to take the singlebest output of ASR, and perform text retrieval on this output. This is reasonable enough for the task of retrieving broadcast news stories, where word error rates are …
- 238000000034 method 0 abstract description 17
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L15/265—Speech recognisers specially adapted for particular applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Saraclar et al. | Lattice-based search for spoken utterance retrieval | |
US9965552B2 (en) | System and method of lattice-based search for spoken utterance retrieval | |
Yu et al. | A hybrid word/phoneme-based approach for improved vocabulary-independent search in spontaneous speech. | |
Kurimo et al. | Unlimited vocabulary speech recognition for agglutinative languages | |
Can et al. | Effect of pronounciations on OOV queries in spoken term detection | |
Chen et al. | Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics | |
Wang | Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese | |
Logan et al. | Confusion-based query expansion for OOV words in spoken document retrieval. | |
Yu et al. | Searching the audio notebook: keyword search in recorded conversation | |
Lim et al. | Developing an automatic speech recognizer for filipino with english code-switching in news broadcast | |
Chelba et al. | Indexing uncertainty for spoken document search. | |
Colthurst et al. | The 2000 BBN Byblos LVCSR system. | |
Kurimo et al. | An evaluation of a spoken document retrieval baseline system in finish. | |
Rose et al. | Subword-based spoken term detection in audio course lectures | |
Vaněk et al. | Gender-dependent acoustic models fusion developed for automatic subtitling of parliament meetings broadcasted by the Czech TV | |
Tucker et al. | Speech-as-data technologies for personal information devices | |
Xu et al. | Robust and Fast Lyric Search based on Phonetic Confusion Matrix. | |
Kurimo et al. | Speech transcription and spoken document retrieval in Finnish | |
Schneider | Holistic vocabulary independent spoken term detection | |
Gauvain et al. | Structuring broadcast audio for information access | |
Shao et al. | Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit | |
Deligne et al. | On the use of lattices for the automatic generation of pronunciations | |
Shao et al. | Fast Vocabulary-Independent Audio Search Based on Syllable Confusion Network Indexing in Mandarin Spontaneous Speech | |
Ramabhadran et al. | Use of metadata to improve recognition of spontaneous speech and named entities. | |
Ajmera et al. | A Cross-Lingual Spoken Content Search System. |