Mamou et al., 2013 - Google Patents
System combination and score normalization for spoken term detectionMamou et al., 2013
View PDF- Document ID
- 13123811827086989849
- Author
- Mamou J
- Cui J
- Cui X
- Gales M
- Kingsbury B
- Knill K
- Mangu L
- Nolden D
- Picheny M
- Ramabhadran B
- Schlüter R
- Sethy A
- Woodland P
- Publication year
- Publication venue
- 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
External Links
Snippet
Spoken content in languages of emerging importance needs to be searchable to provide access to the underlying information. In this paper, we investigate the problem of extending data fusion methodologies from Information Retrieval for Spoken Term Detection on low …
- 238000010606 normalization 0 title abstract description 33
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mamou et al. | System combination and score normalization for spoken term detection | |
US9477753B2 (en) | Classifier-based system combination for spoken term detection | |
Miller et al. | Rapid and accurate spoken term detection. | |
Mendels et al. | Improving speech recognition and keyword search for low resource languages using web data | |
Wang et al. | Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages | |
Saraclar et al. | An empirical study of confusion modeling in keyword search for low resource languages | |
WO2003010754A1 (en) | Speech input search system | |
JP2004005600A (en) | Method and system for indexing and retrieving document stored in database | |
JP2004133880A (en) | Method for constructing dynamic vocabulary for speech recognizer used in database for indexed document | |
Mangu et al. | Efficient spoken term detection using confusion networks | |
Le Zhang et al. | Enhancing low resource keyword spotting with automatically retrieved web documents | |
Fraga-Silva et al. | Active learning based data selection for limited resource STT and KWS. | |
Rath et al. | Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages | |
Kaushik et al. | Automatic audio sentiment extraction using keyword spotting. | |
Lee et al. | Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages. | |
Soto et al. | A comparison of multiple methods for rescoring keyword search lists for low resource languages. | |
Sarı et al. | Fusion of LVCSR and posteriorgram based keyword search | |
Nishizaki et al. | Spoken Term Detection Using Multiple Speech Recognizers' Outputs at NTCIR-9 SpokenDoc STD subtask. | |
Audhkhasi et al. | Keyword search using modified minimum edit distance measure | |
US8639510B1 (en) | Acoustic scoring unit implemented on a single FPGA or ASIC | |
Cui et al. | Automatic keyword selection for keyword search development and tuning | |
Khokhlov et al. | Fast and accurate OOV decoder on high-level features | |
Ramabhadran et al. | Fast decoding for open vocabulary spoken term detection | |
Kaneko et al. | STD based on Hough Transform and SDR using STD results: Experiments at NTCIR-9 SpokenDoc. | |
Hartmann et al. | Cross-word sub-word units for low-resource keyword spotting. |