Mamou et al., 2013 - Google Patents

System combination and score normalization for spoken term detection

Document ID: 13123811827086989849
Author: Mamou J; Cui J; Cui X; Gales M; Kingsbury B; Knill K; Mangu L; Nolden D; Picheny M; Ramabhadran B; Schlüter R; Sethy A; Woodland P
Publication year: 2013
Publication venue: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Snippet

Spoken content in languages of emerging importance needs to be searchable to provide access to the underlying information. In this paper, we investigate the problem of extending data fusion methodologies from Information Retrieval for Spoken Term Detection on low …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing structures
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Mamou et al.	2013	System combination and score normalization for spoken term detection
US9477753B2 (en)	2016-10-25	Classifier-based system combination for spoken term detection
Miller et al.	2007	Rapid and accurate spoken term detection
Mendels et al.	2015	Improving speech recognition and keyword search for low resource languages using web data
Wang et al.	2015	Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages
Saraclar et al.	2013	An empirical study of confusion modeling in keyword search for low resource languages
WO2003010754A1 (en)	2003-02-06	Speech input search system
JP2004005600A (en)	2004-01-08	Method and system for indexing and retrieving document stored in database
JP2004133880A (en)	2004-04-30	Method for constructing dynamic vocabulary for speech recognizer used in database for indexed document
Mangu et al.	2014	Efficient spoken term detection using confusion networks
Le Zhang et al.	2015	Enhancing low resource keyword spotting with automatically retrieved web documents
Fraga-Silva et al.	2015	Active learning based data selection for limited resource STT and KWS
Rath et al.	2014	Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages
Kaushik et al.	2015	Automatic audio sentiment extraction using keyword spotting
Lee et al.	2014	Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages
Soto et al.	2014	A comparison of multiple methods for rescoring keyword search lists for low resource languages
Sarı et al.	2015	Fusion of LVCSR and posteriorgram based keyword search
Nishizaki et al.	2011	Spoken Term Detection Using Multiple Speech Recognizers' Outputs at NTCIR-9 SpokenDoc STD subtask
Audhkhasi et al.	2007	Keyword search using modified minimum edit distance measure
US8639510B1 (en)	2014-01-28	Acoustic scoring unit implemented on a single FPGA or ASIC
Cui et al.	2014	Automatic keyword selection for keyword search development and tuning
Khokhlov et al.	2017	Fast and accurate OOV decoder on high-level features
Ramabhadran et al.	2009	Fast decoding for open vocabulary spoken term detection
Kaneko et al.	2011	STD based on Hough Transform and SDR using STD results: Experiments at NTCIR-9 SpokenDoc
Hartmann et al.	2014	Cross-word sub-word units for low-resource keyword spotting