Makhoul et al., 2000 - Google Patents

Speech and language technologies for audio indexing and retrieval

Makhoul et al., 2000

Document ID: 9000923448993147681
Author: Makhoul J; Kubala F; Leek T; Liu D; Nguyen L; Schwartz R; Srivastava A
Publication year: 2000
Publication venue: Proceedings of the IEEE

External Links

Cited by

Snippet

With the advent of essentially unlimited data storage capabilities and with the proliferation of the use of the Internet, it becomes reasonable to imagine a world in which it would be possible to access any of the stored information at will with a few keystrokes or voice …

Continue reading at www.academia.edu (PDF) (other versions)

230000011218 segmentation 0 abstract description 29

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems

Similar Documents

Publication	Publication Date	Title
Makhoul et al.	2000	Speech and language technologies for audio indexing and retrieval
JP3488174B2 (en)	2004-01-19	Method and apparatus for retrieving speech information using content information and speaker information
Chelba et al.	2008	Retrieval and browsing of spoken content
Lee et al.	2005	Spoken document understanding and organization
Hauptmann et al.	1997	Informedia: News-on-demand multimedia information acquisition and retrieval
Waibel et al.	1998	Meeting browser: Tracking and summarizing meetings
US20040204939A1 (en)	2004-10-14	Systems and methods for speaker change detection
Ostendorf et al.	2008	Speech segmentation and spoken document processing
Kubala et al.	2000	Integrated technologies for indexing spoken language
Chen et al.	2004	Lightly supervised and data-driven approaches to mandarin broadcast news transcription
Furui	2005	Recent progress in corpus-based spontaneous speech recognition
Wang	2000	Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese
Nouza et al.	2012	Making czech historical radio archive accessible and searchable for wide public
Moyal et al.	2013	Phonetic search methods for large speech databases
Zhang et al.	2012	Automatic parliamentary meeting minute generation using rhetorical structure modeling
Wang	2000	Mandarin spoken document retrieval based on syllable lattice matching
Nouza et al.	2012	Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives
Nouza et al.	2006	A system for information retrieval from large records of Czech spoken data
Viswanathan et al.	2000	Multimedia document retrieval using speech and speaker recognition
Wang et al.	2004	The SoVideo Mandarin Chinese broadcast news retrieval system
Schneider	2012	Holistic vocabulary independent spoken term detection
Lo et al.	2004	Multi-scale spoken document retrieval for Cantonese broadcast news
Cerisara	2009	Automatic discovery of topics and acoustic morphemes from speech
Zhou	2003	Audio parsing and rapid speaker adaptation in speech recognition for spoken document retrieval
Akbacak	2009	Robust spoken document retrieval in multilingual and noisy acoustic environments