Clements et al., 2001 - Google Patents
Phonetic searching of digital audioClements et al., 2001
View PDF- Document ID
- 5896876934297273980
- Author
- Clements M
- Cardillo P
- Miller M
- Publication year
- Publication venue
- Proc. Broadcast Engineering Conference
External Links
Snippet
As archives of digital audio and video expand, and people need to find specific information within those archives, it becomes clear that a highly efficient method of searching recorded media is required. The metadata that currently tag audio information (such as title, date of …
- 238000007781 pre-processing 0 description 18
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8209171B2 (en) | Methods and apparatus relating to searching of spoken audio data | |
US7953751B2 (en) | System and method for audio hot spotting | |
Foote | An overview of audio information retrieval | |
US9245523B2 (en) | Method and apparatus for expansion of search queries on large vocabulary continuous speech recognition transcripts | |
Hauptmann et al. | Informedia: News-on-demand multimedia information acquisition and retrieval | |
US7487094B1 (en) | System and method of call classification with context modeling based on composite words | |
Chelba et al. | Retrieval and browsing of spoken content | |
Hansen et al. | Speechfind: Advances in spoken document retrieval for a national gallery of the spoken word | |
US7292979B2 (en) | Time ordered indexing of audio data | |
US6990448B2 (en) | Database annotation and retrieval including phoneme data | |
JP3488174B2 (en) | Method and apparatus for retrieving speech information using content information and speaker information | |
US20040204939A1 (en) | Systems and methods for speaker change detection | |
Koumpis et al. | Content-based access to spoken audio | |
Cardillo et al. | Phonetic searching vs. LVCSR: How to find what you really want in audio archives | |
Wilcox et al. | Annotation and segmentation for multimedia indexing and retrieval | |
GB2451938A (en) | Methods and apparatus for searching of spoken audio data | |
Clements et al. | Phonetic searching of digital audio | |
Leavitt | Let's hear it for audio mining | |
Zhou et al. | Speechfind: an experimental on-line spoken document retrieval system for historical audio archives. | |
Hansen et al. | Audio stream phrase recognition for a national gallery of the spoken word:" one small step". | |
Nouza et al. | Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives | |
Chelba et al. | Speech retrieval | |
Meng et al. | Spoken document retrieval for the languages of Hong Kong | |
Viswanathan et al. | Multimedia document retrieval using speech and speaker recognition | |
Kurimo et al. | Speech transcription and spoken document retrieval in Finnish |