The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge
Lanchantin et al., 2015
- Document ID
- 9859045494677214689
- Author
- Lanchantin P
- Gales M
- Karanasou P
- Liu X
- Qian Y
- Wang L
- Woodland P
- Zhang C
- Publication year
- 2015
- Publication venue
- 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Snippet
We describe the alignment systems developed both for the preparation of data for the Multi-Genre Broadcast (MGB) challenge and for our participation in the transcription and alignment tasks. Captions of varying quality are aligned with the audio of TV shows that …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L15/265—Speech recognisers specially adapted for particular applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Bell et al. | The MGB challenge: Evaluating multi-genre broadcast media recognition | |
Renals et al. | Recognition and understanding of meetings: the AMI and AMIDA projects | |
Hazen | Automatic alignment and error correction of human generated transcripts for long speech recordings. | |
US9117450B2 (en) | Combining re-speaking, partial agent transcription and ASR for improved accuracy / human guided ASR | |
US6718303B2 (en) | Apparatus and method for automatically generating punctuation marks in continuous speech recognition | |
Huijbregts | Segmentation, diarization and speech transcription: surprise data unraveled | |
JP5149107B2 (en) | Sound processing apparatus and program | |
JP4869268B2 (en) | Acoustic model learning apparatus and program | |
JP6323947B2 (en) | Acoustic event recognition apparatus and program | |
Lanchantin et al. | The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge | |
Bell et al. | Transcription of multi-genre media archives using out-of-domain data | |
Lanchantin et al. | Automatic transcription of multi-genre media archives | |
Levin et al. | Automated closed captioning for Russian live broadcasting. | |
Lanchantin et al. | Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems. | |
Long et al. | Improving lightly supervised training for broadcast transcriptions | |
Nouza et al. | Making Czech historical radio archive accessible and searchable for wide public | |
JP2013050605A (en) | Language model switching device and program for the same | |
Motlicek et al. | English spoken term detection in multilingual recordings. | |
Nouza et al. | System for producing subtitles to internet audio-visual documents | |
Lim et al. | Developing an automatic speech recognizer for Filipino with English code-switching in news broadcast | |
Brugnara et al. | Analysis of the Characteristics of Talk-show TV Programs. | |
Kubala et al. | Broadcast news transcription | |
Mizera et al. | Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of casual Czech | |
Sárosi et al. | On modeling non-word events in large vocabulary continuous speech recognition | |
JP2010044171A (en) | Subtitle output device, subtitle output method and program |