The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge

Lanchantin et al., 2015

Document ID: 9859045494677214689
Author: Lanchantin P; Gales M; Karanasou P; Liu X; Qian Y; Wang L; Woodland P; Zhang C
Publication year: 2015
Publication venue: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Snippet

We describe the alignment systems developed both for the preparation of data for the Multi-Genre Broadcast (MGB) challenge and for our participation in the transcription and alignment tasks. Captions of varying quality are aligned with the audio of TV shows that …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L15/265—Speech recognisers specially adapted for particular applications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

Similar Documents

Publication	Publication Date	Title
Bell et al.	2015	The MGB challenge: Evaluating multi-genre broadcast media recognition
Renals et al.	2007	Recognition and understanding of meetings: the AMI and AMIDA projects
Hazen	2006	Automatic alignment and error correction of human generated transcripts for long speech recordings.
US9117450B2 (en)	2015-08-25	Combining re-speaking, partial agent transcription and ASR for improved accuracy / human guided ASR
US6718303B2 (en)	2004-04-06	Apparatus and method for automatically generating punctuation marks in continuous speech recognition
Huijbregts	2008	Segmentation, diarization and speech transcription: surprise data unraveled
JP5149107B2 (en)	2013-02-20	Sound processing apparatus and program
JP4869268B2 (en)	2012-02-08	Acoustic model learning apparatus and program
JP6323947B2 (en)	2018-05-16	Acoustic event recognition apparatus and program
Lanchantin et al.	2015	The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge
Bell et al.	2012	Transcription of multi-genre media archives using out-of-domain data
Lanchantin et al.	2013	Automatic transcription of multi-genre media archives
Levin et al.	2014	Automated closed captioning for Russian live broadcasting.
Lanchantin et al.	2016	Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.
Long et al.	2013	Improving lightly supervised training for broadcast transcriptions
Nouza et al.	2012	Making Czech historical radio archive accessible and searchable for wide public
JP2013050605A (en)	2013-03-14	Language model switching device and program for the same
Motlicek et al.	2010	English spoken term detection in multilingual recordings.
Nouza et al.	2015	System for producing subtitles to internet audio-visual documents
Lim et al.	2022	Developing an automatic speech recognizer for Filipino with English code-switching in news broadcast
Brugnara et al.	2012	Analysis of the Characteristics of Talk-show TV Programs.
Kubala et al.	1997	Broadcast news transcription
Mizera et al.	2014	Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of casual Czech
Sárosi et al.	2012	On modeling non-word events in large vocabulary continuous speech recognition
JP2010044171A (en)	2010-02-25	Subtitle output device, subtitle output method and program