Lanchantin et al., 2015 - Google Patents

The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge

Document ID
9859045494677214689
Author
Lanchantin P
Gales M
Karanasou P
Liu X
Qian Y
Wang L
Woodland P
Zhang C
Publication year
2015
Publication venue
2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Snippet

We describe the alignment systems developed both for the preparation of data for the Multi-Genre Broadcast (MGB) challenge and for our participation in the transcription and alignment tasks. Captions of varying quality are aligned with the audio of TV shows that …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • G10L2015/088 Word spotting
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G10L15/26 Speech to text systems
    • G10L15/265 Speech recognisers specially adapted for particular applications
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/28 Constructional details of speech recognition systems
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

Similar Documents

Publication Title
Bell et al. The MGB challenge: Evaluating multi-genre broadcast media recognition
Renals et al. Recognition and understanding of meetings: the AMI and AMIDA projects
Hazen Automatic alignment and error correction of human generated transcripts for long speech recordings
US9117450B2 (en) Combining re-speaking, partial agent transcription and ASR for improved accuracy / human guided ASR
US6718303B2 (en) Apparatus and method for automatically generating punctuation marks in continuous speech recognition
Huijbregts Segmentation, diarization and speech transcription: surprise data unraveled
JP5149107B2 (en) Sound processing apparatus and program
JP4869268B2 (en) Acoustic model learning apparatus and program
JP6323947B2 (en) Acoustic event recognition apparatus and program
Bell et al. Transcription of multi-genre media archives using out-of-domain data
Lanchantin et al. Automatic transcription of multi-genre media archives
Levin et al. Automated closed captioning for Russian live broadcasting
Lanchantin et al. Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems
Long et al. Improving lightly supervised training for broadcast transcriptions
Nouza et al. Making Czech historical radio archive accessible and searchable for wide public
JP2013050605A (en) Language model switching device and program for the same
Motlicek et al. English spoken term detection in multilingual recordings
Nouza et al. System for producing subtitles to internet audio-visual documents
Lim et al. Developing an automatic speech recognizer for Filipino with English code-switching in news broadcast
Brugnara et al. Analysis of the Characteristics of Talk-show TV Programs
Kubala et al. Broadcast news transcription
Mizera et al. Impact of irregular pronunciation on phonetic segmentation of Nijmegen Corpus of Casual Czech
Sárosi et al. On modeling non-word events in large vocabulary continuous speech recognition
JP2010044171A (en) Subtitle output device, subtitle output method and program