Pacheco et al., 2006 - Google Patents

Spectral subtraction for reverberation reduction applied to automatic speech recognition

Pacheco et al., 2006

Document ID: 13992055157562068988
Author: Pacheco F; Seara R
Publication year: 2006
Publication venue: 2006 International Telecommunications Symposium

External Links

Cited by

Snippet

In real conditions, reverberation severely compromises the accuracy of automatic speech recognition (ASR) systems. The use and the assessment of spectral subtraction for reducing the reverberation effect in ASR systems is discussed. For such, a general procedure is …

Continue reading at ieeexplore.ieee.org (other versions)

230000003595 spectral 0 title abstract description 23

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication	Publication Date	Title
Tan et al.	2019	Real-time speech enhancement using an efficient convolutional recurrent network for dual-microphone mobile phones in close-talk scenarios
Williamson et al.	2017	Time-frequency masking in the complex domain for speech dereverberation and denoising
Yoshioka et al.	2012	Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition
JP5738020B2 (en)	2015-06-17	Speech recognition apparatus and speech recognition method
Hermansky et al.	2002	RASTA processing of speech
JP3229335B2 (en)	2001-11-19	Voice processing apparatus and method
Yamamoto et al.	2005	Enhanced robot speech recognition based on microphone array source separation and missing feature theory
Krueger et al.	2010	Model-based feature enhancement for reverberant speech recognition
Garg et al.	2016	A comparative study of noise reduction techniques for automatic speech recognition systems
DEREVERBERATION et al.	2014	REVERB Workshop 2014
Sadjadi et al.	2014	Blind spectral weighting for robust speaker identification under reverberation mismatch
Yen et al.	1999	Adaptive co-channel speech separation and recognition
Matassoni et al.	2002	Hidden Markov model training with contaminated speech material for distant-talking speech recognition
WO2006114101A1 (en)	2006-11-02	Detection of speech present in a noisy signal and speech enhancement making use thereof
O'Shaughnessy	2024	Speech enhancement—A review of modern methods
Martín-Doñas et al.	2017	Dual-channel DNN-based speech enhancement for smartphones
EP3847645A1 (en)	2021-07-14	Determining a room response of a desired source in a reverberant environment
Choi et al.	2014	Dual-microphone voice activity detection technique based on two-step power level difference ratio
Marti et al.	2012	Automatic speech recognition in cocktail-party situations: A specific training for separated speech
Flynn et al.	2008	Combined speech enhancement and auditory modelling for robust distributed speech recognition
Han et al.	2013	Reverberation and noise robust feature compensation based on IMM
Pacheco et al.	2006	Spectral subtraction for reverberation reduction applied to automatic speech recognition
Haeb‐Umbach et al.	2012	Reverberant speech recognition
Lathoud et al.	2006	Channel normalization for unsupervised spectral subtraction
Krishnamoorthy et al.	2009	Application of combined temporal and spectral processing methods for speaker recognition under noisy, reverberant or multi-speaker environments