Wang et al., 2018 - Google Patents

Two-stage enhancement of noisy and reverberant microphone array speech for automatic speech recognition systems trained with only clean speech

Document ID
16251994478918878122
Author
Wang Q
Wang S
Ge F
Han C
Lee J
Guo L
Lee C
Publication year
2018
Publication venue
2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Snippet

We propose a two-stage approach to enhancement of far-field microphone array speech collected in reverberant conditions corrupted by interfering speakers and noises. We intend to produce top-quality enhanced speech to be used by a black-box automatic speech …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/18 Methods or devices for transmitting, conducting, or directing sound
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups

Similar Documents

Publication and Title
CN112017681B (en) Method and system for enhancing directional voice
Kinoshita et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
Schwartz et al. Multi-microphone speech dereverberation and noise reduction using relative early transfer functions
CN105869651B (en) Dual-channel beamforming speech enhancement method based on noise mixing coherence
US8880396B1 (en) Spectrum reconstruction for automatic speech recognition
Mertins et al. Room impulse response shortening/reshaping with infinity- and $p$-norm optimization
CN114694670B (en) A microphone array speech enhancement system and method based on multi-task network
CN110660406A (en) Real-time voice noise reduction method of double-microphone mobile phone in close-range conversation scene
Chen et al. Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection
Hussain et al. Ensemble hierarchical extreme learning machine for speech dereverberation
Nesta et al. A flexible spatial blind source extraction framework for robust speech recognition in noisy environments
Sarabia et al. Spatial Librispeech: An augmented dataset for spatial audio learning
Wang et al. Two-stage enhancement of noisy and reverberant microphone array speech for automatic speech recognition systems trained with only clean speech
Meng et al. Deep Kronecker product beamforming for large-scale microphone arrays
Yang et al. Design and optimization of superdirective beamforming and post-filtering for speech enhancement
Kovalyov et al. Dfsnet: A steerable neural beamformer invariant to microphone array configuration for real-time, low-latency speech enhancement
Mohammadamini et al. Compensate multiple distortions for speaker recognition systems
Jing et al. End-to-end doa-guided speech extraction in noisy multi-talker scenarios
Li et al. Speech separation based on reliable binaural cues with two-stage neural network in noisy-reverberant environments
Xue et al. A study on improving acoustic model for robust and far-field speech recognition
Himawan et al. Dealing with uncertainty in microphone placement in a microphone array speech recognition system
JP2014143570A (en) Sound pick-up device and reproducer
Youssef et al. From monaural to binaural speaker recognition for humanoid robots
Zhou et al. Combined beamforming and deep neural networks for multichannel speech enhancement
Chen et al. Early reflections based speech enhancement