Huang et al., 2018 - Google Patents

A regression approach to speech source localization exploiting deep neural network

Document ID
5562338033814302631
Author
Huang Z
Xu J
Pan J
Publication year
2018
Publication venue
2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM)

Snippet

This paper presents a data-driven framework for speech source localization (SSL) using a deep neural network (DNN), which directly constructs the nonlinear regression transform from the extracted features to the direction of arrival (DOA) of an indoor speech source …
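
The regression idea in the snippet can be illustrated with a minimal sketch. The snippet does not say which features or network the paper uses, so everything specific here is an assumption made for illustration: a hypothetical two-microphone far-field array, a single normalized time-difference-of-arrival (TDOA) feature, a one-hidden-layer network (`TinyMLP`), and synthetic training data. It shows only the general pattern of regressing a DOA angle from a feature, not the authors' method.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s (assumed)
MIC_SPACING = 0.1       # m, hypothetical two-microphone array


def tdoa_feature(theta_deg, rng, noise=0.01):
    """Far-field TDOA of a two-mic array, normalized to about [-1, 1], plus noise."""
    tau = MIC_SPACING * math.sin(math.radians(theta_deg)) / SPEED_OF_SOUND
    tau_max = MIC_SPACING / SPEED_OF_SOUND
    return tau / tau_max + rng.gauss(0.0, noise)


def make_dataset(n, rng):
    """Synthetic (feature, DOA / 90) pairs for angles in [-60, 60] degrees."""
    data = []
    for _ in range(n):
        theta = rng.uniform(-60.0, 60.0)
        data.append((tdoa_feature(theta, rng), theta / 90.0))
    return data


class TinyMLP:
    """One tanh hidden layer and a linear output: a scalar-to-scalar regressor."""

    def __init__(self, hidden, rng):
        self.w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def mse(self, data):
        return sum((self.predict(x) - t) ** 2 for x, t in data) / len(data)

    def train_epoch(self, data, lr):
        """One full-batch gradient-descent step on the mean squared error."""
        n, hidden = len(data), len(self.w1)
        gw1, gb1 = [0.0] * hidden, [0.0] * hidden
        gw2, gb2 = [0.0] * hidden, 0.0
        for x, t in data:
            h, y = self.forward(x)
            err = 2.0 * (y - t) / n          # dL/dy for this sample
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                dpre = err * self.w2[j] * (1.0 - h[j] ** 2)  # back through tanh
                gw1[j] += dpre * x
                gb1[j] += dpre
        for j in range(hidden):
            self.w1[j] -= lr * gw1[j]
            self.b1[j] -= lr * gb1[j]
            self.w2[j] -= lr * gw2[j]
        self.b2 -= lr * gb2


rng = random.Random(0)
train = make_dataset(100, rng)
model = TinyMLP(hidden=8, rng=rng)

initial_mse = model.mse(train)
for _ in range(400):
    model.train_epoch(train, lr=0.05)
final_mse = model.mse(train)

# Estimate the DOA for a noise-free feature from a source at 30 degrees.
estimate_deg = 90.0 * model.predict(math.sin(math.radians(30.0)))
print(f"MSE before: {initial_mse:.4f}, after: {final_mse:.4f}")
print(f"Estimated DOA for a 30 degree source: {estimate_deg:.1f} degrees")
```

Regression outputs a continuous angle directly, whereas a classifier would have to quantize the DOA into a fixed grid of direction classes. A realistic system would replace the single synthetic feature with features extracted from recorded multichannel audio.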

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/80 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
    • G01S3/802 Systems for determining direction or deviation from predetermined direction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones

Similar Documents

Publication Publication Date Title
CN109830245B (en) A method and system for multi-speaker speech separation based on beamforming
Adavanne et al. Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network
Li et al. Online direction of arrival estimation based on deep learning
Chakrabarty et al. Broadband DOA estimation using convolutional neural networks trained with noise signals
Takeda et al. Discriminative multiple sound source localization based on deep neural networks using independent location model
He et al. Deep neural networks for multiple speaker detection and localization
Xiao et al. A learning-based approach to direction of arrival estimation in noisy and reverberant environments
Perotin et al. Regression versus classification for neural network based audio source localization
CN111239680B (en) Direction-of-arrival estimation method based on differential array
Sivasankaran et al. Keyword-based speaker localization: Localizing a target speaker in a multi-speaker environment
Koldovský et al. Semi-blind noise extraction using partially known position of the target source
Huang et al. A regression approach to speech source localization exploiting deep neural network
Bai et al. Time difference of arrival (TDOA)-based acoustic source localization and signal extraction for intelligent audio classification
Dwivedi et al. Long-term temporal audio source localization using sh-crnn
Wang et al. Pseudo-determined blind source separation for ad-hoc microphone networks
Zheng et al. Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation
Zhang et al. Sound event localization and classification using WASN in Outdoor Environment
Wang et al. U-net based direct-path dominance test for robust direction-of-arrival estimation
Mane et al. Localization of steady sound source and direction detection of moving sound source using CNN
Chen et al. Overlapped Speech Detection Based on Spectral and Spatial Feature Fusion.
Xue et al. Noise robust direction of arrival estimation for speech source with weighted bispectrum spatial correlation matrix
Fu et al. Locate and beamform: Two-dimensional locating all-neural beamformer for multi-channel speech separation
Beit-On et al. Binaural direction-of-arrival estimation in reverberant environments using the direct-path dominance test
Hammer et al. FCN approach for dynamically locating multiple speakers
Pak et al. LOCATA challenge: A deep neural networks-based regression approach for direction-of-arrival estimation