Dwivedi et al., 2022 - Google Patents
Spherical harmonics domain-based approach for source localization in presence of directional interference
- Document ID: 13033064096852300578
- Authors: Dwivedi P; Routray G; Hegde R
- Publication year: 2022
- Publication venue: JASA Express Letters
Snippet
This paper presents a learning-based method for source localization in the presence of directional interference under reverberant and noisy conditions. The proposed method operates on the spherical harmonic decomposition of the spherical microphone array …
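The snippet above says the method operates on the spherical harmonic (SH) decomposition of the spherical microphone array signals. As an illustrative sketch only (not the authors' implementation), the standard least-squares SH decomposition of pressure samples captured at the microphone positions can be written as follows; the array geometry and order here are hypothetical:

```python
# Illustrative sketch: least-squares spherical harmonic decomposition of
# pressure signals captured by a spherical microphone array. This is a
# generic textbook construction, not the method proposed in the paper.
import numpy as np
from scipy.special import sph_harm

def sh_matrix(order, azimuth, polar):
    """Matrix whose columns are SH basis functions Y_n^m evaluated
    at the Q microphone directions; shape (Q, (order+1)**2)."""
    cols = []
    for n in range(order + 1):
        for m in range(-n, n + 1):
            # scipy's sph_harm takes (m, n, azimuthal angle, polar angle)
            cols.append(sph_harm(m, n, azimuth, polar))
    return np.stack(cols, axis=1)

def sh_decompose(pressure, order, azimuth, polar):
    """SH coefficients from Q microphone pressures via pseudoinverse."""
    Y = sh_matrix(order, azimuth, polar)
    return np.linalg.pinv(Y) @ pressure

# Hypothetical 32-microphone array; directions drawn at random here
# purely for illustration (a real array uses a fixed sampling scheme).
rng = np.random.default_rng(0)
az = rng.uniform(0.0, 2.0 * np.pi, 32)    # azimuth in [0, 2*pi)
pol = np.arccos(rng.uniform(-1.0, 1.0, 32))  # polar angle in [0, pi]
p = rng.standard_normal(32)               # one snapshot of mic pressures
coeffs = sh_decompose(p, order=3, azimuth=az, polar=pol)
print(coeffs.shape)  # (16,) coefficients for order 3, i.e. (3+1)**2
```

A decomposition of this shape is what the SH-domain features of such methods are typically built on; the number of microphones must be at least `(order+1)**2` for the least-squares fit to be well posed.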
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
            - G10L21/0216—Noise filtering characterised by the method used for estimating noise
              - G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
                - G10L2021/02166—Microphone arrays; Beamforming
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
      - G10L17/00—Speaker identification or verification
      - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
      - H04R3/00—Circuits for transducers, loudspeakers or microphones
        - H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
      - H04R1/00—Details of transducers, loudspeakers or microphones
        - H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
          - H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
            - H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
Similar Documents
| Publication | Title |
|---|---|
| US11282505B2 (en) | Acoustic signal processing with neural network using amplitude, phase, and frequency |
| CN109830245B (en) | A method and system for multi-speaker speech separation based on beamforming |
| US9100734B2 (en) | Systems, methods, apparatus, and computer-readable media for far-field multi-source tracking and separation |
| Liu et al. | Deep learning assisted sound source localization using two orthogonal first-order differential microphone arrays |
| Roman et al. | Binaural segregation in multisource reverberant environments |
| Huleihel et al. | Spherical array processing for acoustic analysis using room impulse responses and time-domain smoothing |
| Janský et al. | Auxiliary function-based algorithm for blind extraction of a moving speaker |
| SongGong et al. | Acoustic source localization in the circular harmonic domain using deep learning architecture |
| Dadvar et al. | Robust binaural speech separation in adverse conditions based on deep neural network with modified spatial features and training target |
| Zhang et al. | Deep learning-based direction-of-arrival estimation for multiple speech sources using a small scale array |
| Dwivedi et al. | Spherical harmonics domain-based approach for source localization in presence of directional interference |
| Wu et al. | Sound source localization based on multi-task learning and image translation network |
| Zaken et al. | Neural-network-based direction-of-arrival estimation for reverberant speech: the importance of energetic, temporal, and spatial information |
| Salvati et al. | Two-microphone end-to-end speaker joint identification and localization via convolutional neural networks |
| Bai et al. | Audio enhancement and intelligent classification of household sound events using a sparsely deployed array |
| Dwivedi et al. | Far-field source localization in spherical harmonics domain using acoustic intensity vector |
| Al-Ali et al. | Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments |
| Bai et al. | Design and implementation of a space domain spherical microphone array with application to source localization and separation |
| Nakano et al. | Automatic estimation of position and orientation of an acoustic source by a microphone array network |
| Yang et al. | A stacked self-attention network for two-dimensional direction-of-arrival estimation in hands-free speech communication |
| Schwartz et al. | A recursive expectation-maximization algorithm for speaker tracking and separation |
| Wang et al. | Speech separation and extraction by combining superdirective beamforming and blind source separation |
| Hammer et al. | FCN approach for dynamically locating multiple speakers |
| Li et al. | Beamformed feature for learning-based dual-channel speech separation |
| CN115497495A (en) | Spatial correlation feature extraction in neural network-based audio processing |