Edraki et al., 2024 - Google Patents
Speaker adaptation for enhancement of bone-conducted speech
- Document ID: 11411193931536919947
- Authors: Edraki A, Chan W, Jensen J, Fogerty D
- Publication year: 2024
- Publication venue: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Snippet: Deep neural network (DNN)-based speech enhancement models often face challenges in maintaining their performance for speakers not encountered during training. This challenge is exacerbated in applications such as enhancement and bandwidth extension of bone …
Classifications
- G10L21/0208—Noise filtering
- G10L21/013—Adapting to target pitch
- G10L17/04—Training, enrolment or model building
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- H04R2430/03—Synergistic effects of band splitting and sub-band processing
- H04R2225/43—Signal processing in hearing aids to enhance the speech intelligibility
- H04R25/407—Circuits for combining signals of a plurality of transducers
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
Similar Documents
- Gabbay et al., Seeing through noise: Visually driven speaker separation and enhancement
- Lv et al., S-DCCRN: Super wide band DCCRN with learnable complex feature for speech enhancement
- Ren et al., A causal U-Net based neural beamforming network for real-time multi-channel speech enhancement
- US20200005770A1, Sound processing apparatus
- Edraki et al., Speaker adaptation for enhancement of bone-conducted speech
- Akeroyd et al., The 2nd Clarity Enhancement Challenge for hearing aid speech intelligibility enhancement: Overview and outcomes
- Ju et al., TEA-PSE 2.0: Sub-band network for real-time personalized speech enhancement
- Strake et al., INTERSPEECH 2020 Deep Noise Suppression Challenge: A fully convolutional recurrent network (FCRN) for joint dereverberation and denoising
- Wang et al., Wavelet speech enhancement based on nonnegative matrix factorization
- Li et al., Single-channel speech dereverberation via generative adversarial training
- Rao et al., INTERSPEECH 2021 ConferencingSpeech Challenge: Towards far-field multi-channel speech enhancement for video conferencing
- Healy et al., A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation
- Healy et al., A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions
- CN111968627B, Bone conduction voice enhancement method based on joint dictionary learning and sparse representation
- Shankar et al., Influence of MVDR beamformer on a speech enhancement based smartphone application for hearing aids
- Liu et al., Gesper: A restoration-enhancement framework for general speech reconstruction
- Ohlenbusch et al., Multi-microphone noise data augmentation for DNN-based own voice reconstruction for hearables in noisy environments
- Goehring et al., Speech enhancement for hearing-impaired listeners using deep neural networks with auditory-model based features
- Tolooshams et al., A training framework for stereo-aware speech enhancement using deep neural networks
- Gaultier et al., Recovering speech intelligibility with deep learning and multiple microphones in noisy-reverberant situations for people using cochlear implants
- Kashani et al., Speech enhancement via deep spectrum image translation network
- CN111009259B, Audio processing method and device
- Magadum et al., An innovative method for improving speech intelligibility in automatic sound classification based on Relative-CNN-RNN
- Dashtipour et al., Evaluating the audio-visual speech enhancement challenge (AVSEC) baseline model using an out-of-domain free-flowing corpus
- Gergen et al., Source separation by feature-based clustering of microphones in ad hoc arrays