Edraki et al., 2024 - Google Patents

Speaker adaptation for enhancement of bone-conducted speech

Document ID: 11411193931536919947
Authors: Edraki A; Chan W; Jensen J; Fogerty D
Publication year: 2024
Publication venue: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Snippet

Deep neural network (DNN)-based speech enhancement models often face challenges in maintaining their performance for speakers not encountered during training. This challenge is exacerbated in applications such as enhancement and bandwidth extension of bone …
Full text available at ieeexplore.ieee.org

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants, characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03 Synergistic effects of band splitting and sub-band processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2225/00 Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
    • H04R2225/43 Signal processing in hearing aids to enhance the speech intelligibility
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00 Deaf-aid sets providing an auditory perception; Electric tinnitus maskers providing an auditory perception
    • H04R25/40 Arrangements for obtaining a desired directivity characteristic
    • H04R25/407 Circuits for combining signals of a plurality of transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones

Similar Documents

Publication Publication Date Title
Gabbay et al. Seeing through noise: Visually driven speaker separation and enhancement
Lv et al. S-dccrn: Super wide band dccrn with learnable complex feature for speech enhancement
Ren et al. A Causal U-Net Based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement.
US20200005770A1 (en) Sound processing apparatus
Edraki et al. Speaker adaptation for enhancement of bone-conducted speech
Akeroyd et al. The 2nd clarity enhancement challenge for hearing aid speech intelligibility enhancement: Overview and outcomes
Ju et al. Tea-pse 2.0: Sub-band network for real-time personalized speech enhancement
Strake et al. INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising.
Wang et al. Wavelet speech enhancement based on nonnegative matrix factorization
Li et al. Single-channel speech dereverberation via generative adversarial training
Rao et al. INTERSPEECH 2021 ConferencingSpeech challenge: Towards far-field multi-channel speech enhancement for video conferencing
Healy et al. A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation
Healy et al. A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions
CN111968627B (en) Bone conduction voice enhancement method based on joint dictionary learning and sparse representation
Shankar et al. Influence of mvdr beamformer on a speech enhancement based smartphone application for hearing aids
Liu et al. Gesper: A restoration-enhancement framework for general speech reconstruction
Ohlenbusch et al. Multi-microphone noise data augmentation for DNN-based own voice reconstruction for hearables in noisy environments
Goehring et al. Speech enhancement for hearing-impaired listeners using deep neural networks with auditory-model based features
Tolooshams et al. A training framework for stereo-aware speech enhancement using deep neural networks
Gaultier et al. Recovering speech intelligibility with deep learning and multiple microphones in noisy-reverberant situations for people using cochlear implants
Kashani et al. Speech enhancement via deep spectrum image translation network
CN111009259B (en) Audio processing method and device
Magadum et al. An Innovative Method for Improving Speech Intelligibility in Automatic Sound Classification Based on Relative-CNN-RNN
Dashtipour et al. Evaluating the audio-visual speech enhancement challenge (AVSEC) baseline model using an out-of-domain free-flowing corpus
Gergen et al. Source separation by feature-based clustering of microphones in ad hoc arrays