Liu et al., 2022 - Google Patents

MOS prediction network for non-intrusive speech quality assessment in online conferencing

Document ID
10236130779183135732
Author
Liu W
Xie C
Publication year
2022
Publication venue
Proc. Interspeech 2022

Snippet

Speech quality is a major indicator of the quality of service that describes the performance of a speech communication network. Intrusive speech quality assessment generally requires a clean reference speech signal for evaluation, which is not available in applications such as online …

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Supervisory, monitoring, management, i.e. operation, administration, maintenance or testing arrangements
    • H04M3/2236 Quality of speech transmission monitoring
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/18 Comparators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Similar Documents

Publication Publication Date Title
Reddy et al. DNSMOS P.835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors
Manocha et al. A differentiable perceptual audio metric learned from just noticeable differences
Reddy et al. A scalable noisy speech dataset and online subjective test framework
Mittag et al. NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets
Mittag et al. Non-intrusive speech quality assessment for super-wideband speech communication networks
Serrà et al. SESQA: semi-supervised learning for speech quality assessment
Fu et al. MetricGAN-U: Unsupervised speech enhancement/dereverberation based only on noisy/reverberated speech
Malfait et al. P.563—The ITU-T standard for single-ended speech quality assessment
Liu et al. X-SEPFORMER: End-to-end speaker extraction network with explicit optimization on speaker confusion
BR112021012308A2 (en) EQUIPMENT AND METHOD FOR SOURCE SEPARATION USING A SOUND QUALITY ESTIMATE AND CONTROL
Yu et al. MetricNet: Towards improved modeling for non-intrusive speech quality assessment
Kawanaka et al. Stable training of DNN for speech enhancement based on perceptually-motivated black-box cost function
Subakan et al. REAL-M: Towards speech separation on real mixtures
Ristea et al. ICASSP 2024 speech signal improvement challenge
Dubey et al. Non-intrusive speech quality assessment using several combinations of auditory features
Zhang et al. An end-to-end non-intrusive model for subjective and objective real-world speech assessment using a multi-task framework
Cutler et al. ICASSP 2023 speech signal improvement challenge
Diener et al. PLCMOS--a data-driven non-intrusive metric for the evaluation of packet loss concealment algorithms
Mittag et al. Full-reference speech quality estimation with attentional siamese neural networks
Manocha et al. Audio similarity is unreliable as a proxy for audio quality
Rosenbaum et al. Differentiable mean opinion score regularization for perceptual speech enhancement
Manocha et al. SQAPP: No-reference speech quality assessment via pairwise preference
Hajal et al. MOSRA: Joint mean opinion score and room acoustics speech quality assessment
Mumtaz et al. Nonintrusive perceptual audio quality assessment for user-generated content using deep learning
Liu et al. MOS prediction network for non-intrusive speech quality assessment in online conferencing