Liu et al., 2022 - Google Patents

MOS prediction network for non-intrusive speech quality assessment in online conferencing

Document ID: 10236130779183135732
Author: Liu W; Xie C
Publication year: 2022
Publication venue: Proc. Interspeech 2022

Snippet

Speech quality is a major indicator of quality of service, describing the performance of a speech communication network. Intrusive speech quality assessment generally requires a clean reference speech signal for evaluation, which is not available in applications such as online …

Classifications

- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Supervisory, monitoring, management, i.e. operation, administration, maintenance or testing arrangements
- H04M3/2236—Quality of speech transmission monitoring
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/005—Correction of errors induced by the transmission channel, if related to the coding algorithm
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/18—Comparators
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Similar Documents

Publication	Publication Date	Title
Reddy et al.	2022	DNSMOS P.835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors
Manocha et al.	2020	A differentiable perceptual audio metric learned from just noticeable differences
Reddy et al.	2019	A scalable noisy speech dataset and online subjective test framework
Mittag et al.	2021	NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets
Mittag et al.	2019	Non-intrusive speech quality assessment for super-wideband speech communication networks
Serrà et al.	2021	SESQA: semi-supervised learning for speech quality assessment
Fu et al.	2022	MetricGAN-U: Unsupervised speech enhancement/dereverberation based only on noisy/reverberated speech
Malfait et al.	2006	P.563—The ITU-T standard for single-ended speech quality assessment
Liu et al.	2023	X-SEPFORMER: End-to-end speaker extraction network with explicit optimization on speaker confusion
BR112021012308A2 (en)	2021-09-08	EQUIPMENT AND METHOD FOR SOURCE SEPARATION USING A SOUND QUALITY ESTIMATE AND CONTROL
Yu et al.	2021	Metricnet: Towards improved modeling for non-intrusive speech quality assessment
Kawanaka et al.	2020	Stable training of DNN for speech enhancement based on perceptually-motivated black-box cost function
Subakan et al.	2022	REAL-M: Towards speech separation on real mixtures
Ristea et al.	2025	ICASSP 2024 speech signal improvement challenge
Dubey et al.	2013	Non-intrusive speech quality assessment using several combinations of auditory features
Zhang et al.	2021	An end-to-end non-intrusive model for subjective and objective real-world speech assessment using a multi-task framework
Cutler et al.	2024	ICASSP 2023 speech signal improvement challenge
Diener et al.	2023	PLCMOS--a data-driven non-intrusive metric for the evaluation of packet loss concealment algorithms
Mittag et al.	2020	Full-reference speech quality estimation with attentional siamese neural networks
Manocha et al.	2022	Audio similarity is unreliable as a proxy for audio quality
Rosenbaum et al.	2023	Differentiable mean opinion score regularization for perceptual speech enhancement
Manocha et al.	2022	SQAPP: No-reference speech quality assessment via pairwise preference
Hajal et al.	2022	MOSRA: Joint mean opinion score and room acoustics speech quality assessment
Mumtaz et al.	2021	Nonintrusive perceptual audio quality assessment for user-generated content using deep learning