[go: up one dir, main page]

PL3493205T3 - Sposób i urządzenie do adaptacyjnego wykrywania aktywności głosowej w wejściowym sygnale audio - Google Patents

Sposób i urządzenie do adaptacyjnego wykrywania aktywności głosowej w wejściowym sygnale audio

Info

Publication number
PL3493205T3
PL3493205T3 PL18214325T PL18214325T PL3493205T3 PL 3493205 T3 PL3493205 T3 PL 3493205T3 PL 18214325 T PL18214325 T PL 18214325T PL 18214325 T PL18214325 T PL 18214325T PL 3493205 T3 PL3493205 T3 PL 3493205T3
Authority
PL
Poland
Prior art keywords
audio signal
input audio
voice activity
adaptively detecting
adaptively
Prior art date
Application number
PL18214325T
Other languages
English (en)
Inventor
Zhe Wang
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Publication of PL3493205T3 publication Critical patent/PL3493205T3/pl

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • G10L21/0308Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/20Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208Subband vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/22Mode decision, i.e. based on audio signal content versus external parameters
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L25/84Detection of presence or absence of voice signals for discriminating voice from noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L2025/783Detection of presence or absence of voice signals based on threshold decision
    • G10L2025/786Adaptive threshold

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Noise Elimination (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Circuit For Audible Band Transducer (AREA)
PL18214325T 2010-12-24 2010-12-24 Sposób i urządzenie do adaptacyjnego wykrywania aktywności głosowej w wejściowym sygnale audio PL3493205T3 (pl)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP10861147.6A EP2619753B1 (en) 2010-12-24 2010-12-24 Method and apparatus for adaptively detecting voice activity in input audio signal
EP14156678.6A EP2743924B1 (en) 2010-12-24 2010-12-24 Method and apparatus for adaptively detecting a voice activity in an input audio signal
EP18214325.5A EP3493205B1 (en) 2010-12-24 2010-12-24 Method and apparatus for adaptively detecting a voice activity in an input audio signal
PCT/CN2010/080227 WO2012083555A1 (en) 2010-12-24 2010-12-24 Method and apparatus for adaptively detecting voice activity in input audio signal

Publications (1)

Publication Number Publication Date
PL3493205T3 true PL3493205T3 (pl) 2021-09-20

Family

ID=46313053

Family Applications (1)

Application Number Title Priority Date Filing Date
PL18214325T PL3493205T3 (pl) 2010-12-24 2010-12-24 Sposób i urządzenie do adaptacyjnego wykrywania aktywności głosowej w wejściowym sygnale audio

Country Status (10)

Country Link
US (5) US9368112B2 (pl)
EP (5) EP2743924B1 (pl)
CN (1) CN102959625B9 (pl)
DK (1) DK3493205T3 (pl)
ES (3) ES2987086T3 (pl)
HU (1) HUE053127T2 (pl)
PL (1) PL3493205T3 (pl)
PT (1) PT3493205T (pl)
SI (1) SI3493205T1 (pl)
WO (1) WO2012083555A1 (pl)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2440627C2 (ru) * 2007-02-26 2012-01-20 Долби Лэборетериз Лайсенсинг Корпорейшн Повышение разборчивости речи в звукозаписи развлекательных программ
MX344169B (es) 2012-12-21 2016-12-07 Fraunhofer Ges Forschung Generacion de ruido de confort con alta resolucion espectro-temporal en transmision discontinua de señales de audio.
JP6335190B2 (ja) * 2012-12-21 2018-05-30 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン 低ビットレートで背景ノイズをモデル化するためのコンフォートノイズ付加
CN106409310B (zh) 2013-08-06 2019-11-19 华为技术有限公司 一种音频信号分类方法和装置
US8990079B1 (en) 2013-12-15 2015-03-24 Zanavox Automatic calibration of command-detection thresholds
CN104916292B (zh) * 2014-03-12 2017-05-24 华为技术有限公司 检测音频信号的方法和装置
CN104036777A (zh) * 2014-05-22 2014-09-10 哈尔滨理工大学 一种语音活动检测方法及装置
KR20240011875A (ko) * 2014-07-28 2024-01-26 삼성전자주식회사 패킷 손실 은닉방법 및 장치와 이를 적용한 복호화방법 및 장치
CN105810214B (zh) * 2014-12-31 2019-11-05 展讯通信(上海)有限公司 语音激活检测方法及装置
US9613640B1 (en) 2016-01-14 2017-04-04 Audyssey Laboratories, Inc. Speech/music discrimination
US10339962B2 (en) 2017-04-11 2019-07-02 Texas Instruments Incorporated Methods and apparatus for low cost voice activity detector
CN107393558B (zh) * 2017-07-14 2020-09-11 深圳永顺智信息科技有限公司 语音活动检测方法及装置
EP3432306A1 (en) * 2017-07-18 2019-01-23 Harman Becker Automotive Systems GmbH Speech signal leveling
CN107895573B (zh) * 2017-11-15 2021-08-24 百度在线网络技术(北京)有限公司 用于识别信息的方法及装置
US11430485B2 (en) * 2019-11-19 2022-08-30 Netflix, Inc. Systems and methods for mixing synthetic voice with original audio tracks
WO2021195429A1 (en) * 2020-03-27 2021-09-30 Dolby Laboratories Licensing Corporation Automatic leveling of speech content
CN114242116B (zh) * 2022-01-05 2024-08-02 成都锦江电子系统工程有限公司 一种语音的话音与非话音的综合判决方法

Family Cites Families (82)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276765A (en) * 1988-03-11 1994-01-04 British Telecommunications Public Limited Company Voice activity detection
AU633673B2 (en) * 1990-01-18 1993-02-04 Matsushita Electric Industrial Co., Ltd. Signal processing device
US5537509A (en) * 1990-12-06 1996-07-16 Hughes Electronics Comfort noise generation for digital communication systems
US5509102A (en) * 1992-07-01 1996-04-16 Kokusai Electric Co., Ltd. Voice encoder using a voice activity detector
CA2110090C (en) * 1992-11-27 1998-09-15 Toshihiro Hayata Voice encoder
US5450484A (en) * 1993-03-01 1995-09-12 Dialogic Corporation Voice detection
US5459814A (en) * 1993-03-26 1995-10-17 Hughes Aircraft Company Voice activity detector for speech signals in variable background noise
US5659622A (en) * 1995-11-13 1997-08-19 Motorola, Inc. Method and apparatus for suppressing noise in a communication system
FI100840B (fi) * 1995-12-12 1998-02-27 Nokia Mobile Phones Ltd Kohinanvaimennin ja menetelmä taustakohinan vaimentamiseksi kohinaises ta puheesta sekä matkaviestin
US5689615A (en) * 1996-01-22 1997-11-18 Rockwell International Corporation Usage of voice activity detection for efficient coding of speech
JP3255584B2 (ja) * 1997-01-20 2002-02-12 ロジック株式会社 有音検知装置および方法
US6104993A (en) * 1997-02-26 2000-08-15 Motorola, Inc. Apparatus and method for rate determination in a communication system
EP0867856B1 (fr) * 1997-03-25 2005-10-26 Koninklijke Philips Electronics N.V. "Méthode et dispositif de detection d'activité vocale"