Normalization of cepstral features for speech recognition (Normierung von cepstralen Merkmalen für die Spracherkennung)

Info

Publication number
DE602005026949D1
Authority
DE
Germany
Prior art keywords
standardization
speech recognition
cepstral features
cepstral
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE602005026949T
Other languages
English (en)
Inventor
Igor Zlokarnik
Laurence S Gillick
Jordan Cohen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Voice Signal Technologies Inc
Original Assignee
Voice Signal Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2004-01-12
Filing date
2005-01-10
Publication date
2011-04-28
Application filed by Voice Signal Technologies Inc
Publication of DE602005026949D1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)
  • Stereophonic System (AREA)
  • Machine Translation (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
DE602005026949T 2004-01-12 2005-01-10 Normalization of cepstral features for speech recognition Expired - Lifetime DE602005026949D1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US53586304P 2004-01-12 2004-01-12
PCT/US2005/000757 WO2005070130A2 (en) 2004-01-12 2005-01-10 Speech recognition channel normalization utilizing measured energy values from speech utterance

Publications (1)

Publication Number Publication Date
DE602005026949D1 (de) 2011-04-28

Family

ID=34806967

Family Applications (1)

Application Number Title Priority Date Filing Date
DE602005026949T Expired - Lifetime DE602005026949D1 (de) 2004-01-12 2005-01-10 Normalization of cepstral features for speech recognition

Country Status (6)

Country Link
US (1) US7797157B2 (de)
EP (1) EP1774516B1 (de)
JP (1) JP4682154B2 (de)
CN (1) CN101228577B (de)
DE (1) DE602005026949D1 (de)
WO (1) WO2005070130A2 (de)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7702505B2 (en) * 2004-12-14 2010-04-20 Electronics And Telecommunications Research Institute Channel normalization apparatus and method for robust speech recognition
US20070263848A1 (en) * 2006-04-19 2007-11-15 Tellabs Operations, Inc. Echo detection and delay estimation using a pattern recognition approach and cepstral correlation
WO2008077281A1 (en) * 2006-12-27 2008-07-03 Intel Corporation Method and apparatus for speech segmentation
JP4864783B2 (ja) * 2007-03-23 2012-02-01 Kddi株式会社 Pattern matching device, pattern matching program, and pattern matching method
US8930179B2 (en) 2009-06-04 2015-01-06 Microsoft Corporation Recognition using re-recognition and statistical classification
US8768695B2 (en) * 2012-06-13 2014-07-01 Nuance Communications, Inc. Channel normalization using recognition feedback
US9984676B2 (en) * 2012-07-24 2018-05-29 Nuance Communications, Inc. Feature normalization inputs to front end processing for automatic speech recognition
US10376338B2 (en) 2014-05-13 2019-08-13 Covidien Lp Surgical robotic arm support systems and methods of use
US9953661B2 (en) * 2014-09-26 2018-04-24 Cirrus Logic Inc. Neural network voice activity detection employing running range normalization
CN107112011B (zh) * 2014-12-22 2021-11-09 英特尔公司 Cepstral variance normalization for audio feature extraction
US10540990B2 (en) * 2017-11-01 2020-01-21 International Business Machines Corporation Processing of speech signals

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2797949B2 (ja) * 1994-01-31 1998-09-17 日本電気株式会社 Speech recognition device
US5604839A (en) * 1994-07-29 1997-02-18 Microsoft Corporation Method and system for improving speech recognition through front-end normalization of feature vectors
GB9419388D0 (en) * 1994-09-26 1994-11-09 Canon Kk Speech analysis
US5677990A (en) * 1995-05-05 1997-10-14 Panasonic Technologies, Inc. System and method using N-best strategy for real time recognition of continuously spelled names
US6633842B1 (en) * 1999-10-22 2003-10-14 Texas Instruments Incorporated Speech recognition front-end feature extraction for noisy speech
US6202047B1 (en) * 1998-03-30 2001-03-13 At&T Corp. Method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients
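JPH11311994A (ja) * 1998-04-30 1999-11-09 Sony Corp Information processing device and method, and providing medium
CN1144172C (zh) * 1998-04-30 2004-03-31 松下电器产业株式会社 Eigenvoice-based speaker adaptation method including a maximum likelihood method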
US6173258B1 (en) * 1998-09-09 2001-01-09 Sony Corporation Method for reducing noise distortions in a speech recognition system
US6253175B1 (en) * 1998-11-30 2001-06-26 International Business Machines Corporation Wavelet-based energy binning cepstal features for automatic speech recognition
US6658385B1 (en) * 1999-03-12 2003-12-02 Texas Instruments Incorporated Method for transforming HMMs for speaker-independent recognition in a noisy environment
GB2349259B (en) * 1999-04-23 2003-11-12 Canon Kk Speech processing apparatus and method
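JP2001134295A (ja) * 1999-08-23 2001-05-18 Sony Corp Encoding device and encoding method, recording device and recording method, transmitting device and transmitting method, decoding device and encoding method, reproducing device and reproducing method, and recording medium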
US6502070B1 (en) * 2000-04-28 2002-12-31 Nortel Networks Limited Method and apparatus for normalizing channel specific speech feature elements
DE60110541T2 (de) * 2001-02-06 2006-02-23 Sony International (Europe) Gmbh Method for speech recognition with noise-dependent variance normalization
US7062433B2 (en) * 2001-03-14 2006-06-13 Texas Instruments Incorporated Method of speech recognition with compensation for both channel distortion and background noise
US7035797B2 (en) * 2001-12-14 2006-04-25 Nokia Corporation Data-driven filtering of cepstral time trajectories for robust speech recognition
IL148592A0 (en) * 2002-03-10 2002-09-12 Ycd Multimedia Ltd Dynamic normalizing
US7117148B2 (en) * 2002-04-05 2006-10-03 Microsoft Corporation Method of noise reduction using correction vectors based on dynamic aspects of speech and noise normalization
US7197456B2 (en) * 2002-04-30 2007-03-27 Nokia Corporation On-line parametric histogram normalization for noise robust speech recognition
JP4239479B2 (ja) * 2002-05-23 2009-03-18 日本電気株式会社 Speech recognition device, speech recognition method, and speech recognition program

Also Published As

Publication number Publication date
CN101228577B (zh) 2011-11-23
US7797157B2 (en) 2010-09-14
CN101228577A (zh) 2008-07-23
EP1774516A2 (de) 2007-04-18
EP1774516A4 (de) 2009-11-11
US20050182621A1 (en) 2005-08-18
JP2007536562A (ja) 2007-12-13
WO2005070130A2 (en) 2005-08-04
WO2005070130A3 (en) 2009-04-09
EP1774516B1 (de) 2011-03-16
JP4682154B2 (ja) 2011-05-11

Similar Documents

Publication Publication Date Title
DE60318544D1 (de) Language model for speech recognition
DE602005024894D1 (de) Distributed speech recognition for mobile devices
EP2171710C0 (de) Mosaics for automatic speech recognition (ASR)
DE60115738D1 (de) Language models for speech recognition
DE602004021716D1 (de) Speech recognition system
DE602006000090D1 (de) Confidence measure for a spoken dialogue system
EP1818909A4 (de) Voice recognition system
DE60229095D1 (de) Pronunciations in multiple languages for speech recognition
DE60323362D1 (de) Speech recognition device
DE602005000628D1 (de) Method and apparatus for multi-layered distributed speech recognition
EP1894186A4 (de) Speech recognition system for secure information
DE60109105D1 (de) Hierarchical dictionaries for speech recognition
DE60020660D1 (de) Context-dependent acoustic models for speech recognition with eigenvoice adaptation
EP1799865A4 (de) Method for the administration of iloperidone
DE60126882D1 (de) Hierarchical dictionaries for speech recognition
EP2260264A4 (de) Speech recognition grammar selection based on context
EP1747553A4 (de) Detection of the end of an utterance in a speech recognition system
DE602005026949D1 (de) Normalization of cepstral features for speech recognition
DE602005000896D1 (de) Speech segmentation
DE602006009385D1 (de) Speech synthesis method integrated in a mobile terminal
DE602004030279D1 (de) Forming method by forging and forming method for housings
EP1504442A4 (de) Voice control and speech recognition for hand-held devices
EP1889464A4 (de) Monitoring system with speech recognition
DE502004003081D1 (de) User-adaptive dialogue support for spoken dialogue systems
DE602006002721D1 (de) Speech synthesizer