DE602005026949D1 - Normalization of cepstral features for speech recognition - Google Patents
Normalization of cepstral features for speech recognition
- Publication number
- DE602005026949D1
- Authority
- DE
- Germany
- Prior art keywords
- standardization
- speech recognition
- cepstral features
- cepstral
- features
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
- Stereophonic System (AREA)
- Machine Translation (AREA)
- Time-Division Multiplex Systems (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US53586304P | 2004-01-12 | 2004-01-12 | |
PCT/US2005/000757 WO2005070130A2 (en) | 2004-01-12 | 2005-01-10 | Speech recognition channel normalization utilizing measured energy values from speech utterance |
Publications (1)
Publication Number | Publication Date |
---|---|
DE602005026949D1 (de) | 2011-04-28 |
Family
ID=34806967
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE602005026949T (Expired - Lifetime), published as DE602005026949D1 (de) | 2004-01-12 | 2005-01-10 | Normalization of cepstral features for speech recognition |
Country Status (6)
Country | Link |
---|---|
US (1) | US7797157B2 (de) |
EP (1) | EP1774516B1 (de) |
JP (1) | JP4682154B2 (de) |
CN (1) | CN101228577B (de) |
DE (1) | DE602005026949D1 (de) |
WO (1) | WO2005070130A2 (de) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7702505B2 (en) * | 2004-12-14 | 2010-04-20 | Electronics And Telecommunications Research Institute | Channel normalization apparatus and method for robust speech recognition |
US20070263848A1 (en) * | 2006-04-19 | 2007-11-15 | Tellabs Operations, Inc. | Echo detection and delay estimation using a pattern recognition approach and cepstral correlation |
WO2008077281A1 (en) * | 2006-12-27 | 2008-07-03 | Intel Corporation | Method and apparatus for speech segmentation |
JP4864783B2 (ja) * | 2007-03-23 | 2012-02-01 | KDDI Corporation | Pattern matching device, pattern matching program, and pattern matching method |
US8930179B2 (en) | 2009-06-04 | 2015-01-06 | Microsoft Corporation | Recognition using re-recognition and statistical classification |
US8768695B2 (en) * | 2012-06-13 | 2014-07-01 | Nuance Communications, Inc. | Channel normalization using recognition feedback |
US9984676B2 (en) * | 2012-07-24 | 2018-05-29 | Nuance Communications, Inc. | Feature normalization inputs to front end processing for automatic speech recognition |
US10376338B2 (en) | 2014-05-13 | 2019-08-13 | Covidien Lp | Surgical robotic arm support systems and methods of use |
US9953661B2 (en) * | 2014-09-26 | 2018-04-24 | Cirrus Logic Inc. | Neural network voice activity detection employing running range normalization |
CN107112011B (zh) * | 2014-12-22 | 2021-11-09 | Intel Corporation | Cepstral variance normalization for audio feature extraction |
US10540990B2 (en) * | 2017-11-01 | 2020-01-21 | International Business Machines Corporation | Processing of speech signals |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2797949B2 (ja) * | 1994-01-31 | 1998-09-17 | NEC Corporation | Speech recognition device |
US5604839A (en) * | 1994-07-29 | 1997-02-18 | Microsoft Corporation | Method and system for improving speech recognition through front-end normalization of feature vectors |
GB9419388D0 (en) * | 1994-09-26 | 1994-11-09 | Canon Kk | Speech analysis |
US5677990A (en) * | 1995-05-05 | 1997-10-14 | Panasonic Technologies, Inc. | System and method using N-best strategy for real time recognition of continuously spelled names |
US6633842B1 (en) * | 1999-10-22 | 2003-10-14 | Texas Instruments Incorporated | Speech recognition front-end feature extraction for noisy speech |
US6202047B1 (en) * | 1998-03-30 | 2001-03-13 | At&T Corp. | Method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients |
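JPH11311994A (ja) * | 1998-04-30 | 1999-11-09 | Sony Corp | Information processing apparatus and method, and providing medium |
CN1144172C (zh) * | 1998-04-30 | 2004-03-31 | Matsushita Electric Industrial Co., Ltd. | Eigenvoice-based speaker adaptation method including maximum likelihood method |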
US6173258B1 (en) * | 1998-09-09 | 2001-01-09 | Sony Corporation | Method for reducing noise distortions in a speech recognition system |
US6253175B1 (en) * | 1998-11-30 | 2001-06-26 | International Business Machines Corporation | Wavelet-based energy binning cepstal features for automatic speech recognition |
US6658385B1 (en) * | 1999-03-12 | 2003-12-02 | Texas Instruments Incorporated | Method for transforming HMMs for speaker-independent recognition in a noisy environment |
GB2349259B (en) * | 1999-04-23 | 2003-11-12 | Canon Kk | Speech processing apparatus and method |
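JP2001134295A (ja) * | 1999-08-23 | 2001-05-18 | Sony Corp | Encoding device and encoding method, recording device and recording method, transmitting device and transmitting method, decoding device and encoding method, reproducing device and reproducing method, and recording medium |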
US6502070B1 (en) * | 2000-04-28 | 2002-12-31 | Nortel Networks Limited | Method and apparatus for normalizing channel specific speech feature elements |
DE60110541T2 (de) * | 2001-02-06 | 2006-02-23 | Sony International (Europe) Gmbh | Method for speech recognition with noise-dependent variance normalization |
US7062433B2 (en) * | 2001-03-14 | 2006-06-13 | Texas Instruments Incorporated | Method of speech recognition with compensation for both channel distortion and background noise |
US7035797B2 (en) * | 2001-12-14 | 2006-04-25 | Nokia Corporation | Data-driven filtering of cepstral time trajectories for robust speech recognition |
IL148592A0 (en) * | 2002-03-10 | 2002-09-12 | Ycd Multimedia Ltd | Dynamic normalizing |
US7117148B2 (en) * | 2002-04-05 | 2006-10-03 | Microsoft Corporation | Method of noise reduction using correction vectors based on dynamic aspects of speech and noise normalization |
US7197456B2 (en) * | 2002-04-30 | 2007-03-27 | Nokia Corporation | On-line parametric histogram normalization for noise robust speech recognition |
JP4239479B2 (ja) * | 2002-05-23 | 2009-03-18 | NEC Corporation | Speech recognition device, speech recognition method, and speech recognition program |
2005
- 2005-01-10: JP application JP2006549503A, patent JP4682154B2 (ja), not active (Expired - Fee Related)
- 2005-01-10: EP application EP05705425A, patent EP1774516B1 (de), not active (Expired - Lifetime)
- 2005-01-10: US application US11/032,415, patent US7797157B2 (en), active (Active)
- 2005-01-10: WO application PCT/US2005/000757, publication WO2005070130A2 (en), active (Application Filing)
- 2005-01-10: DE application DE602005026949T, patent DE602005026949D1 (de), not active (Expired - Lifetime)
- 2005-01-10: CN application CN2005800022461A, patent CN101228577B (zh), not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN101228577B (zh) | 2011-11-23 |
US7797157B2 (en) | 2010-09-14 |
CN101228577A (zh) | 2008-07-23 |
EP1774516A2 (de) | 2007-04-18 |
EP1774516A4 (de) | 2009-11-11 |
US20050182621A1 (en) | 2005-08-18 |
JP2007536562A (ja) | 2007-12-13 |
WO2005070130A2 (en) | 2005-08-04 |
WO2005070130A3 (en) | 2009-04-09 |
EP1774516B1 (de) | 2011-03-16 |
JP4682154B2 (ja) | 2011-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE60318544D1 (de) | | Language model for speech recognition |
DE602005024894D1 (de) | | Distributed speech recognition for mobile devices |
EP2171710C0 (de) | | Mosaics for automatic speech recognition (ASR) |
DE60115738D1 (de) | | Language models for speech recognition |
DE602004021716D1 (de) | | Speech recognition system |
DE602006000090D1 (de) | | Confidence measure for a spoken dialog system |
EP1818909A4 (de) | | Voice recognition system |
DE60229095D1 (de) | | Pronunciations in multiple languages for speech recognition |
DE60323362D1 (de) | | Speech recognition device |
DE602005000628D1 (de) | | Method and apparatus for multi-layered distributed speech recognition |
EP1894186A4 (de) | | Speech recognition system for secure information |
DE60109105D1 (de) | | Hierarchical dictionaries for speech recognition |
DE60020660D1 (de) | | Context-dependent acoustic models for speech recognition with eigenvoice adaptation |
EP1799865A4 (de) | | Methods for the administration of iloperidone |
DE60126882D1 (de) | | Hierarchical dictionaries for speech recognition |
EP2260264A4 (de) | | Speech recognition grammar selection based on context |
EP1747553A4 (de) | | Detection of the end of an utterance in a speech recognition system |
DE602005026949D1 (de) | | Normalization of cepstral features for speech recognition |
DE602005000896D1 (de) | | Speech segmentation |
DE602006009385D1 (de) | | Speech synthesis method integrated in a mobile terminal |
DE602004030279D1 (de) | | Forming method by forging and forming method for housings |
EP1504442A4 (de) | | Voice control and speech recognition for hand-held devices |
EP1889464A4 (de) | | Monitoring system with speech recognition |
DE502004003081D1 (de) | | User-adaptive dialog support for spoken dialog systems |
DE602006002721D1 (de) | | Speech synthesizer |