DE602005026949D1 - Normalization of cepstral features for speech recognition

Normalization of cepstral features for speech recognition

Info

Publication number: DE602005026949D1
Authority: DE; Germany
Prior art keywords: standardization; speech recognition; cepstral features; cepstral; features
Prior art date: 2004-01-12
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Expired - Lifetime

Application number

DE602005026949T

Other languages

English (en)

Inventor

Igor Zlokarnik

Laurence S Gillick

Jordan Cohen

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Voice Signal Technologies Inc

Original Assignee

Voice Signal Technologies Inc

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2004-01-12

Filing date

2005-01-10

Publication date

2011-04-28

2005-01-10 Application filed by Voice Signal Technologies Inc

2011-04-28 Publication of DE602005026949D1

2025-01-11 Anticipated expiration

Status: Expired - Lifetime

Classifications

- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
        - G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
        - G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
          - G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

Engineering & Computer Science (AREA)
Computational Linguistics (AREA)
Health & Medical Sciences (AREA)
Audiology, Speech & Language Pathology (AREA)
Human Computer Interaction (AREA)
Physics & Mathematics (AREA)
Acoustics & Sound (AREA)
Multimedia (AREA)
Computer Vision & Pattern Recognition (AREA)
Telephonic Communication Services (AREA)
Stereophonic System (AREA)
Machine Translation (AREA)
Time-Division Multiplex Systems (AREA)
Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

DE602005026949T, priority date 2004-01-12, filing date 2005-01-10: Normalization of cepstral features for speech recognition. Status: Expired - Lifetime. Publication: DE602005026949D1 (de)

Applications Claiming Priority (2)

Application Number	Priority Date	Filing Date	Title
US53586304P	2004-01-12	2004-01-12
PCT/US2005/000757 WO2005070130A2 (en)	2004-01-12	2005-01-10	Speech recognition channel normalization utilizing measured energy values from speech utterance

Publications (1)

Publication Number	Publication Date
DE602005026949D1 (de)	2011-04-28

Family

ID=34806967

Family Applications (1)

Application Number	Priority Date	Filing Date	Title
DE602005026949T Expired - Lifetime DE602005026949D1 (de)	2004-01-12	2005-01-10	Normalization of cepstral features for speech recognition

Country Status (6)

Country	Link
US (1)	US7797157B2 (de)
EP (1)	EP1774516B1 (de)
JP (1)	JP4682154B2 (de)
CN (1)	CN101228577B (de)
DE (1)	DE602005026949D1 (de)
WO (1)	WO2005070130A2 (de)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US7702505B2 (en) *	2004-12-14	2010-04-20	Electronics And Telecommunications Research Institute	Channel normalization apparatus and method for robust speech recognition
US20070263848A1 (en) *	2006-04-19	2007-11-15	Tellabs Operations, Inc.	Echo detection and delay estimation using a pattern recognition approach and cepstral correlation
WO2008077281A1 (en) *	2006-12-27	2008-07-03	Intel Corporation	Method and apparatus for speech segmentation
JP4864783B2 (ja) *	2007-03-23	2012-02-01	KDDI Corporation	Pattern matching device, pattern matching program, and pattern matching method
US8930179B2 (en)	2009-06-04	2015-01-06	Microsoft Corporation	Recognition using re-recognition and statistical classification
US8768695B2 (en) *	2012-06-13	2014-07-01	Nuance Communications, Inc.	Channel normalization using recognition feedback
US9984676B2 (en) *	2012-07-24	2018-05-29	Nuance Communications, Inc.	Feature normalization inputs to front end processing for automatic speech recognition
US10376338B2 (en)	2014-05-13	2019-08-13	Covidien Lp	Surgical robotic arm support systems and methods of use
US9953661B2 (en) *	2014-09-26	2018-04-24	Cirrus Logic Inc.	Neural network voice activity detection employing running range normalization
CN107112011B (zh) *	2014-12-22	2021-11-09	Intel Corporation	Cepstral variance normalization for audio feature extraction
US10540990B2 (en) *	2017-11-01	2020-01-21	International Business Machines Corporation	Processing of speech signals

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
JP2797949B2 (ja) *	1994-01-31	1998-09-17	NEC Corporation	Speech recognition device
US5604839A (en) *	1994-07-29	1997-02-18	Microsoft Corporation	Method and system for improving speech recognition through front-end normalization of feature vectors
GB9419388D0 (en) *	1994-09-26	1994-11-09	Canon Kk	Speech analysis
US5677990A (en) *	1995-05-05	1997-10-14	Panasonic Technologies, Inc.	System and method using N-best strategy for real time recognition of continuously spelled names
US6633842B1 (en) *	1999-10-22	2003-10-14	Texas Instruments Incorporated	Speech recognition front-end feature extraction for noisy speech
US6202047B1 (en) *	1998-03-30	2001-03-13	At&T Corp.	Method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients
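JPH11311994A (ja) *	1998-04-30	1999-11-09	Sony Corp	Information processing apparatus and method, and providing medium
CN1144172C (zh) *	1998-04-30	2004-03-31	Matsushita Electric Industrial Co., Ltd.	Eigenvoice-based speaker adaptation method including a maximum likelihood method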
US6173258B1 (en) *	1998-09-09	2001-01-09	Sony Corporation	Method for reducing noise distortions in a speech recognition system
US6253175B1 (en) *	1998-11-30	2001-06-26	International Business Machines Corporation	Wavelet-based energy binning cepstal features for automatic speech recognition
US6658385B1 (en) *	1999-03-12	2003-12-02	Texas Instruments Incorporated	Method for transforming HMMs for speaker-independent recognition in a noisy environment
GB2349259B (en) *	1999-04-23	2003-11-12	Canon Kk	Speech processing apparatus and method
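JP2001134295A (ja) *	1999-08-23	2001-05-18	Sony Corp	Encoding device and encoding method, recording device and recording method, transmitting device and transmitting method, decoding device and encoding method, reproducing device and reproducing method, and recording medium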
US6502070B1 (en) *	2000-04-28	2002-12-31	Nortel Networks Limited	Method and apparatus for normalizing channel specific speech feature elements
DE60110541T2 (de) *	2001-02-06	2006-02-23	Sony International (Europe) Gmbh	Method for speech recognition with noise-dependent variance normalization
US7062433B2 (en) *	2001-03-14	2006-06-13	Texas Instruments Incorporated	Method of speech recognition with compensation for both channel distortion and background noise
US7035797B2 (en) *	2001-12-14	2006-04-25	Nokia Corporation	Data-driven filtering of cepstral time trajectories for robust speech recognition
IL148592A0 (en) *	2002-03-10	2002-09-12	Ycd Multimedia Ltd	Dynamic normalizing
US7117148B2 (en) *	2002-04-05	2006-10-03	Microsoft Corporation	Method of noise reduction using correction vectors based on dynamic aspects of speech and noise normalization
US7197456B2 (en) *	2002-04-30	2007-03-27	Nokia Corporation	On-line parametric histogram normalization for noise robust speech recognition
JP4239479B2 (ja) *	2002-05-23	2009-03-18	NEC Corporation	Speech recognition device, speech recognition method, and speech recognition program

2005
- 2005-01-10 JP: application JP2006549503A (JP4682154B2), Expired - Fee Related
- 2005-01-10 EP: application EP05705425A (EP1774516B1), Expired - Lifetime
- 2005-01-10 US: application US11/032,415 (US7797157B2), Active
- 2005-01-10 WO: application PCT/US2005/000757 (WO2005070130A2), Application Filing
- 2005-01-10 DE: application DE602005026949T (DE602005026949D1), Expired - Lifetime
- 2005-01-10 CN: application CN2005800022461A (CN101228577B), Expired - Fee Related

Also Published As

Publication number	Publication date
CN101228577B (zh)	2011-11-23
US7797157B2 (en)	2010-09-14
CN101228577A (zh)	2008-07-23
EP1774516A2 (de)	2007-04-18
EP1774516A4 (de)	2009-11-11
US20050182621A1 (en)	2005-08-18
JP2007536562A (ja)	2007-12-13
WO2005070130A2 (en)	2005-08-04
WO2005070130A3 (en)	2009-04-09
EP1774516B1 (de)	2011-03-16
JP4682154B2 (ja)	2011-05-11

Publication	Publication Date	Title
DE60318544D1 (de)	2008-02-21	Language model for speech recognition
DE602005024894D1 (de)	2011-01-05	Distributed speech recognition for mobile devices
EP2171710C0 (de)	2024-01-03	Mosaics for automatic speech recognition (ASR)
DE60115738D1 (de)	2006-01-19	Language models for speech recognition
DE602004021716D1 (de)	2009-08-06	Speech recognition system
DE602006000090D1 (de)	2007-10-18	Confidence measure for a spoken dialogue system
EP1818909A4 (de)	2009-10-28	Voice recognition system
DE60229095D1 (de)	2008-11-13	Pronunciations in multiple languages for speech recognition
DE60323362D1 (de)	2008-10-16	Speech recognition device
DE602005000628D1 (de)	2007-04-12	Method and apparatus for multi-layer distributed speech recognition
EP1894186A4 (de)	2009-05-20	Speech recognition system for secure information
DE60109105D1 (de)	2005-04-07	Hierarchical dictionaries for speech recognition
DE60020660D1 (de)	2005-07-14	Context-dependent acoustic models for speech recognition with eigenvoice adaptation
EP1799865A4 (de)	2008-03-05	Method for the administration of iloperidone
DE60126882D1 (de)	2007-04-12	Hierarchical dictionaries for speech recognition
EP2260264A4 (de)	2015-05-06	Speech recognition grammar selection based on context
EP1747553A4 (de)	2007-11-07	Detection of the end of an utterance in a speech recognition system
DE602005026949D1 (de)	2011-04-28	Normalization of cepstral features for speech recognition
DE602005000896D1 (de)	2007-05-31	Speech segmentation
DE602006009385D1 (de)	2009-11-05	Speech synthesis method integrated in a mobile terminal
DE602004030279D1 (de)	2011-01-05	Forming method by forging and forming method for housings
EP1504442A4 (de)	2005-12-21	Voice control and speech recognition for handheld devices
EP1889464A4 (de)	2010-09-01	Monitoring system with speech recognition
DE502004003081D1 (de)	2007-04-12	User-adaptive dialogue support for spoken dialogue systems
DE602006002721D1 (de)	2008-10-23	Speech synthesizer