AU2001294974A1 - Perceptual harmonic cepstral coefficients as the front-end for speech recognition
Info
- Publication number
- AU2001294974A1
- Authority
- AU
- Australia
- Prior art keywords
- speech recognition
- cepstral coefficients
- perceptual
- perceptual harmonic
- harmonic cepstral
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
- G10L2025/935—Mixed voiced class; Transitions
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US23728500P | 2000-10-02 | 2000-10-02 | |
US60237285 | 2000-10-02 | | |
PCT/US2001/030909 WO2002029782A1 (en) | 2000-10-02 | 2001-10-02 | Perceptual harmonic cepstral coefficients as the front-end for speech recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2001294974A1 (en) | 2002-04-15 |
Family
ID=22893097
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001294974A (abandoned; published as AU2001294974A1) | 2000-10-02 | 2001-10-02 | Perceptual harmonic cepstral coefficients as the front-end for speech recognition |
Country Status (3)
Country | Link |
---|---|
US (2) | US7337107B2 (en) |
AU (1) | AU2001294974A1 (en) |
WO (1) | WO2002029782A1 (en) |
Families Citing this family (59)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2001294974A1 (en) * | 2000-10-02 | 2002-04-15 | The Regents Of The University Of California | Perceptual harmonic cepstral coefficients as the front-end for speech recognition |
US20030004720A1 (en) * | 2001-01-30 | 2003-01-02 | Harinath Garudadri | System and method for computing and transmitting parameters in a distributed voice recognition system |
US7177304B1 (en) * | 2002-01-03 | 2007-02-13 | Cisco Technology, Inc. | Devices, softwares and methods for prioritizing between voice data packets for discard decision purposes |
SG120121A1 (en) * | 2003-09-26 | 2006-03-28 | St Microelectronics Asia | Pitch detection of speech signals |
JP4318119B2 (en) * | 2004-06-18 | 2009-08-19 | 国立大学法人京都大学 | Acoustic signal processing method, acoustic signal processing apparatus, acoustic signal processing system, and computer program |
WO2006006366A1 (en) * | 2004-07-13 | 2006-01-19 | Matsushita Electric Industrial Co., Ltd. | Pitch frequency estimation device, and pitch frequency estimation method |
US20060025991A1 (en) * | 2004-07-23 | 2006-02-02 | Lg Electronics Inc. | Voice coding apparatus and method using PLP in mobile communications terminal |
KR100619893B1 (en) * | 2004-07-23 | 2006-09-19 | 엘지전자 주식회사 | Improved Low Bit Rate Linear Prediction Coding Apparatus and Method for Mobile Devices |
CN1776807A (en) * | 2004-11-15 | 2006-05-24 | 松下电器产业株式会社 | Voice recognition system and safety device with the system |
KR20060066483A (en) * | 2004-12-13 | 2006-06-16 | 엘지전자 주식회사 | Feature Vector Extraction Method for Speech Recognition |
JP4407538B2 (en) * | 2005-03-03 | 2010-02-03 | ヤマハ株式会社 | Microphone array signal processing apparatus and microphone array system |
US7788101B2 (en) * | 2005-10-31 | 2010-08-31 | Hitachi, Ltd. | Adaptation method for inter-person biometrics variability |
US7603275B2 (en) * | 2005-10-31 | 2009-10-13 | Hitachi, Ltd. | System, method and computer program product for verifying an identity using voiced to unvoiced classifiers |
KR100653643B1 (en) * | 2006-01-26 | 2006-12-05 | 삼성전자주식회사 | Pitch detection method and pitch detection device using ratio of harmonic and harmonic |
KR100790110B1 (en) * | 2006-03-18 | 2008-01-02 | 삼성전자주식회사 | Morphology-based speech signal codec method and device |
KR100762596B1 (en) * | 2006-04-05 | 2007-10-01 | 삼성전자주식회사 | Voice signal preprocessing system and voice signal feature information extraction method |
EP2128858B1 (en) * | 2007-03-02 | 2013-04-10 | Panasonic Corporation | Encoding device and encoding method |
US7877252B2 (en) * | 2007-05-18 | 2011-01-25 | Stmicroelectronics S.R.L. | Automatic speech recognition method and apparatus, using non-linear envelope detection of signal power spectra |
JP5089295B2 (en) * | 2007-08-31 | 2012-12-05 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Speech processing system, method and program |
US8688441B2 (en) * | 2007-11-29 | 2014-04-01 | Motorola Mobility Llc | Method and apparatus to facilitate provision and use of an energy value to determine a spectral envelope shape for out-of-signal bandwidth content |
US8433582B2 (en) * | 2008-02-01 | 2013-04-30 | Motorola Mobility Llc | Method and apparatus for estimating high-band energy in a bandwidth extension system |
US20090201983A1 (en) * | 2008-02-07 | 2009-08-13 | Motorola, Inc. | Method and apparatus for estimating high-band energy in a bandwidth extension system |
US8463412B2 (en) * | 2008-08-21 | 2013-06-11 | Motorola Mobility Llc | Method and apparatus to facilitate determining signal bounding frequencies |
US8155967B2 (en) * | 2008-12-08 | 2012-04-10 | Begel Daniel M | Method and system to identify, quantify, and display acoustic transformational structures in speech |
US8463599B2 (en) * | 2009-02-04 | 2013-06-11 | Motorola Mobility Llc | Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder |
US9055374B2 (en) * | 2009-06-24 | 2015-06-09 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Method and system for determining an auditory pattern of an audio segment |
JP5316896B2 (en) * | 2010-03-17 | 2013-10-16 | ソニー株式会社 | Encoding device, encoding method, decoding device, decoding method, and program |
WO2012038998A1 (en) * | 2010-09-21 | 2012-03-29 | 三菱電機株式会社 | Noise suppression device |
US9208799B2 (en) * | 2010-11-10 | 2015-12-08 | Koninklijke Philips N.V. | Method and device for estimating a pattern in a signal |
US8849663B2 (en) * | 2011-03-21 | 2014-09-30 | The Intellisis Corporation | Systems and methods for segmenting and/or classifying an audio signal from transformed audio information |
US9142220B2 (en) | 2011-03-25 | 2015-09-22 | The Intellisis Corporation | Systems and methods for reconstructing an audio signal from transformed audio information |
US8620646B2 (en) * | 2011-08-08 | 2013-12-31 | The Intellisis Corporation | System and method for tracking sound pitch across an audio signal using harmonic envelope |
US8548803B2 (en) | 2011-08-08 | 2013-10-01 | The Intellisis Corporation | System and method of processing a sound signal including transforming the sound signal into a frequency-chirp domain |
US9183850B2 (en) | 2011-08-08 | 2015-11-10 | The Intellisis Corporation | System and method for tracking sound pitch across an audio signal |
EP2795884A4 (en) * | 2011-12-20 | 2015-07-29 | Nokia Corp | Audio conferencing |
US9173025B2 (en) | 2012-02-08 | 2015-10-27 | Dolby Laboratories Licensing Corporation | Combined suppression of noise, echo, and out-of-location signals |
US8712076B2 (en) | 2012-02-08 | 2014-04-29 | Dolby Laboratories Licensing Corporation | Post-processing including median filtering of noise suppression gains |
US9384759B2 (en) * | 2012-03-05 | 2016-07-05 | Malaspina Labs (Barbados) Inc. | Voice activity detection and pitch estimation |
US9076446B2 (en) * | 2012-03-22 | 2015-07-07 | Qiguang Lin | Method and apparatus for robust speaker and speech recognition |
CN103325384A (en) | 2012-03-23 | 2013-09-25 | 杜比实验室特许公司 | Harmonicity estimation, audio classification, pitch definition and noise estimation |
EP2828855B1 (en) | 2012-03-23 | 2016-04-27 | Dolby Laboratories Licensing Corporation | Determining a harmonicity measure for voice processing |
CN103426441B (en) | 2012-05-18 | 2016-03-02 | 华为技术有限公司 | Detect the method and apparatus of the correctness of pitch period |
CN103928029B (en) | 2013-01-11 | 2017-02-08 | 华为技术有限公司 | Audio signal coding method, audio signal decoding method, audio signal coding apparatus, and audio signal decoding apparatus |
EP2984649B1 (en) * | 2013-04-11 | 2020-07-29 | Cetin CETINTURK | Extraction of acoustic relative excitation features |
US9058820B1 (en) | 2013-05-21 | 2015-06-16 | The Intellisis Corporation | Identifying speech portions of a sound model using various statistics thereof |
US9484044B1 (en) | 2013-07-17 | 2016-11-01 | Knuedge Incorporated | Voice enhancement and/or speech features extraction on noisy audio signals using successively refined transforms |
US9530434B1 (en) | 2013-07-18 | 2016-12-27 | Knuedge Incorporated | Reducing octave errors during pitch determination for noisy audio signals |
US9208794B1 (en) | 2013-08-07 | 2015-12-08 | The Intellisis Corporation | Providing sound models of an input signal using continuous and/or linear fitting |
PL3696816T3 (en) | 2014-05-01 | 2021-10-25 | Nippon Telegraph And Telephone Corporation | Periodic-combined-envelope-sequence generation device, periodic-combined-envelope-sequence generation method, periodic-combined-envelope-sequence generation program and recording medium |
US9870785B2 (en) | 2015-02-06 | 2018-01-16 | Knuedge Incorporated | Determining features of harmonic signals |
US9842611B2 (en) | 2015-02-06 | 2017-12-12 | Knuedge Incorporated | Estimating pitch using peak-to-peak distances |
US9922668B2 (en) | 2015-02-06 | 2018-03-20 | Knuedge Incorporated | Estimating fractional chirp rate with multiple frequency representations |
US9997161B2 (en) | 2015-09-11 | 2018-06-12 | Microsoft Technology Licensing, Llc | Automatic speech recognition confidence classifier |
US10706852B2 (en) | 2015-11-13 | 2020-07-07 | Microsoft Technology Licensing, Llc | Confidence features for automated speech recognition arbitration |
US10468050B2 (en) * | 2017-03-29 | 2019-11-05 | Microsoft Technology Licensing, Llc | Voice synthesized participatory rhyming chat bot |
CN108022588B (en) * | 2017-11-13 | 2022-03-29 | 河海大学 | Robust speech recognition method based on dual-feature model |
US11138334B1 (en) * | 2018-10-17 | 2021-10-05 | Medallia, Inc. | Use of ASR confidence to improve reliability of automatic audio redaction |
CN112863517B (en) * | 2021-01-19 | 2023-01-06 | 苏州大学 | Speech Recognition Method Based on Convergence Rate of Perceptual Spectrum |
US20240282326A1 (en) * | 2023-02-17 | 2024-08-22 | Resonant Cavity LLC | Harmonic coefficient setting mechanism |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3649765A (en) * | 1969-10-29 | 1972-03-14 | Bell Telephone Labor Inc | Speech analyzer-synthesizer system employing improved formant extractor |
US4885790A (en) * | 1985-03-18 | 1989-12-05 | Massachusetts Institute Of Technology | Processing of acoustic waveforms |
US5596680A (en) * | 1992-12-31 | 1997-01-21 | Apple Computer, Inc. | Method and apparatus for detecting speech activity using cepstrum vectors |
JP2812184B2 (en) * | 1994-02-23 | 1998-10-22 | 日本電気株式会社 | Complex Cepstrum Analyzer for Speech |
AU696092B2 (en) * | 1995-01-12 | 1998-09-03 | Digital Voice Systems, Inc. | Estimation of excitation parameters |
US5774837A (en) * | 1995-09-13 | 1998-06-30 | Voxware, Inc. | Speech coding system and method using voicing probability determination |
US6233550B1 (en) * | 1997-08-29 | 2001-05-15 | The Regents Of The University Of California | Method and apparatus for hybrid coding of speech at 4kbps |
US6963833B1 (en) * | 1999-10-26 | 2005-11-08 | Sasken Communication Technologies Limited | Modifications in the multi-band excitation (MBE) model for generating high quality speech at low bit rates |
US6377916B1 (en) * | 1999-11-29 | 2002-04-23 | Digital Voice Systems, Inc. | Multiband harmonic transform coder |
AU2001294974A1 (en) * | 2000-10-02 | 2002-04-15 | The Regents Of The University Of California | Perceptual harmonic cepstral coefficients as the front-end for speech recognition |
2001
- 2001-10-02 AU: application AU2001294974A, published as AU2001294974A1 (not active, abandoned)
- 2001-10-02 US: application US10/363,523, patent US7337107B2 (not active, expired - lifetime)
- 2001-10-02 WO: application PCT/US2001/030909, published as WO2002029782A1 (active, application filing)
2008
- 2008-02-01 US: application US12/012,334, patent US7756700B2 (not active, expired - lifetime)
Also Published As
Publication number | Publication date |
---|---|
US7756700B2 (en) | 2010-07-13 |
WO2002029782A1 (en) | 2002-04-11 |
US20080162122A1 (en) | 2008-07-03 |
US7337107B2 (en) | 2008-02-26 |
US20040128130A1 (en) | 2004-07-01 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
AU2001294974A1 (en) | Perceptual harmonic cepstral coefficients as the front-end for speech recognition | |
AU7348000A (en) | Voice recognition for internet navigation | |
AU2003295682A1 (en) | Multilingual speech recognition | |
AU2001291307A1 (en) | Structured speech recognition | |
AU1149501A (en) | Speech recognition | |
AU2002218916A1 (en) | Hierarchical language models for speech recognition | |
AU5451800A (en) | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces | |
AU2001275319A1 (en) | Load-adjusted speech recognition | |
AU2002236034A1 (en) | Spoken language interface | |
AU2001279172A1 (en) | Computer-implemented speech recognition system training | |
AU2003235868A1 (en) | Speech recognition device | |
AU3165000A (en) | Client-server speech recognition | |
AU2002363991A1 (en) | Distributed speech recognition with configurable front-end | |
AU5647899A (en) | Speech recognizer | |
EP1251489A3 (en) | Training the parameters of a speech recognition system for the recognition of pronunciation variations | |
AU2002325930A1 (en) | Method for automatic speech recognition | |
AU1157401A (en) | Speech recognition | |
AU2000276400A1 (en) | Search method based on single triphone tree for large vocabulary continuous speech recognizer | |
AU2001262407A1 (en) | Dynamic language models for speech recognition | |
AU2001238103A1 (en) | Electrolaryngeal speech enhancement for telephony | |
AU2002231046A1 (en) | Context-responsive spoken language instruction | |
AU2003273357A1 (en) | Speech recognition system | |
AU2002302651A1 (en) | Voice recognition method | |
AU2002354792A1 (en) | Grammars for speech recognition | |
AU3381299A (en) | Multiple stage speech recognizer |