Deng et al., 1996 - Google Patents
Hierarchical partition of the articulatory state space for overlapping-feature based speech recognitionDeng et al., 1996
View PDF- Document ID
- 9228607139010310134
- Author
- Deng L
- Wu J
- Publication year
- Publication venue
- Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP'96
External Links
Snippet
Describes our recent work on improving an overlapping articulatory feature (sub-phonemic) based speech recognizer with robustness to the requirement of training data. A new decision-tree algorithm is developed and applied to the recognizer design which results in …
- 238000005192 partition 0 title description 18
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6296—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10249294B2 (en) | Speech recognition system and method | |
DE69519297T2 (en) | METHOD AND DEVICE FOR VOICE RECOGNITION BY MEANS OF OPTIMIZED PARTIAL BUNDLING OF LIKELIHOOD MIXTURES | |
US6912499B1 (en) | Method and apparatus for training a multilingual speech model set | |
US7257532B2 (en) | Apparatus and method for speech recognition | |
US5812975A (en) | State transition model design method and voice recognition method and apparatus using same | |
US5502791A (en) | Speech recognition by concatenating fenonic allophone hidden Markov models in parallel among subwords | |
US5956679A (en) | Speech processing apparatus and method using a noise-adaptive PMC model | |
US7062436B1 (en) | Word-specific acoustic models in a speech recognition system | |
Bahl et al. | A method for the construction of acoustic Markov models for words | |
Morgan et al. | Hybrid neural network/hidden markov model systems for continuous speech recognition | |
De Mori et al. | Parallel algorithms for syllable recognition in continuous speech | |
Zweig | Bayesian network structures and inference techniques for automatic speech recognition | |
Pakoci et al. | Improvements in Serbian speech recognition using sequence-trained deep neural networks | |
Paul | Extensions to phone-state decision-tree clustering: single tree and tagged clustering | |
Rosdi et al. | Isolated malay speech recognition using Hidden Markov Models | |
Gillick et al. | A rapid match algorithm for continuous speech recognition | |
Deng et al. | Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition | |
Shen et al. | Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition | |
JP2982689B2 (en) | Standard pattern creation method using information criterion | |
Viszlay et al. | Alternative phonetic class definition in linear discriminant analysis of speech | |
Han et al. | Trajectory clustering for solving the trajectory folding problem in automatic speech recognition | |
Gao et al. | Class-triphone acoustic modeling based on decision tree for Mandarin continuous speech recognition | |
De Mori et al. | Search and learning strategies for improving hidden Markov models | |
Serridge | Context-dependent modeling in a segment-based speech recognition system | |
Devillers et al. | Hybrid system combining expert-TDNNs and HMMs for continuous speech recognition |