Jansen et al., 2008 - Google Patents
A hierarchical point process model for speech recognition (Jansen et al., 2008)
- Document ID
- 12810435851121771792
- Author
- Jansen A
- Niyogi P
- Publication year
- 2008
- Publication venue
- 2008 IEEE International Conference on Acoustics, Speech and Signal Processing
External Links
Snippet
In this paper, we present a computational framework to engage distinctive feature-based theories of speech perception. Our approach involves: (i) transforming the signal into a collection of marked point processes, each consisting of distinctive feature landmarks …
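The snippet describes step (i) of the approach: representing the speech signal as a collection of marked point processes built from distinctive feature landmarks. As a rough illustration only (not the authors' implementation), the sketch below shows one way such a marked point process could be represented in Python; the landmark labels and detector outputs are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Minimal sketch, assuming a marked point process is a time-ordered list of
# landmark events, each carrying a time and a mark (a distinctive-feature
# landmark label). Labels and detections below are hypothetical.

@dataclass(frozen=True)
class Landmark:
    time: float  # landmark time in seconds
    mark: str    # distinctive-feature landmark label, e.g. "stop-burst"

def to_point_process(detections: List[Tuple[float, str]]) -> List[Landmark]:
    """Sort raw (time, label) detector outputs into a marked point process."""
    return sorted((Landmark(t, m) for t, m in detections), key=lambda lm: lm.time)

if __name__ == "__main__":
    # Hypothetical detector outputs for a short utterance.
    events = to_point_process([(0.42, "vowel-peak"),
                               (0.10, "stop-burst"),
                               (0.75, "nasal-closure")])
    for lm in events:
        print(f"{lm.time:.2f}s  {lm.mark}")
```

The sketch stops at the representation itself; in the paper's framework, higher-level models then score the temporal patterns of such landmark sequences, but those details are not recoverable from the snippet above.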
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
Similar Documents
Publication | Title
---|---
US10424289B2 (en) | Speech recognition system using machine learning to classify phone posterior context information and estimate boundaries in speech from combined boundary posteriors
Hinton et al. | Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
Lopes et al. | Phone recognition on the TIMIT database
Saon et al. | Large-vocabulary continuous speech recognition systems: A look at some recent advances
EP1610301B1 (en) | Speech recognition method based on word duration modelling
US8886533B2 (en) | System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
Scanlon et al. | Using broad phonetic group experts for improved speech recognition
Ostendorf et al. | Prosody models for conversational speech recognition
King et al. | Speech recognition via phonetically-featured syllables
Liu | Deep convolutional and LSTM neural networks for acoustic modelling in automatic speech recognition
Jung et al. | Additional shared decoder on Siamese multi-view encoders for learning acoustic word embeddings
Carofilis et al. | Improvement of accent classification models through grad-transfer from spectrograms and gradient-weighted class activation mapping
Soltau et al. | Reducing the computational complexity for whole word models
Wöllmer et al. | Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory
US20070203700A1 (en) | Speech Recognition Apparatus And Speech Recognition Method
Ben Messaoud et al. | Combining formant frequency based on variable order LPC coding with acoustic features for TIMIT phone recognition
Ketabdar et al. | Detection of out-of-vocabulary words in posterior based ASR
Zhou et al. | Extracting unit embeddings using sequence-to-sequence acoustic models for unit selection speech synthesis
Jansen et al. | A hierarchical point process model for speech recognition
Jansen et al. | Modeling the temporal dynamics of distinctive feature landmark detectors for speech recognition
Pandey et al. | Keyword spotting in continuous speech using spectral and prosodic information fusion
Nguyen et al. | Improving acoustic model for vietnamese large vocabulary continuous speech recognition system using deep bottleneck features
Jansen et al. | A probabilistic speech recognition framework based on the temporal dynamics of distinctive feature landmark detectors
De Mori et al. | Speaker-independent consonant classification in continuous speech with distinctive features and neural networks
Kaur et al. | Classification approaches for automatic speech recognition system