Saychum et al., 2016 - Google Patents
Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling.Saychum et al., 2016
View PDF- Document ID
- 8012835494548906338
- Author
- Saychum S
- Kongyoung S
- Rugchatjaroen A
- Chootrakool P
- Kasuriya S
- Wutiwiwatchai C
- Publication year
- Publication venue
- INTERSPEECH
External Links
Snippet
This paper presents the successful results of applying joint sequence modeling in Thai grapheme-to-phoneme conversion. The proposed method utilizes Conditional Random Fields (CRFs) in two-stage prediction. The first CRF is used for textual syllable segmentation …
- 238000006243 chemical reaction 0 title abstract description 9
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102375115B1 (en) | Phoneme-Based Contextualization for Cross-Language Speech Recognition in End-to-End Models | |
Black et al. | Issues in building general letter to sound rules | |
Le et al. | Automatic speech recognition for under-resourced languages: application to Vietnamese language | |
Xu et al. | A deep neural network approach for sentence boundary detection in broadcast news. | |
US20080027725A1 (en) | Automatic Accent Detection With Limited Manually Labeled Data | |
Kirchhoff et al. | Novel speech recognition models for Arabic | |
Hasegawa-Johnson et al. | Grapheme-to-phoneme transduction for cross-language ASR | |
Arısoy et al. | A unified language model for large vocabulary continuous speech recognition of Turkish | |
Raza et al. | Design and development of phonetically rich Urdu speech corpus | |
Alsharhan et al. | Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic | |
Rugchatjaroen et al. | Efficient two-stage processing for joint sequence model-based Thai grapheme-to-phoneme conversion | |
Lőrincz et al. | RoLEX: The development of an extended Romanian lexical dataset and its evaluation at predicting concurrent lexical information | |
Chotimongkol et al. | Statistically trained orthographic to sound models for Thai. | |
Singh et al. | MECOS: A bilingual Manipuri–English spontaneous code-switching speech corpus for automatic speech recognition | |
Karim et al. | On the training of deep neural networks for automatic arabic-text diacritization | |
Alghamdi et al. | Automatic restoration of Arabic diacritics: a simple, purely statistical approach | |
Kirov et al. | Context-aware transliteration of romanized south asian languages | |
Saychum et al. | Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling. | |
JP2022515048A (en) | Transliteration for speech recognition training and scoring | |
Wilkinson et al. | Deriving Phonetic Transcriptions and Discovering Word Segmentations for Speech-to-Speech Translation in Low-Resource Settings. | |
Thangthai et al. | Automatic syllable-pattern induction in statistical Thai text-to-phone transcription. | |
Núñez et al. | Phonetic normalization for machine translation of user generated content | |
Rajendran et al. | Text processing for developing unrestricted Tamil text to speech synthesis system | |
Lyes et al. | Building a pronunciation dictionary for the Kabyle language | |
KR101777141B1 (en) | Apparatus and method for inputting chinese and foreign languages based on hun min jeong eum using korean input keyboard |