Saychum et al., 2016 - Google Patents

Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling.

Document ID
8012835494548906338
Author
Saychum S
Kongyoung S
Rugchatjaroen A
Chootrakool P
Kasuriya S
Wutiwiwatchai C
Publication year
Publication venue
INTERSPEECH

Snippet

This paper presents the successful results of applying joint sequence modeling in Thai grapheme-to-phoneme conversion. The proposed method utilizes Conditional Random Fields (CRFs) in two-stage prediction. The first CRF is used for textual syllable segmentation …
Continue reading at www.isca-archive.org (PDF) (other versions)
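
The first stage described in the snippet, textual syllable segmentation, can be framed as per-character sequence labeling. The sketch below is only an illustration of that framing, not the authors' implementation: the snippet does not give the label scheme or the feature set, so the B/I labels, the `char_features` window features, and the `decode_segments` helper are all assumptions. No CRF is trained here; the labels are written by hand to stand in for a tagger's output.

```python
from typing import Dict, List


def char_features(text: str, i: int, window: int = 2) -> Dict[str, str]:
    """Context-window features for the character at position i.

    Hypothetical feature set: the character itself plus its neighbours
    within `window` positions, padded at the string edges.
    """
    feats = {"bias": "1", "c0": text[i]}
    for off in range(1, window + 1):
        feats[f"c-{off}"] = text[i - off] if i - off >= 0 else "<s>"
        feats[f"c+{off}"] = text[i + off] if i + off < len(text) else "</s>"
    return feats


def decode_segments(text: str, labels: List[str]) -> List[str]:
    """Turn per-character labels (B = begins a syllable, I = inside) into segments."""
    if len(text) != len(labels):
        raise ValueError("text and labels must have the same length")
    segments: List[str] = []
    for ch, lab in zip(text, labels):
        if lab == "B" or not segments:
            segments.append(ch)   # start a new syllable
        else:
            segments[-1] += ch    # extend the current syllable
    return segments


if __name__ == "__main__":
    word = "สวัสดี"
    # Hand-written labels standing in for the output of a trained tagger.
    labels = ["B", "B", "I", "I", "B", "I"]
    print([char_features(word, i) for i in range(len(word))][0])
    print(decode_segments(word, labels))
```

In a two-stage setup of the kind the snippet outlines, the segments produced here would be the input units for a second model that predicts phonemes; that second stage is not sketched because the snippet is truncated before describing it.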

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/2775 Phrasal analysis, e.g. finite state techniques, chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2217 Character encodings
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/085 Methods for reducing search complexity, pruning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10 Prosody rules derived from text; Stress or intonation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/06 Elementary speech units used in speech synthesisers; Concatenation rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification

Similar Documents

Publication Publication Date Title
KR102375115B1 (en) Phoneme-Based Contextualization for Cross-Language Speech Recognition in End-to-End Models
Black et al. Issues in building general letter to sound rules
Le et al. Automatic speech recognition for under-resourced languages: application to Vietnamese language
Xu et al. A deep neural network approach for sentence boundary detection in broadcast news.
US20080027725A1 (en) Automatic Accent Detection With Limited Manually Labeled Data
Kirchhoff et al. Novel speech recognition models for Arabic
Hasegawa-Johnson et al. Grapheme-to-phoneme transduction for cross-language ASR
Arısoy et al. A unified language model for large vocabulary continuous speech recognition of Turkish
Raza et al. Design and development of phonetically rich Urdu speech corpus
Alsharhan et al. Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic
Rugchatjaroen et al. Efficient two-stage processing for joint sequence model-based Thai grapheme-to-phoneme conversion
Lőrincz et al. RoLEX: The development of an extended Romanian lexical dataset and its evaluation at predicting concurrent lexical information
Chotimongkol et al. Statistically trained orthographic to sound models for Thai.
Singh et al. MECOS: A bilingual Manipuri–English spontaneous code-switching speech corpus for automatic speech recognition
Karim et al. On the training of deep neural networks for automatic arabic-text diacritization
Alghamdi et al. Automatic restoration of Arabic diacritics: a simple, purely statistical approach
Kirov et al. Context-aware transliteration of romanized south asian languages
Saychum et al. Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling.
JP2022515048A (en) Transliteration for speech recognition training and scoring
Wilkinson et al. Deriving Phonetic Transcriptions and Discovering Word Segmentations for Speech-to-Speech Translation in Low-Resource Settings.
Thangthai et al. Automatic syllable-pattern induction in statistical Thai text-to-phone transcription.
Núñez et al. Phonetic normalization for machine translation of user generated content
Rajendran et al. Text processing for developing unrestricted Tamil text to speech synthesis system
Lyes et al. Building a pronunciation dictionary for the Kabyle language
KR101777141B1 (en) Apparatus and method for inputting chinese and foreign languages based on hun min jeong eum using korean input keyboard