Bijankhan et al., 2003 - Google Patents
Tfarsdat - the telephone Farsi speech database. Bijankhan et al., 2003
- Document ID
- 3353258474743494585
- Authors
- Bijankhan M
- Sheykhzadegan J
- Roohani M
- Zarrintare R
- Ghasemi S
- Ghasedi M
- Publication year
- 2003
- Publication venue
- INTERSPEECH
Snippet
This paper describes ongoing research to create an acoustic-phonetic-based telephone Farsi speech database called “Tfarsdat”. It is compared with two LDC Farsi corpora, OGI and CallFriend, in terms of corpus dialectology. Up to now, we have recorded about 8 hours …
Concepts
- spontaneous (230000002269): abstract, description; count 17
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L13/00—Speech synthesis; Text to speech systems
        - G10L13/02—Methods for producing synthetic speech; Speech synthesisers
          - G10L13/027—Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
          - G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
        - G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
        - G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
          - G10L13/10—Prosody rules derived from text; Stress or intonation
      - G10L15/00—Speech recognition
        - G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
            - G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
              - G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
              - G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
                - G10L15/197—Probabilistic grammars, e.g. word n-grams
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
      - G10L17/00—Speaker identification or verification
        - G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/003—Changing voice quality, e.g. pitch or formants
          - G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
            - G10L21/013—Adapting to target pitch
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
          - G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
Similar Documents
Publication | Title |
---|---|
Ostendorf et al. | The Boston University radio news corpus |
Klatt | The Klattalk text-to-speech conversion system |
Lee et al. | Spoken language resources for Cantonese speech processing |
Neto et al. | Free tools and resources for Brazilian Portuguese speech recognition |
Dutoit | A short introduction to text-to-speech synthesis |
Tomokiyo | Recognizing non-native speech: characterizing and adapting to non-native usage in LVCSR |
Arısoy et al. | A unified language model for large vocabulary continuous speech recognition of Turkish |
Tseng | Syllable contractions in a Mandarin conversational dialogue corpus |
Yoo et al. | The performance evaluation of continuous speech recognition based on Korean phonological rules of cloud-based speech recognition open API |
Bijankhan et al. | Tfarsdat - the telephone Farsi speech database |
Demenko et al. | JURISDIC: Polish Speech Database for Taking Dictation of Legal Texts |
Maamouri et al. | Dialectal Arabic telephone speech corpus: Principles, tool design, and transcription conventions |
Lamel et al. | Speech recognition of European languages |
Yoshino et al. | Parallel speech corpora of Japanese dialects |
Levow | Adaptations in spoken corrections: Implications for models of conversational speech |
Shahid et al. | Subjective testing of Urdu text-to-speech (TTS) system |
Marasek et al. | Multi-level annotation in SpeeCon Polish speech database |
Martinčić-Ipšić et al. | Acoustic modelling for Croatian speech recognition and synthesis |
Büyük | Sub-word language modelling for Turkish speech recognition |
Lamel et al. | Spoken language processing in a multilingual context |
Khusainov et al. | Speech analysis and synthesis systems for the Tatar language |
Khaw et al. | Preparation of MaDiTS corpus for Malay dialect translation and speech synthesis system |
Kalith et al. | Comparison of Syllable and Phoneme Modelling of Agglutinative Tamil Isolated Words in Speech Recognition |
Rodríguez et al. | Evaluation of sublexical and lexical models of acoustic disfluencies for spontaneous speech recognition in Spanish |
Ri et al. | A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles |