Bijankhan et al., 2003 - Google Patents
Tfarsdat - the telephone Farsi speech database. Bijankhan et al., 2003
- Document ID
- 3353258474743494585
- Authors
- Bijankhan M
- Sheykhzadegan J
- Roohani M
- Zarrintare R
- Ghasemi S
- Ghasedi M
- Publication year
- 2003
- Publication venue
- INTERSPEECH
Snippet
This paper describes ongoing research to create an acoustic-phonetic-based telephone Farsi speech database called “Tfarsdat”. It is compared with two LDC Farsi corpora, OGI and CallFriend, in terms of corpus dialectology. Up to now, we have recorded about 8 hours …
Concepts
- spontaneous (230000002269): abstract, description; count 17
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L13/00—Speech synthesis; Text to speech systems
        - G10L13/02—Methods for producing synthetic speech; Speech synthesisers
          - G10L13/027—Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
          - G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
        - G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
        - G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
          - G10L13/10—Prosody rules derived from text; Stress or intonation
      - G10L15/00—Speech recognition
        - G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
            - G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
              - G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
              - G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
                - G10L15/197—Probabilistic grammars, e.g. word n-grams
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
      - G10L17/00—Speaker identification or verification
        - G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/003—Changing voice quality, e.g. pitch or formants
          - G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
            - G10L21/013—Adapting to target pitch
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
          - G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
Similar Documents
Publication | Title |
---|---|
Ostendorf et al. | The Boston University radio news corpus |
Klatt | The Klattalk text-to-speech conversion system |
Lee et al. | Spoken language resources for Cantonese speech processing |
Neto et al. | Free tools and resources for Brazilian Portuguese speech recognition |
Dutoit | A short introduction to text-to-speech synthesis |
Tomokiyo | Recognizing non-native speech: characterizing and adapting to non-native usage in LVCSR |
Arısoy et al. | A unified language model for large vocabulary continuous speech recognition of Turkish |
Tseng | Syllable contractions in a Mandarin conversational dialogue corpus |
Yoo et al. | The performance evaluation of continuous speech recognition based on Korean phonological rules of cloud-based speech recognition open API |
Bijankhan et al. | Tfarsdat - the telephone Farsi speech database |
Demenko et al. | JURISDIC: Polish Speech Database for Taking Dictation of Legal Texts |
Maamouri et al. | Dialectal Arabic telephone speech corpus: Principles, tool design, and transcription conventions |
Lamel et al. | Speech recognition of European languages |
Yoshino et al. | Parallel speech corpora of Japanese dialects |
Levow | Adaptations in spoken corrections: Implications for models of conversational speech |
Shahid et al. | Subjective testing of Urdu text-to-speech (TTS) system |
Marasek et al. | Multi-level annotation in SpeeCon Polish speech database |
Martinčić-Ipšić et al. | Acoustic modelling for Croatian speech recognition and synthesis |
Büyük | Sub-word language modelling for Turkish speech recognition |
Lamel et al. | Spoken language processing in a multilingual context |
Khusainov et al. | Speech analysis and synthesis systems for the Tatar language |
Khaw et al. | Preparation of MaDiTS corpus for Malay dialect translation and speech synthesis system |
Kalith et al. | Comparison of Syllable and Phoneme Modelling of Agglutinative Tamil Isolated Words in Speech Recognition |
Rodríguez et al. | Evaluation of sublexical and lexical models of acoustic disfluencies for spontaneous speech recognition in Spanish |
Ri et al. | A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles |