Abstract
This paper discusses four speech corpora collected in recent years at the Speech Lab of NCTU: a Mandarin tree-bank speech corpus, a Min-Nan speech corpus, a Hakka speech corpus, and a Chinese-English mixed speech corpus. They are currently used separately to develop a corpus-based Mandarin TTS system, a Min-Nan TTS system, a Hakka TTS system, and a Chinese-English bilingual TTS system. These systems will be integrated in the future to construct a multilingual TTS system covering the four primary languages used in Taiwan.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hsiao, HC., Yu, HM., Wang, YR., Chen, SH. (2006). Multilingual Speech Corpora for TTS System Development. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_75
DOI: https://doi.org/10.1007/11939993_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science (R0)