Can syllabification improve pronunciation by analogy of English?

Abstract

Despite the difficulty of defining the syllable unequivocally, and controversy over its role in theories of spoken and written language processing, the syllable is a potentially useful unit in several practical tasks which arise in computational linguistics and speech technology. For instance, syllable structure might embody valuable information for building word models in automatic speech recognition, and concatenative speech synthesis might use syllables or demisyllables as basic units. In this paper, we first present an algorithm for determining syllable boundaries in the orthographic form of unknown words that works by analogical reasoning from a database or corpus of known syllabifications. We call this syllabification by analogy (SbA). Its motivation is similar to that of our existing pronunciation by analogy (PbA), which predicts pronunciations for unknown words (specified by their spellings) by inference from a dictionary of known word spellings and corresponding pronunciations. We show that including perfect (according to the corpus) syllable boundary information in the orthographic input can dramatically improve the performance of pronunciation by analogy of English words, but such information would not be available to a practical system. We therefore next investigate combining automatically inferred syllabification and pronunciation in two different ways: the series model, in which syllabification is followed sequentially by pronunciation generation; and the parallel model, in which syllabification and pronunciation are inferred simultaneously. Unfortunately, neither improves performance over PbA without syllabification. Possible reasons for this failure are explored via an analysis of syllabification and pronunciation errors.
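To make the idea of inferring syllable boundaries from known syllabifications concrete, the following is a minimal toy sketch. It is not the SbA algorithm of the paper, whose details the abstract does not give: it simply lets letter contexts seen in a small lexicon vote on whether each letter junction of an unknown word is a boundary. The lexicon, the `|` boundary marker, the function names, the context length `max_ctx` and the length-based weighting are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon: '|' marks a syllable boundary in a known word.
LEXICON = ["win|dow", "win|ter", "shad|ow", "sis|ter"]


def junction_votes(lexicon, max_ctx=2):
    """Count how often each (left, right) letter context is / is not a boundary."""
    votes = defaultdict(lambda: [0, 0])  # context -> [non-boundary, boundary]
    for entry in lexicon:
        letters = entry.replace("|", "")
        # Record the letter positions immediately after which a boundary occurs.
        bounds, pos = set(), 0
        for ch in entry:
            if ch == "|":
                bounds.add(pos)
            else:
                pos += 1
        # Collect every context of up to max_ctx letters on each side of a junction.
        for i in range(1, len(letters)):
            for left in range(1, max_ctx + 1):
                for right in range(1, max_ctx + 1):
                    if i - left < 0 or i + right > len(letters):
                        continue
                    ctx = (letters[i - left:i], letters[i:i + right])
                    votes[ctx][1 if i in bounds else 0] += 1
    return votes


def syllabify(word, votes, max_ctx=2):
    """Insert '|' at junctions where matching contexts vote for a boundary."""
    if not word:
        return word
    out = [word[0]]
    for i in range(1, len(word)):
        score = 0
        for left in range(1, max_ctx + 1):
            for right in range(1, max_ctx + 1):
                if i - left < 0 or i + right > len(word):
                    continue
                ctx = (word[i - left:i], word[i:i + right])
                if ctx in votes:
                    no, yes = votes[ctx]
                    # Assumed heuristic: longer matching contexts count for more.
                    score += (left + right) * (yes - no)
        if score > 0:
            out.append("|")
        out.append(word[i])
    return "".join(out)


if __name__ == "__main__":
    votes = junction_votes(LEXICON)
    print(syllabify("mister", votes))  # prints mis|ter
```

The output string carries boundary markers inside the orthographic form, which is the kind of input a subsequent pronunciation component would receive in a series arrangement; that component is not sketched here.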

