Abstract
Determining the language proficiency level required to understand a given text is a key requirement in vetting documents for use in second language learning. In this work, we describe our approach to developing an automatic text analytic that estimates the difficulty level of a text on the Interagency Language Roundtable (ILR) proficiency scale. We use machine translation to translate a non-English document into English and then apply an ILR level detector trained on English text. We achieve good results in predicting ILR levels from both human and machine translations of Farsi documents. We also report text leveling results on human translations into English of documents from 54 languages.
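The translate-then-level pipeline described in the abstract can be sketched as a composition of two components. This is a minimal illustration only, not the authors' implementation: the names `make_leveler`, `toy_translate`, and `toy_estimate_level` are hypothetical, the lexicon is a made-up stand-in for a real machine translation system, and the word-length score is a placeholder for a trained English ILR level detector.

```python
from typing import Callable


def make_leveler(
    translate: Callable[[str], str],
    estimate_level: Callable[[str], float],
) -> Callable[[str], float]:
    """Compose a translator with an English-only level estimator."""

    def level(document: str) -> float:
        english = translate(document)    # step 1: source language -> English
        return estimate_level(english)   # step 2: score the English text

    return level


# Hypothetical stand-in for a machine translation system.
def toy_translate(text: str) -> str:
    lexicon = {"salam": "hello", "donya": "world"}  # made-up toy lexicon
    return " ".join(lexicon.get(word, word) for word in text.split())


# Hypothetical stand-in for a trained detector: a crude score based on
# average word length, clamped to the 0-5 range of the ILR scale.
def toy_estimate_level(english: str) -> float:
    words = english.split()
    if not words:
        return 0.0
    average_length = sum(len(word) for word in words) / len(words)
    return min(5.0, average_length / 2.0)


level_of = make_leveler(toy_translate, toy_estimate_level)
print(level_of("salam donya"))
```

Because the level estimator only ever sees English, the same detector can in principle be reused for any source language for which a translator is available, which is the design choice the abstract highlights.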
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Roukos, S., Quin, J., Ward, T. (2014). Multi-lingual Text Leveling. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_3
DOI: https://doi.org/10.1007/978-3-319-10816-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science (R0)