Abstract
Speech disorders can affect phonetic transcription. To reduce the influence of disordered speech on phonetic exams in the Automated Phonetic Transcription Grading Tool (APTgt), a linguistic e-learning system, we propose a speech disorder classification module that distinguishes disordered speech from non-disordered speech. Mel-frequency cepstral coefficients (MFCCs) represent the features of the speech sound files. Using two different formats of MFCCs, we adopted two classification approaches: (1) computing the similarity between MFCC values with the dynamic time warping (DTW) algorithm and classifying the resulting distances with a support vector machine (SVM); and (2) classifying the MFCCs directly as images with a convolutional neural network (CNN). This paper focuses on the second approach.
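To make the DTW step of the first approach concrete, the following is a minimal sketch of a DTW distance between two sequences of feature frames. It is an illustration under stated assumptions, not the implementation used in APTgt: the function name `dtw_distance` is hypothetical, the frame distance is assumed to be Euclidean, and the short toy vectors stand in for real MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature frames.

    Hypothetical helper for illustration: each element of `a` and `b`
    is one frame (a list of floats), standing in for an MFCC vector.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance: Euclidean distance between frames.
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Toy frames (not real MFCCs): `slow` repeats the first frame of `fast`.
fast = [[0.0, 0.0], [1.0, 1.0]]
slow = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
stretched_cost = dtw_distance(fast, slow)
shifted_cost = dtw_distance([[0.0]], [[3.0]])
```

Because DTW allows one frame to align with several, the time-stretched pair has zero cost even though the sequences differ in length. In the first approach described above, distances of this kind would then serve as the input to the SVM classifier.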
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, J. et al. (2022). Speech Disorders Classification by CNN in Phonetic E-Learning System. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2022. Lecture Notes in Computer Science, vol 13336. Springer, Cham. https://doi.org/10.1007/978-3-031-05643-7_36
DOI: https://doi.org/10.1007/978-3-031-05643-7_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05642-0
Online ISBN: 978-3-031-05643-7
eBook Packages: Computer Science (R0)