Speech Disorders Classification by CNN in Phonetic E-Learning System

Jueting Liu, Chang Ren, Yaoxuan Luan, Sicheng Li, Tianshi Xie, Cheryl Seals & Marisha Speights Atkins

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13336)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

Speech disorders can affect the process of phonetic transcription. For the Automated Phonetic Transcription grading tool (APTgt), a linguistic e-learning system, we propose a speech disorders classification module that distinguishes disordered speech from non-disordered speech, in order to reduce the influence of disordered speech on phonetic exams. Mel-frequency cepstral coefficients (MFCCs) are used to represent the features of the speech sound files. Working with two different formats of MFCCs, we adopt two classification approaches: (1) calculating the similarity between MFCC values with the dynamic time warping (DTW) algorithm and classifying the resulting distances with a support vector machine (SVM); and (2) classifying the MFCCs directly as images with a convolutional neural network (CNN). This paper focuses on the second approach.
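The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the DTW variant, the frame distance, or the CNN architecture, so the Euclidean frame distance, the unconstrained step pattern, the single hand-picked kernel, and all names and toy numbers below are assumptions chosen for illustration. The first function computes a DTW cost between two sequences of feature frames (the kind of distance that could then be passed to an SVM); the second applies one "valid" 2D convolution, the core operation of a CNN layer, to a small matrix standing in for an MFCC image.

```python
from math import sqrt, inf


def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature frames.

    Uses the classic match/insertion/deletion step pattern with no window
    constraint; this is an assumption, not the paper's stated variant.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_a advances alone
                                 cost[i][j - 1],      # seq_b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation of a matrix with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy two-coefficient "MFCC" sequences (made-up numbers, not speech data).
reference = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
stretched = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]  # slower copy
different = [[3.0, 3.0], [3.0, 3.0], [3.0, 3.0]]

print(dtw_distance(reference, stretched))  # time-stretched copy aligns at zero cost
print(dtw_distance(reference, different))  # dissimilar sequence has a larger cost

# Toy 3x3 matrix standing in for an MFCC image, and a hypothetical 2x2 kernel.
mfcc_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
diag_kernel = [[1, 0], [0, -1]]
print(conv2d_valid(mfcc_image, diag_kernel))
```

In the DTW half, the time-stretched copy of the reference aligns at zero cost while the dissimilar sequence does not, which is the property that makes DTW distances usable as classifier inputs when utterances differ in duration. In a trained CNN the kernel values would be learned rather than fixed by hand.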

Author information

Authors and Affiliations

Auburn University, Auburn, AL, 36849, USA
Jueting Liu, Chang Ren, Yaoxuan Luan, Sicheng Li, Tianshi Xie & Cheryl Seals
Northwestern University, Evanston, IL, 60208, USA
Marisha Speights Atkins

Corresponding author

Correspondence to Cheryl Seals.

Editor information

Editors and Affiliations

Siemens (United States), Princeton, NJ, USA
Helmut Degen
Foundation for Research and Technology – Hellas (FORTH), Heraklion, Crete, Greece
Stavroula Ntoa

About this paper

Cite this paper

Liu, J. et al. (2022). Speech Disorders Classification by CNN in Phonetic E-Learning System. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2022. Lecture Notes in Computer Science, vol 13336. Springer, Cham. https://doi.org/10.1007/978-3-031-05643-7_36

DOI: https://doi.org/10.1007/978-3-031-05643-7_36
Published: 15 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05642-0
Online ISBN: 978-3-031-05643-7
eBook Packages: Computer Science, Computer Science (R0)
