Speech Disorders Classification by CNN in Phonetic E-Learning System

  • Conference paper
Artificial Intelligence in HCI (HCII 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13336)

Abstract

Speech disorders may affect the process of phonetic transcription. In the Automated Phonetic Transcription Grading Tool (APTgt), a linguistic e-learning system, we propose a speech disorders classification module that distinguishes disordered speech from non-disordered speech, with the aim of reducing the influence of disordered speech on phonetic exams. Mel-frequency cepstral coefficients (MFCCs) are used to represent the features of the speech sound files. Working with two different formats of MFCCs, we adopted two classification approaches: (1) computing the similarity between MFCC values with the dynamic time warping (DTW) algorithm and classifying the resulting distances with a support vector machine (SVM); and (2) classifying the MFCCs directly as images with a convolutional neural network (CNN). This paper focuses on the second approach.
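
As a rough illustration of the DTW step in the first approach, the sketch below computes a textbook DTW distance between two sequences of feature frames. It is not the authors' implementation: the function name `dtw_distance`, the Euclidean frame distance, and the toy two-dimensional frames standing in for MFCC vectors are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of feature vectors.

    Hypothetical helper for illustration; the frame-to-frame cost is
    assumed to be Euclidean distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy frames standing in for MFCC vectors (made-up values, not real data).
ref = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
slow = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # same shape, stretched in time
other = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]             # different content

d_same = dtw_distance(ref, slow)
d_diff = dtw_distance(ref, other)
```

Because DTW allows one frame to align with several frames of the other sequence, the time-stretched copy stays close to the reference while the unrelated sequence does not. Distances of this kind are what the abstract describes passing to an SVM.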

Author information

Corresponding author

Correspondence to Cheryl Seals.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, J. et al. (2022). Speech Disorders Classification by CNN in Phonetic E-Learning System. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2022. Lecture Notes in Computer Science, vol 13336. Springer, Cham. https://doi.org/10.1007/978-3-031-05643-7_36

  • DOI: https://doi.org/10.1007/978-3-031-05643-7_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-05642-0

  • Online ISBN: 978-3-031-05643-7

  • eBook Packages: Computer Science, Computer Science (R0)
