Abstract
Voice pathologies are widespread in society. However, the diagnostic exams are invasive and uncomfortable for the patient, and their outcome depends on the experience of the physician performing the evaluation. Classifying and recognizing speech pathologies non-invasively through acoustic analysis saves time for both the patient and the specialist while making the analyses objective and efficient. This work presents a detailed description of an aid system for diagnosing speech pathologies associated with the larynx. The interface displays the parameters that physicians use most to classify subjects: absolute Jitter, relative Jitter, absolute Shimmer, relative Shimmer, Harmonic to Noise Ratio (HNR) and autocorrelation. The parameters used by the classification model are also presented (relative Jitter, absolute Jitter, RAP Jitter, PPQ5 Jitter, absolute Shimmer, relative Shimmer, APQ3 Shimmer, APQ5 Shimmer, fundamental frequency, HNR, autocorrelation, Shannon entropy, logarithmic entropy and the subject's sex), together with a description of the entire data pre-processing: treatment of outliers using the quartile method, then data normalization and, finally, Principal Component Analysis (PCA) to reduce the dimension. The selected classification model is a Wide Neural Network, with an accuracy of 98% and an AUC of 0.99.
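The Jitter and Shimmer measures named in the abstract can be illustrated with a minimal sketch. It assumes that the glottal period lengths and the peak amplitudes of each cycle have already been extracted from a sustained vowel, and it uses the commonly cited definitions of these measures, which may differ in detail from the authors' implementation. All function names and sample values are hypothetical.

```python
import math
from statistics import mean


def absolute_jitter(periods):
    """Mean absolute difference between consecutive glottal periods.

    The result has the same unit as the input (for example seconds).
    """
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return mean(diffs)


def relative_jitter(periods):
    """Absolute jitter divided by the mean period, as a percentage."""
    return 100.0 * absolute_jitter(periods) / mean(periods)


def absolute_shimmer_db(amplitudes):
    """Mean absolute log-ratio of consecutive peak amplitudes, in dB."""
    ratios = [
        abs(20.0 * math.log10(b / a))
        for a, b in zip(amplitudes, amplitudes[1:])
    ]
    return mean(ratios)


def relative_shimmer(amplitudes):
    """Mean absolute amplitude difference over the mean amplitude, in percent."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    return 100.0 * mean(diffs) / mean(amplitudes)


# Hypothetical values: four glottal periods (seconds) and their peak amplitudes.
example_periods = [0.0100, 0.0102, 0.0099, 0.0101]
example_amplitudes = [1.0, 0.9, 1.0, 0.95]

example_measures = {
    "absolute_jitter_s": absolute_jitter(example_periods),
    "relative_jitter_pct": relative_jitter(example_periods),
    "absolute_shimmer_db": absolute_shimmer_db(example_amplitudes),
    "relative_shimmer_pct": relative_shimmer(example_amplitudes),
}
```

Each function needs at least two cycles, because the measures are built from differences between consecutive cycles. Amplitudes must be strictly positive for the dB version of Shimmer.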
Acknowledgments
The authors are grateful to the Foundation for Science and Technology (FCT, Portugal) for financial support through national funds FCT/MCTES (PIDDAC) to CeDRI (UIDB/05757/2020 and UIDP/05757/2020), SusTEC (LA/P/0007/2021) and 2021.04729.BD.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fernandes, J., Freitas, D., Teixeira, J.P. (2023). First Version of a Support System for the Medical Diagnosis of Pathologies in the Larynx. In: Roque, A.C.A., et al. Biomedical Engineering Systems and Technologies. BIOSTEC 2022. Communications in Computer and Information Science, vol 1814. Springer, Cham. https://doi.org/10.1007/978-3-031-38854-5_1
DOI: https://doi.org/10.1007/978-3-031-38854-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38853-8
Online ISBN: 978-3-031-38854-5
eBook Packages: Computer Science, Computer Science (R0)