Abstract
This paper presents a novel approach to text-dependent biometric speaker verification (SV) based on an ensemble of two feature extraction and classification processes: Hidden Markov Models in a Universal Background Model framework (HMM-UBM), and d-Vectors derived from a Deep Neural Network (DNN) structure. Once the individual SV systems are trained, a third classifier is trained and tuned on the individual test scores from the same dataset, using three different approaches for comparison: a Multilayer Perceptron (MLP), a Support Vector Machine (SVM) with three different kernels, and a Fuzzy Inference System (FIS). Results obtained on a proprietary Spanish speech database indicate improved performance, with an Equal Error Rate (EER) in the range of 0.7%–2.54% when classifier ensembles are used, versus an EER of 3.6% and above obtained on average with the individual classifiers. Detailed results comparing the several approaches used in this experimental work are also described.
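To make the score-level ensemble and the EER metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes each trial yields one score from each subsystem, generates synthetic scores in place of the proprietary database, and uses a plain logistic-regression fuser as a stand-in for the MLP, SVM, and FIS fusers named above. All function and variable names are hypothetical.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Approximate the EER: the operating point where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest to equal."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # A trial is accepted when its score is >= the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def fit_logistic_fusion(scores, labels, lr=0.1, epochs=2000):
    """Fit weights for a linear fusion of subsystem scores by gradient
    descent on the logistic loss (stand-in for MLP/SVM/FIS fusion)."""
    X = np.hstack([scores, np.ones((len(scores), 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - labels) / len(labels)
    return w


def fuse(scores, w):
    """Return the fused score (logit) for each trial."""
    X = np.hstack([scores, np.ones((len(scores), 1))])
    return X @ w


# Synthetic data (assumption): label 1 = genuine trial, 0 = impostor trial.
rng = np.random.default_rng(0)
n = 1000
labels = np.concatenate([np.ones(n), np.zeros(n)])
hmm_scores = 1.5 * labels + rng.normal(size=2 * n)   # hypothetical HMM-UBM scores
dvec_scores = 1.5 * labels + rng.normal(size=2 * n)  # hypothetical d-vector scores
scores = np.column_stack([hmm_scores, dvec_scores])

# Hold out half of the trials for evaluating the fused system.
order = rng.permutation(2 * n)
train, test = order[:n], order[n:]
w = fit_logistic_fusion(scores[train], labels[train])

genuine = labels[test] == 1
eer_hmm = equal_error_rate(hmm_scores[test][genuine], hmm_scores[test][~genuine])
eer_dvec = equal_error_rate(dvec_scores[test][genuine], dvec_scores[test][~genuine])
fused = fuse(scores[test], w)
eer_fused = equal_error_rate(fused[genuine], fused[~genuine])

print(f"EER HMM-UBM: {eer_hmm:.3f}  d-vector: {eer_dvec:.3f}  fused: {eer_fused:.3f}")
```

Because the synthetic scores are drawn from arbitrary Gaussians, the printed values illustrate the procedure only and say nothing about the EERs reported in the paper.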
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Atenco-Vazquez, J.C., Moreno-Rodriguez, J.C., Cruz-Vega, I., Gomez-Gil, P., Arechiga, R., Ramirez-Cortes, J.M. (2020). Classifiers Ensemble of HMM and d-Vectors in Biometric Speaker Verification. In: Martínez-Villaseñor, L., Herrera-Alcántara, O., Ponce, H., Castro-Espinoza, F.A. (eds) Advances in Soft Computing. MICAI 2020. Lecture Notes in Computer Science, vol 12468. Springer, Cham. https://doi.org/10.1007/978-3-030-60884-2_9
DOI: https://doi.org/10.1007/978-3-030-60884-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60883-5
Online ISBN: 978-3-030-60884-2
eBook Packages: Computer Science, Computer Science (R0)