Classifiers Ensemble of HMM and d-Vectors in Biometric Speaker Verification

  • Conference paper
Advances in Soft Computing (MICAI 2020)

Abstract

This paper presents a novel approach to text-dependent biometric speaker verification (SV) based on an ensemble of two feature extraction and classification processes: Hidden Markov Models in a Universal Background Model framework (HMM-UBM), and d-Vectors derived from a Deep Neural Network (DNN) structure. Once the individual SV systems are trained, a third classifier is trained and tuned on the individual test scores from the same dataset, using three different approaches for comparison: a Multilayer Perceptron (MLP), a Support Vector Machine (SVM) with three different kernels, and a Fuzzy Inference System (FIS). Results obtained on a proprietary Spanish speech database indicate improved performance: classifier ensembles yield an Equal Error Rate (EER) in the range of 0.7%–2.54%, versus an EER of 3.6% and above obtained on average with the individual classifiers. Detailed results comparing the several approaches used in this experimental work are further described.
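As a rough illustration of the two ideas in the abstract, score-level fusion and the Equal Error Rate, the following minimal Python sketch computes an EER for two individual systems and for their fused scores. Everything in it is an assumption made for illustration: the function names `equal_error_rate` and `fuse_scores` are hypothetical, the scores are invented toy values, and the weighted sum is only a simple stand-in for the MLP, SVM, and FIS fusion classifiers the paper actually compares.

```python
from typing import List, Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumes higher scores mean "more likely the claimed speaker".
    """
    best_gap, best_eer, best_thr = float("inf"), 1.0, 0.0
    # Every observed score is a candidate decision threshold.
    for thr in sorted(set(genuine) | set(impostor)):
        frr = sum(s < thr for s in genuine) / len(genuine)     # genuine trials rejected
        far = sum(s >= thr for s in impostor) / len(impostor)  # impostor trials accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


def fuse_scores(a: Sequence[float], b: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Weighted-sum fusion (assumption: a stand-in, not the paper's MLP/SVM/FIS)."""
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


# Invented toy scores: four genuine and four impostor trials per system.
hmm_genuine, hmm_impostor = [0.9, 0.8, 0.4, 0.7], [0.1, 0.5, 0.3, 0.2]
dvec_genuine, dvec_impostor = [0.6, 0.3, 0.9, 0.8], [0.2, 0.1, 0.4, 0.3]

hmm_eer, _ = equal_error_rate(hmm_genuine, hmm_impostor)
dvec_eer, _ = equal_error_rate(dvec_genuine, dvec_impostor)
fused_eer, _ = equal_error_rate(
    fuse_scores(hmm_genuine, dvec_genuine),
    fuse_scores(hmm_impostor, dvec_impostor),
)

print(f"HMM-UBM  EER: {hmm_eer:.2%}")
print(f"d-Vector EER: {dvec_eer:.2%}")
print(f"Fused    EER: {fused_eer:.2%}")
```

On these toy values each individual system makes a different mistake, so averaging the two scores separates genuine and impostor trials better than either system alone. The numbers are illustrative only and say nothing about the figures reported in the paper.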

Author information

Corresponding author

Correspondence to Juan Carlos Atenco-Vazquez.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Atenco-Vazquez, J.C., Moreno-Rodriguez, J.C., Cruz-Vega, I., Gomez-Gil, P., Arechiga, R., Ramirez-Cortes, J.M. (2020). Classifiers Ensemble of HMM and d-Vectors in Biometric Speaker Verification. In: Martínez-Villaseñor, L., Herrera-Alcántara, O., Ponce, H., Castro-Espinoza, F.A. (eds) Advances in Soft Computing. MICAI 2020. Lecture Notes in Computer Science, vol 12468. Springer, Cham. https://doi.org/10.1007/978-3-030-60884-2_9

  • DOI: https://doi.org/10.1007/978-3-030-60884-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60883-5

  • Online ISBN: 978-3-030-60884-2

  • eBook Packages: Computer Science, Computer Science (R0)
