Abstract
This paper studies user-dependent score fusion for multilevel speaker recognition. After reviewing related work in multimodal biometric authentication, we describe a new score fusion technique that uses a form of Bayesian adaptation to derive personalized fusion functions from prior user-independent data. Experimental results are reported using the MIT Lincoln Laboratory multilevel speaker verification system. They show that the proposed adapted fusion method outperforms both user-independent and non-adapted user-dependent fusion approaches.
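The abstract does not give the adaptation formulas, so the following is only a generic sketch of the idea of adapting a user-independent fusion function with a few user-specific scores. It assumes Gaussian score models with a shared diagonal variance, MAP adaptation of the client mean only, and a log-likelihood-ratio fusion rule. The function names and the `relevance` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def map_adapt_mean(prior_mean, user_scores, relevance=4.0):
    """MAP-adapt a Gaussian mean toward a user's own score vectors.

    `relevance` is an assumed relevance factor: larger values keep the
    adapted mean closer to the user-independent prior.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    user_scores = np.asarray(user_scores, dtype=float).reshape(-1, prior_mean.size)
    n = user_scores.shape[0]
    if n == 0:
        # No user data available: fall back to the user-independent mean.
        return prior_mean.copy()
    alpha = n / (n + relevance)
    return alpha * user_scores.mean(axis=0) + (1.0 - alpha) * prior_mean

def llr_fusion(x, client_mean, impostor_mean, var):
    """Fuse a vector of subsystem scores into one log-likelihood ratio.

    With a shared diagonal variance the Gaussian normalization terms cancel.
    """
    x = np.asarray(x, dtype=float)
    ll_client = -0.5 * np.sum((x - client_mean) ** 2 / var)
    ll_impostor = -0.5 * np.sum((x - impostor_mean) ** 2 / var)
    return ll_client - ll_impostor

# Illustrative (made-up) user-independent parameters for two subsystems.
prior_client = np.array([1.0, 1.0])
prior_impostor = np.array([-1.0, -1.0])
var = np.array([1.0, 1.0])

# Two genuine score vectors from one user, then adaptation and fusion.
adapted_client = map_adapt_mean(prior_client, [[2.0, 0.0], [2.0, 0.0]], relevance=2.0)
fused = llr_fusion([1.5, 0.5], adapted_client, prior_impostor, var)
```

With no user data the adapted mean equals the prior, so the fusion rule reduces to the user-independent one; as more user scores arrive, the adapted mean moves toward the user's own statistics.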
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fierrez-Aguilar, J., Garcia-Romero, D., Ortega-Garcia, J., Gonzalez-Rodriguez, J. (2005). Speaker Verification Using Adapted User-Dependent Multilevel Fusion. In: Oza, N.C., Polikar, R., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2005. Lecture Notes in Computer Science, vol 3541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494683_36
DOI: https://doi.org/10.1007/11494683_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26306-7
Online ISBN: 978-3-540-31578-0
eBook Packages: Computer Science (R0)