Abstract
This paper presents a novel system that performs text-independent speaker authentication using new spiking neural network (SNN) architectures. Each speaker is represented by a set of prototype vectors trained with a standard Hebbian rule and a winner-takes-all approach. Every speaker has a separate spiking network that computes normalized similarity scores from MFCC (Mel Frequency Cepstral Coefficient) features, taking both speaker and background models into account. Experiments with the VidTIMIT dataset show that the system performs similarly to a benchmark method based on vector quantization. The system's main advantage is that it can be optimized for performance, speed, and energy efficiency. A procedure for creating and merging neurons is also presented, which enables adaptive, on-line training in an evolvable way.
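The scoring idea described in the abstract can be illustrated with a minimal, non-spiking sketch. This is not the authors' SNN implementation: it is a plain vector-quantization analogue, closer in spirit to the benchmark method than to the spiking networks. It assumes that MFCC frames are already available as lists of floats, and every function name, the learning rate, the distance measure, and the zero threshold are hypothetical choices made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_prototypes(frames, n_prototypes, epochs=10, lr=0.1, seed=0):
    """Winner-takes-all training (assumed stand-in for the paper's Hebbian rule).

    Prototypes start as randomly chosen training frames; for each frame,
    only the closest prototype (the winner) is moved toward that frame.
    """
    rng = random.Random(seed)
    prototypes = [list(f) for f in rng.sample(frames, n_prototypes)]
    for _ in range(epochs):
        for frame in frames:
            winner = min(prototypes, key=lambda p: sq_dist(p, frame))
            for i, x in enumerate(frame):
                winner[i] += lr * (x - winner[i])
    return prototypes


def mean_min_dist(frames, prototypes):
    """Average distance from each frame to its nearest prototype."""
    total = sum(min(sq_dist(f, p) for p in prototypes) for f in frames)
    return total / len(frames)


def normalized_score(frames, speaker_protos, background_protos):
    """Background distance minus speaker distance (assumed normalization).

    Positive values mean the frames fit the claimed speaker better than
    the background model.
    """
    return (mean_min_dist(frames, background_protos)
            - mean_min_dist(frames, speaker_protos))


def authenticate(frames, speaker_protos, background_protos, threshold=0.0):
    """Accept the identity claim if the normalized score exceeds the threshold."""
    return normalized_score(frames, speaker_protos, background_protos) > threshold
```

In this sketch, one prototype set would be trained per enrolled speaker and one on pooled data from other speakers to serve as the background model. In practice the threshold would be tuned on held-out data rather than fixed at zero.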
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wysoski, S.G., Benuskova, L., Kasabov, N. (2007). Text-Independent Speaker Authentication with Spiking Neural Networks. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74695-9_78
DOI: https://doi.org/10.1007/978-3-540-74695-9_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74693-5
Online ISBN: 978-3-540-74695-9
eBook Packages: Computer Science, Computer Science (R0)