Abstract
A novel method for improving the generalization performance of Minimum Classification Error (MCE) / Generalized Probabilistic Descent (GPD) learning is proposed. MCE/GPD learning, proposed by Juang and Katagiri in 1992, achieves better recognition performance than maximum-likelihood (ML) based learning in various areas of pattern recognition. Despite this superiority, it still suffers from over-fitting to the training samples, as do other learning algorithms. In the present study, a regularization technique is applied to MCE learning to overcome this problem. Feed-forward neural networks are used as the recognition platform to evaluate the proposed method, and recognition experiments are conducted on several datasets.
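To make the idea concrete, the following is a minimal sketch (not the authors' exact formulation) of a smoothed MCE loss combined with a Tikhonov-style weight penalty. The parameter names `xi`, `eta`, and `lam` are illustrative: `xi` controls the sigmoid smoothing, `eta` the soft-max over competing classes, and `lam` the regularization strength.

```python
import numpy as np

def mce_loss(scores, y, xi=1.0, eta=2.0):
    """Smoothed minimum-classification-error loss for one sample.

    scores : discriminant values g_j(x), one per class (1-D array)
    y      : index of the true class
    xi, eta: smoothing parameters (illustrative names)
    """
    g_true = scores[y]
    rivals = np.delete(scores, y)
    # Soft-max style "anti-discriminant" over the competing classes.
    g_rival = np.log(np.mean(np.exp(eta * rivals))) / eta
    d = g_rival - g_true                      # misclassification measure
    return 1.0 / (1.0 + np.exp(-xi * d))      # sigmoid-smoothed 0/1 loss

def regularized_objective(scores_batch, labels, weights, lam=1e-3):
    """Empirical MCE loss plus a Tikhonov-style (L2) weight penalty."""
    emp = np.mean([mce_loss(s, y) for s, y in zip(scores_batch, labels)])
    return emp + lam * np.sum(weights ** 2)
```

A correctly classified sample with a large margin yields a loss near 0, a misclassified one a loss near 1, and the added penalty discourages the large weights associated with over-fitting.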
Keywords
- Discriminant Function
- Recognition Performance
- Dynamic Time Warping
- Generalization Performance
- Tikhonov Regularizer
References
Keinosuke Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, 1972.
B-H. Juang and S. Katagiri. Discriminative learning for minimum error classification. IEEE Trans. Signal Processing, 40(12):3043–3054, 1992.
S. Amari. A theory of adaptive pattern classifiers. IEEE Trans. Elec. Comput., EC-16(3):299–307, 1967.
T. Kohonen. Learning Vector Quantization. Technical Report TKK-F-A601, Helsinki University of Technology, Laboratory of Computer and Information Science, 1978.
E. Oja et al. The ALSM Algorithm: An Improved Subspace Method of Classification. Pattern Recognition, 16(4):421–427, 1983.
Eric McDermott and Shigeru Katagiri. Prototype-based minimum classification error / generalized probabilistic descent training for various speech units. Computer Speech and Language, pages 351–368, August 1994.
Biing-Hwang Juang, Wu Chou, and Chin-Hui Lee. Minimum classification error rate methods for speech recognition. IEEE Trans. Speech and Audio Processing, 5(3):257–265, 1997.
A. N. Tikhonov and V. Y. Arsenin. Solutions of Ill-Posed Problems. V. H. Winston, 1977.
Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.
Christopher M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
Christopher M. Bishop. Curvature-Driven Smoothing: A Learning Algorithm for Feed-forward Networks. IEEE Trans. Neural Networks, 4(5):882–884, 1993.
D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-propagating errors. Nature, 323:533–536, October 1986.
C. J. Merz and P. M. Murphy. UCI repository of machine learning databases, 1996. http://www.ics.uci.edu/~mlearn/MLRepository.html.
H. Kuwabara, K. Takeda, Y. Sagisaka, S. Katagiri, S. Morikawa, and T. Watanabe. Construction of a large-scale Japanese speech database and its management system. Proc. of Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP89), pages 560–563, 1989.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shimodaira, H., Rokui, J., Nakai, M. (1998). Modified minimum classification error learning and its application to neural networks. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds) Advances in Pattern Recognition. SSPR/SPR 1998. Lecture Notes in Computer Science, vol 1451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033303
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64858-1
Online ISBN: 978-3-540-68526-5
eBook Packages: Springer Book Archive