Abstract
We present a new system that optimizes feature extraction from 2D-topological data such as images in deep learning by exploiting correlation among training samples through curriculum learning optimization (CLO). The system treats every sample as a 2D random variable, where each pixel in the sample is modelled as an independent and identically distributed (i.i.d.) realization. Under this model, we use information-theoretic and statistical measures to rank individual training samples and the relationships between samples, and from these ranks construct a syllabus. Each sample's rank then determines when that sample is fed to the network during training. Comparative evaluation of multiple state-of-the-art networks, including ResNet, GoogLeNet, and VGG, on benchmark datasets demonstrates that a syllabus ranking samples by measures such as the joint entropy between adjacent samples can improve learning and significantly reduce the number of training steps required to reach a desired training accuracy. We present results indicating that our approach produces robust feature maps that in turn reduce loss by as much as a factor of 9 compared with conventional, no-curriculum training.
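To make the ranking idea concrete, the minimal sketch below (ours, not the authors' implementation) scores 8-bit grayscale images by the joint entropy of adjacent samples and orders them into a syllabus. The function names, the 256-bin histogram estimator, and the low-entropy-first presentation order are all illustrative assumptions; the abstract does not specify the exact procedure.

```python
# Minimal sketch (not the authors' code): rank training images by the joint
# entropy of adjacent samples to form a curriculum "syllabus".
# Assumes same-shaped 8-bit grayscale images; names here are hypothetical.
import numpy as np

def joint_entropy(img_a: np.ndarray, img_b: np.ndarray, bins: int = 256) -> float:
    """Joint Shannon entropy H(A, B) of two same-shaped images, treating
    corresponding pixels as i.i.d. draws from a joint distribution."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                bins=bins, range=[[0, 256], [0, 256]])
    p = hist / hist.sum()          # empirical joint intensity distribution
    p = p[p > 0]                   # drop zero cells; 0 * log 0 := 0
    return float(-np.sum(p * np.log2(p)))

def build_syllabus(images: list[np.ndarray]) -> list[int]:
    """Return a training order: each image is scored by its joint entropy
    with the adjacent (previous) sample, then samples are presented in
    ascending score order (low joint entropy first -- an assumption)."""
    scores = [joint_entropy(images[i - 1], images[i]) for i in range(1, len(images))]
    scores.insert(0, 0.0)          # first sample has no predecessor; pin it first
    return sorted(range(len(images)), key=lambda i: scores[i])

# Usage: feed batches to the network in syllabus order during training.
rng = np.random.default_rng(0)
data = [rng.integers(0, 256, size=(32, 32)).astype(np.uint8) for _ in range(8)]
print(build_syllabus(data))
```

Estimating H(A, B) from a 2D intensity histogram is the standard plug-in estimator for Shannon entropy; swapping in another measure from the paper's family (e.g., mutual information or relative entropy between adjacent samples) would only require replacing `joint_entropy`.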