Abstract
Noisy labels caused by mistakes in manual annotation or data collection hinder the wider application of deep neural networks. Robust learning methods such as Decoupling, Co-teaching, and Joint Training with Co-Regularization are promising for learning with noisy labels, yet they do not fully address the coordination between consistency and diversity, which is crucial for model performance. To tackle this issue, this paper proposes a novel robust learning paradigm called Joint training by combining Consistency and Diversity (JoCaD). JoCaD maximizes the prediction consistency of the networks while preserving sufficient diversity in their representation learning. Specifically, to reconcile consistency and diversity, an effective implementation is proposed that dynamically adjusts the joint loss to boost learning with noisy labels. Extensive experiments on MNIST, CIFAR-10, CIFAR-100, and Clothing1M demonstrate that the proposed JoCaD outperforms representative state-of-the-art methods.
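The full training procedure is given in the paper itself; to make the idea in the abstract concrete, the following is a minimal PyTorch-style sketch of a joint loss that combines a JoCoR-style consistency term (symmetric KL divergence between the two networks' predictions) with a diversity penalty on their feature representations and small-loss sample selection. The function name, the cosine-similarity form of the diversity term, the weights lam_cons and lam_div, and the fixed keep_ratio are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: NOT the authors' code. The exact loss terms,
# weights, and selection schedule used by JoCaD are defined in the paper.
import torch
import torch.nn.functional as F

def jocad_style_loss(logits1, logits2, feats1, feats2, targets,
                     lam_cons=0.85, lam_div=0.1, keep_ratio=0.7):
    # Supervised terms: per-sample cross-entropy for each network.
    ce1 = F.cross_entropy(logits1, targets, reduction="none")
    ce2 = F.cross_entropy(logits2, targets, reduction="none")

    # Consistency term: symmetric KL between the two networks' predictive
    # distributions, as in JoCoR-style co-regularization.
    logp1 = F.log_softmax(logits1, dim=1)
    logp2 = F.log_softmax(logits2, dim=1)
    kl = (F.kl_div(logp1, logp2.exp(), reduction="none").sum(dim=1)
          + F.kl_div(logp2, logp1.exp(), reduction="none").sum(dim=1))

    # Diversity term (assumed form): penalize feature-level similarity so
    # the two networks keep distinct views of the data rather than collapse.
    div = F.cosine_similarity(feats1, feats2, dim=1)

    per_sample = (1 - lam_cons) * (ce1 + ce2) + lam_cons * kl + lam_div * div

    # Small-loss selection: back-propagate only on the samples most likely
    # to be clean (a fixed keep ratio here; a decaying schedule in practice).
    num_keep = max(1, int(keep_ratio * per_sample.numel()))
    keep_idx = torch.topk(-per_sample, num_keep).indices
    return per_sample[keep_idx].mean()
```

In JoCaD proper, the abstract indicates that the balance between consistency and diversity is adjusted dynamically during training; the fixed weights above merely stand in for that schedule.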






Availability of data and materials
The datasets analysed during the current study are available in the JoCaD repository, https://github.com/04756/JoCaD.git.
References
Severyn A, Moschitti A (2015) Twitter sentiment analysis with deep convolutional neural networks. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15), pp 959–962. Association for Computing Machinery, New York, NY, USA
Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, pp 328–339
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90. https://doi.org/10.1145/3065386
Ilić V, Tadić J (2022) Active learning using a self-correcting neural network (ALSCN). Appl Intell 52(2):1956–1968. https://doi.org/10.1007/s10489-021-02515-y
Sun L, Lyu G, Feng S, Huang X (2021) Beyond missing: weakly-supervised multi-label learning with incomplete and noisy labels. Appl Intell 51(3):1552–1564. https://doi.org/10.1007/s10489-020-01878-y
Li Z, Tang J (2017) Weakly-supervised deep nonnegative low-rank model for social image tag refinement and assignment. In: Singh S, Markovitch S (eds) Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, pp 4154–4160. AAAI Press. http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14169
Nguyen V-A et al (2020) CLARA: Confidence of Labels and Raters. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '20), pp 2542–2552. Association for Computing Machinery, New York, NY, USA
Li Z, Tang J (2015) Weakly supervised deep metric learning for community-contributed image retrieval. Trans Multi 17(11):1989–1999. https://doi.org/10.1109/TMM.2015.2477035
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: A simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning, pp 448–456. PMLR
Pan S, Sheng B, He G, Li H, Xue G (2022) BAW: learning from class imbalance and noisy labels with batch adaptation weighted loss. Multimed Tools Appl 81(10):13593–13610. https://doi.org/10.1007/s11042-022-12323-2
Menon AK, Van Rooyen B, Ong CS, Williamson RC (2015) Learning from corrupted binary labels via class-probability estimation. In: Proceedings of the 32nd International Conference on Machine Learning (ICML '15), pp 125–134. JMLR.org
Kong K et al (2022) Penalty based robust learning with noisy labels. Neurocomputing 489:112–127. https://doi.org/10.1016/j.neucom.2022.02.030
Ren M, Zeng W, Yang B, Urtasun R (2018) Learning to reweight examples for robust deep learning. In: Dy J, Krause A (eds) Proceedings of the 35th International Conference on Machine Learning, vol 80 of Proceedings of Machine Learning Research, pp 4334–4343. PMLR
Wei H, Feng L, Chen X, An B (2020) Combating noisy labels by agreement: A joint training method with co-regularization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 13723–13732
Yu X et al (2019) How does disagreement help generalization against label corruption? In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th International Conference on Machine Learning, vol 97 of Proceedings of Machine Learning Research, pp 7164–7173. PMLR
Zhang Q et al (2021) A joint end-to-end framework for learning with noisy labels. Appl Soft Comput 108:107426. https://doi.org/10.1016/j.asoc.2021.107426
Malach E, Shalev-Shwartz S (2017) Decoupling "when to update" from "how to update". In: Guyon I et al (eds) Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, pp 960–970
Yao Y et al (2021) Jo-SRC: A contrastive approach for combating noisy labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 5188–5197
Jiang L, Zhou Z, Leung T, Li L-J, Fei-Fei L (2018) MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels. In: Proceedings of the 35th International Conference on Machine Learning, pp 2304–2313. PMLR
Han B et al (2018) Co-teaching: Robust training of deep neural networks with extremely noisy labels. In: Bengio S et al (eds) Advances in Neural Information Processing Systems, vol 31, pp 8527–8537. Curran Associates, Inc.
Wang W, Arora R, Livescu K, Bilmes J (2015) On deep multi-view representation learning. In: Bach F, Blei D (eds) Proceedings of the 32nd International Conference on Machine Learning, vol 37 of Proceedings of Machine Learning Research, pp 1083–1092. PMLR, Lille, France
Wu F, Dong X, Han L, Jing X-Y, Ji Y-M (2019) Multi-view synthesis and analysis dictionaries learning for classification. IEICE Trans Inf Syst E102.D:659–662. https://doi.org/10.1587/transinf.2018EDL8107
Han W, Feng R, Wang L, Cheng Y (2018) A semi-supervised generative framework with deep learning features for high-resolution remote sensing image scene classification. ISPRS J Photogramm Remote Sens 145:23–43. https://doi.org/10.1016/j.isprsjprs.2017.11.004
Peng J, Estrada G, Pedersoli M, Desrosiers C (2020) Deep co-training for semi-supervised image segmentation. Pattern Recognit 107:107269. https://doi.org/10.1016/j.patcog.2020.107269
Wu F et al (2020) Modality-specific and shared generative adversarial network for cross-modal retrieval. Pattern Recognit 104:107335. https://doi.org/10.1016/j.patcog.2020.107335
Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training, COLT’ 98, 92–100 (Association for Computing Machinery, New York, NY, USA)
Sindhwani V, Niyogi P, Belkin M (2005) A co-regularization approach to semi-supervised learning with multiple views. In: Proceedings of the ICML Workshop on Learning with Multiple Views, pp 74–79
Arpit D et al (2017) A closer look at memorization in deep networks, ICML’17, 233–242 (JMLR.org)
Zhang C, Bengio S, Hardt M, Recht B, Vinyals O (2017) Understanding deep learning requires rethinking generalization. In: Proceedings of the 5th International Conference on Learning Representations (ICLR). OpenReview.net
Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324. https://doi.org/10.1109/5.726791
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Xiao T, Xia T, Yang Y, Huang C, Wang X (2015) Learning from massive noisy labeled data for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2691–2699
Patrini G, Rozza A, Menon AK, Nock R, Qu L (2017) Making deep neural networks robust to label noise: A loss correction approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2233–2241
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770–778
Cheng H et al (2021) Learning with instance-dependent label noise: A sample sieve approach. In: Proceedings of the 9th International Conference on Learning Representations (ICLR)
Fatras K et al (2021) Wasserstein adversarial regularization for learning with label noise. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2021.3094662
Li X, Liu T, Han B, Niu G, Sugiyama M (2021) Provably end-to-end label-noise learning without anchor points. In: Meila M, Zhang T (eds) Proceedings of the 38th International Conference on Machine Learning, vol 139 of Proceedings of Machine Learning Research, pp 6403–6413. PMLR
Wang Y et al (2019) Symmetric cross entropy for robust learning with noisy labels. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp 322–330
Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp 2999–3007
Kim Y, Lee Y, Jeon M (2021) Imbalanced image classification with complement cross entropy. Pattern Recognit Lett 151:33–40. https://doi.org/10.1016/j.patrec.2021.07.017. https://www.sciencedirect.com/science/article/pii/S016786552100266X
Li B, Liu Y, Wang X (2019) Gradient harmonized single-stage detector. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 33, pp 8577–8584. AAAI Press. https://doi.org/10.1609/aaai.v33i01.33018577
Funding
This work is supported by the National Natural Science Foundation of China (51827813), the National Key Research and Development Program of China (2022YFB2603302), and the R&D Program of Beijing Municipal Education Commission (KJZD20191000402).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, H., Yin, H., Yang, Z. et al. JoCaD: a joint training method by combining consistency and diversity. Multimed Tools Appl 83, 64573–64589 (2024). https://doi.org/10.1007/s11042-024-18221-z