
JoCaD: a joint training method by combining consistency and diversity

Published in Multimedia Tools and Applications

Abstract

Noisy labels, caused by mistakes in manual annotation or data collection, are a major obstacle to the wider deployment of deep neural networks. Robust learning methods such as Decoupling, Co-teaching, and Joint Training with Co-Regularization are promising for learning with noisy labels, yet they do not fully account for the coordination between consistency and diversity, which is crucial to model performance. To tackle this issue, this paper proposes a novel robust learning paradigm called Joint training by combining Consistency and Diversity (JoCaD). JoCaD aims to maximize the prediction consistency of the peer networks while preserving sufficient diversity in their representation learning. Specifically, to reconcile consistency and diversity, an effective implementation is proposed that dynamically adjusts the joint loss to improve learning with noisy labels. Extensive experiments on MNIST, CIFAR-10, CIFAR-100, and Clothing1M demonstrate that JoCaD outperforms representative state-of-the-art methods.
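To make the idea concrete, the following is a minimal PyTorch sketch of a JoCoR-style joint loss with small-loss sample selection, in the spirit of the dynamic joint-loss adjustment described above. It is not the authors' released implementation: the symmetric-KL consistency term, the trade-off weight lam, and the remember_rate selection are illustrative assumptions about how consistency and diversity might be balanced.

    import torch
    import torch.nn.functional as F

    def joint_loss(logits1, logits2, targets, lam):
        # Supervised term: cross-entropy of both peer networks against the
        # (possibly noisy) labels.
        ce = F.cross_entropy(logits1, targets, reduction="none") + \
             F.cross_entropy(logits2, targets, reduction="none")

        # Consistency term: symmetric KL divergence between the two networks'
        # predicted distributions.
        log_p1 = F.log_softmax(logits1, dim=1)
        log_p2 = F.log_softmax(logits2, dim=1)
        kl = F.kl_div(log_p1, log_p2, reduction="none", log_target=True).sum(dim=1) + \
             F.kl_div(log_p2, log_p1, reduction="none", log_target=True).sum(dim=1)

        # lam is an assumed trade-off weight; adjusting it over training is one
        # way to balance prediction consistency against network diversity.
        return (1.0 - lam) * ce + lam * kl  # per-sample loss, shape (batch,)

    def select_small_loss(per_sample_loss, remember_rate):
        # Treat the fraction of samples with the smallest joint loss as likely
        # clean (co-teaching-style small-loss selection) and train only on those.
        num_keep = max(1, int(remember_rate * per_sample_loss.numel()))
        return torch.argsort(per_sample_loss)[:num_keep]

In one training step, a caller would compute per_sample = joint_loss(net1(x), net2(x), y, lam), select idx = select_small_loss(per_sample, remember_rate), and back-propagate per_sample[idx].mean() through both networks; how lam and remember_rate are scheduled during training is where the consistency/diversity trade-off would actually be controlled.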


Availability of data and materials

The datasets analysed during the current study are available in the JoCaD repository, https://github.com/04756/JoCaD.git.


Funding

This work was supported by the National Natural Science Foundation of China (51827813), the National Key Research and Development Program of China (2022YFB2603302), and the R&D Program of the Beijing Municipal Education Commission (KJZD20191000402).

Author information


Corresponding author

Correspondence to Hui Yin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yang, H., Yin, H., Yang, Z. et al. JoCaD: a joint training method by combining consistency and diversity. Multimed Tools Appl 83, 64573–64589 (2024). https://doi.org/10.1007/s11042-024-18221-z

