NeighborMix data augmentation for image recognition

Feipeng Wang¹,² (ORCID: orcid.org/0000-0003-2905-3723),
Kerong Ben¹,
Hu Peng² &
…
Meini Yang¹

Abstract

Data augmentation enriches the diversity of training datasets and thereby improves the generalization ability of deep learning models. Existing augmentation methods have achieved excellent performance on image recognition. However, the images they generate do little to improve the accuracy and robustness of models trained on natural scene data. In this paper, an augmentation method called NeighborMix is proposed to address this problem. NeighborMix performs only simple operations on the original image. Specifically, it selects and pastes an occlusion object closely related to the original input image in two successive stages. In the first stage, a block from a single original image is selected as the occlusion object. In the second stage, the occlusion object is copied and pasted into the original image to generate a new image. Extensive experiments on benchmark datasets demonstrate that NeighborMix achieves top-1 accuracy of 93.54%, 70.62%, and 89.30% on CIFAR-10, CIFAR-100, and STL-10, respectively, indicating superior model generalization. It is also the most robust method, with a standard error of 0.05%, and outperforms state-of-the-art data augmentation techniques.
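The two-stage procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the block is chosen, how large it is, or where it is pasted, so the function name `neighbormix_sketch`, the `block_frac` parameter, and the uniform random choice of source and target positions are all assumptions made here for illustration.

```python
import numpy as np


def neighbormix_sketch(image, block_frac=0.25, rng=None):
    """Copy one block of `image` and paste it elsewhere in the same image.

    Hypothetical sketch: block size and both positions are sampled
    uniformly at random, which the abstract does not specify.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Assumed block size: a fixed fraction of each spatial dimension.
    bh = max(1, int(round(h * block_frac)))
    bw = max(1, int(round(w * block_frac)))

    # Stage 1: select a block from the original image as the occlusion object.
    sy = int(rng.integers(0, h - bh + 1))
    sx = int(rng.integers(0, w - bw + 1))
    block = image[sy:sy + bh, sx:sx + bw].copy()

    # Stage 2: paste the occlusion object back into a copy of the same image.
    ty = int(rng.integers(0, h - bh + 1))
    tx = int(rng.integers(0, w - bw + 1))
    augmented = image.copy()
    augmented[ty:ty + bh, tx:tx + bw] = block
    return augmented


# Usage on a CIFAR-sized random image (32 x 32 RGB).
demo_rng = np.random.default_rng(0)
demo_image = demo_rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
demo_aug = neighbormix_sketch(demo_image, block_frac=0.25, rng=demo_rng)
```

Because the occlusion object comes from the input image itself, the augmented image contains only content from that image; whether the published method keeps the label unchanged or constrains the paste position is not stated in the abstract.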


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61763019.

Author information

Authors and Affiliations

College of Electronic Engineering, Naval University of Engineering, Qiaokou District Hanshui Bridge Street, Wuhan, 430074, Hubei, China
Feipeng Wang, Kerong Ben & Meini Yang
School of Computer and Big Data Science, Jiujiang University, Shili Street, Jiujiang, 332005, Jiangxi, China
Feipeng Wang & Hu Peng

Corresponding authors

Correspondence to Feipeng Wang or Kerong Ben.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, F., Ben, K., Peng, H. et al. NeighborMix data augmentation for image recognition. Multimed Tools Appl 83, 26581–26598 (2024). https://doi.org/10.1007/s11042-023-16603-3

Received: 07 June 2022
Revised: 23 May 2023
Accepted: 21 August 2023
Published: 01 September 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s11042-023-16603-3
