Constructing Balance from Imbalance for Long-Tailed Image Recognition

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13680))

Included in the following conference series:

European Conference on Computer Vision

3149 Accesses

Abstract

Long-tailed image recognition presents massive challenges to deep learning systems since the imbalance between majority (head) classes and minority (tail) classes severely skews the data-driven deep neural networks. Previous methods tackle with data imbalance from the viewpoints of data distribution, feature space, and model design, etc. In this work, instead of directly learning a recognition model, we suggest confronting the bottleneck of head-to-tail bias before classifier learning, from the previously omitted perspective of balancing label space. To alleviate the head-to-tail bias, we propose a concise paradigm by progressively adjusting label space and dividing the head classes and tail classes, dynamically constructing balance from imbalance to facilitate the classification. With flexible data filtering and label space mapping, we can easily embed our approach to most classification models, especially the decoupled training methods. Besides, we find the separability of head-tail classes varies among different features with different inductive biases. Hence, our proposed model also provides a feature evaluation method and paves the way for long-tailed feature learning. Extensive experiments show that our method can boost the performance of state-of-the-arts of different types on widely-used benchmarks. Code is available at https://github.com/silicx/DLSA.

Y. Xu and Y.-L. Li—Contributed equally.

C. Lu—Member of Qing Yuan Research Institute and Shanghai Qi Zhi Institute.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Long-tailed image recognition through balancing discriminant quality

Article 07 July 2023

SAFA: Sample-Adaptive Feature Augmentation for Long-Tailed Image Classification

Contrastive dual-branch network for long-tailed visual recognition

Article 20 December 2024

References

Buda, M., Maki, A., Mazurowski, M.A.: A systematic study of the class imbalance problem in convolutional neural networks. Neural Netw. 106, 249–259 (2018)
Article Google Scholar
Byrd, J., Lipton, Z.: What is the effect of importance weighting in deep learning? In: International Conference on Machine Learning, pp. 872–881. PMLR (2019)
Google Scholar
Cao, K., Wei, C., Gaidon, A., Arechiga, N., Ma, T.: Learning imbalanced datasets with label-distribution-aware margin loss. arXiv preprint arXiv:1906.07413 (2019)
Chao, Y.W., Wang, Z., He, Y., Wang, J., Deng, J.: HICO: a benchmark for recognizing human-object interactions in images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1017–1025 (2015)
Google Scholar
Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning, pp. 1597–1607. PMLR (2020)
Google Scholar
Chou, Hsin-Ping., Chang, Shih-Chieh., Pan, Jia-Yu., Wei, Wei, Juan, Da-Cheng.: Remix: rebalanced mixup. In: Bartoli, Adrien, Fusiello, Andrea (eds.) ECCV 2020. LNCS, vol. 12540, pp. 95–110. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-65414-6_9
Chapter Google Scholar
Chu, P., Bian, X., Liu, S., Ling, H.: Feature space augmentation for long-tailed data. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12374, pp. 694–710. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58526-6_41
Chapter Google Scholar
Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 702–703 (2020)
Google Scholar
Cui, J., Zhong, Z., Liu, S., Yu, B., Jia, J.: Parametric contrastive learning. arXiv preprint arXiv:2107.12028 (2021)
Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. J. Stat. Mech Theory Exp. 2005(09), P09008 (2005)
Article Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Google Scholar
Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real NVP. arXiv preprint arXiv:1605.08803 (2016)
Gidaris, S., Komodakis, N.: Dynamic few-shot visual learning without forgetting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4367–4375 (2018)
Google Scholar
Gupta, A., Dollar, P., Girshick, R.: LVIS: a dataset for large vocabulary instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 5356–5364 (2019)
Google Scholar
He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)
Article Google Scholar
He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. arXiv preprint arXiv:1911.05722 (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
He, Y.Y., Wu, J., Wei, X.S.: Distilling virtual examples for long-tailed recognition. arXiv preprint arXiv:2103.15042 (2021)
Huang, C., Li, Y., Loy, C.C., Tang, X.: Learning deep representation for imbalanced classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5375–5384 (2016)
Google Scholar
Huang, C., Li, Y., Loy, C.C., Tang, X.: Deep imbalanced learning for face recognition and attribute prediction. IEEE Trans. Pattern Anal. Mach. Intell. 42(11), 2781–2794 (2019)
Article Google Scholar
Izmailov, P., Kirichenko, P., Finzi, M., Wilson, A.G.: Semi-supervised learning with normalizing flows. In: International Conference on Machine Learning, pp. 4615–4630. PMLR (2020)
Google Scholar
Jamal, M.A., Brown, M., Yang, M.H., Wang, L., Gong, B.: Rethinking class-balanced methods for long-tailed visual recognition from a domain adaptation perspective. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7610–7619 (2020)
Google Scholar
Jamal, M.A., Brown, M., Yang, M.H., Wang, L., Gong, B.: Rethinking class-balanced methods for long-tailed visual recognition from a domain adaptation perspective. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7610–7619 (2020)
Google Scholar
Japkowicz, N., Stephen, S.: The class imbalance problem: a systematic study. Intell. Data Anal. 6(5), 429–449 (2002)
Article Google Scholar
Jiang, Z., Chen, T., Mortazavi, B., Wang, Z.: Self-damaging contrastive learning. arXiv preprint arXiv:2106.02990 (2021)
Kang, B., Li, Y., Xie, S., Yuan, Z., Feng, J.: Exploring balanced feature spaces for representation learning. In: International Conference on Learning Representations (2020)
Google Scholar
Kang, B., Xie, S., Rohrbach, M., Yan, Z., Gordo, A., Feng, J., Kalantidis, Y.: Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217 (2019)
Kiefer, J., Wolfowitz, J.: Stochastic estimation of the maximum of a regression function. Ann. Math. Statist. 23(3), 462–466 (1952)
Google Scholar
Kim, B., Kim, J.: Adjusting decision boundary for class imbalanced learning. IEEE Access 8, 81674–81685 (2020)
Article Google Scholar
Kim, J., Jeong, J., Shin, J.: M2m: imbalanced classification via major-to-minor translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13896–13905 (2020)
Google Scholar
Kingma, D.P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., Welling, M.: Improved variational inference with inverse autoregressive flow. Adv. Neural. Inf. Process. Syst. 29, 4743–4751 (2016)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural. Inf. Process. Syst. 25, 1097–1105 (2012)
Google Scholar
Li, S., Gong, K., Liu, C.H., Wang, Y., Qiao, F., Cheng, X.: MetaSAug: meta semantic augmentation for long-tailed visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5212–5221 (2021)
Google Scholar
Li, Y.L., et al.: Hake: a knowledge engine foundation for human activity understanding. arXiv preprint arXiv:2202.06851 (2022)
Li, Y.L., et al.: PaStaNet: toward human activity knowledge engine. In: CVPR (2020)
Google Scholar
Li, Y.L., Xu, Y., Mao, X., Lu, C.: Symmetry and group in attribute-object compositions. In: CVPR (2020)
Google Scholar
Li, Y.L., et al.: Transferable interactiveness knowledge for human-object interaction detection. In: CVPR (2019)
Google Scholar
Liu, Z., Miao, Z., Zhan, X., Wang, J., Gong, B., Yu, S.X.: Large-scale long-tailed recognition in an open world. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2537–2546 (2019)
Google Scholar
Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11) (2008)
Google Scholar
Matthews, B.W.: Comparison of the predicted and observed secondary structure of t4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Structure 405(2), 442–451 (1975)
Google Scholar
Menon, A.K., Jayasumana, S., Rawat, A.S., Jain, H., Veit, A., Kumar, S.: Long-tail learning via logit adjustment. arXiv preprint arXiv:2007.07314 (2020)
Oord, A.V.D., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018)
Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. arXiv preprint arXiv:1705.07057 (2017)
Qi, H., Brown, M., Lowe, D.G.: Low-shot learning with imprinted weights. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5822–5830 (2018)
Google Scholar
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning, pp. 8748–8763. PMLR (2021)
Google Scholar
Ren, J., et al.: Balanced meta-softmax for long-tailed visual recognition. arXiv preprint arXiv:2007.10740 (2020)
Rezende, D., Mohamed, S.: Variational inference with normalizing flows. In: International conference on machine learning, pp. 1530–1538. PMLR (2015)
Google Scholar
Samuel, D., Atzmon, Y., Chechik, G.: From generalized zero-shot learning to long-tail with class descriptors. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 286–295 (2021)
Google Scholar
Samuel, D., Atzmon, Y., Chechik, G.: From generalized zero-shot learning to long-tail with class descriptors. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 286–295 (2021)
Google Scholar
Samuel, D., Chechik, G.: Distributional robustness loss for long-tail learning. arXiv preprint arXiv:2104.03066 (2021)
Samuel, D., Chechik, G.: Distributional robustness loss for long-tail learning. arXiv preprint arXiv:2104.03066 (2021)
Shen, Li., Lin, Zhouchen, Huang, Qingming: Relay backpropagation for effective learning of deep convolutional neural networks. In: Leibe, Bastian, Matas, Jiri, Sebe, Nicu, Welling, Max (eds.) ECCV 2016. LNCS, vol. 9911, pp. 467–482. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_29
Chapter Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sinha, S., Ohashi, H., Nakamura, K.: Class-wise difficulty-balanced loss for solving class-imbalance. In: Proceedings of the Asian Conference on Computer Vision (2020)
Google Scholar
Spain, M., Perona, P.: Measuring and predicting importance of objects in our visual world (2007)
Google Scholar
Van Horn, G., et al.: The inaturalist species classification and detection dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8769–8778 (2018)
Google Scholar
Wang, J., Lukasiewicz, T., Hu, X., Cai, J., Xu, Z.: Rsg: A simple but effective module for learning imbalanced datasets. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3784–3793 (2021)
Google Scholar
Wang, P., Han, K., Wei, X.S., Zhang, L., Wang, L.: Contrastive learning based hybrid networks for long-tailed image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 943–952 (2021)
Google Scholar
Wang, X., Lian, L., Miao, Z., Liu, Z., Yu, S.X.: Long-tailed recognition by routing diverse distribution-aware experts. arXiv preprint arXiv:2010.01809 (2020)
Wang, Y.X., Ramanan, D., Hebert, M.: Learning to model the tail. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 7032–7042 (2017)
Google Scholar
Wu, Tz-Ying., Morgado, Pedro, Wang, Pei, Ho, Chih-Hui., Vasconcelos, Nuno: Solving long-tailed recognition with deep realistic taxonomic classifier. In: Vedaldi, Andrea, Bischof, Horst, Brox, Thomas, Frahm, Jan-Michael. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 171–189. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3_11
Chapter Google Scholar
Xiang, Liuyu, Ding, Guiguang, Han, Jungong: Learning from multiple experts: self-paced knowledge distillation for long-tailed classification. In: Vedaldi, Andrea, Bischof, Horst, Brox, Thomas, Frahm, Jan-Michael. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 247–263. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_15
Chapter Google Scholar
Yang, Y., Xu, Z.: Rethinking the value of labels for improving class-imbalanced learning. arXiv preprint arXiv:2006.07529 (2020)
Zhang, S., Chen, C., Hu, X., Peng, S.: Balanced knowledge distillation for long-tailed learning. arXiv preprint arXiv:2104.10510 (2021)
Zhang, S., Li, Z., Yan, S., He, X., Sun, J.: Distribution alignment: a unified framework for long-tail visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2361–2370 (2021)
Google Scholar
Zhang, Z., Pfister, T.: Learning fast sample re-weighting without reward data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 725–734 (2021)
Google Scholar
Zhong, Z., Cui, J., Liu, S., Jia, J.: Improving calibration for long-tailed recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16489–16498 (2021)
Google Scholar
Zhong, Z., Cui, J., Liu, S., Jia, J.: Improving calibration for long-tailed recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16489–16498 (2021)
Google Scholar
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2017)
Article Google Scholar
Zhou, B., Cui, Q., Wei, X.S., Chen, Z.M.: BBN: bilateral-branch network with cumulative learning for long-tailed visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9719–9728 (2020)
Google Scholar
Zipf, G.K.: The psycho-biology of language: an introduction to dynamic philology. Routledge (2013)
Google Scholar

Download references

Acknowledgement

This work was supported by the National Key R &D Program of China (No. 2021ZD0110700), Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102), Shanghai Qi Zhi Institute, and SHEITC (2018-RGZN-02046).

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Yue Xu, Yong-Lu Li, Jiefeng Li & Cewu Lu
Hong Kong University of Science and Technology, Hong Kong, China
Yong-Lu Li

Authors

Yue Xu
View author publications
You can also search for this author in PubMed Google Scholar
Yong-Lu Li
View author publications
You can also search for this author in PubMed Google Scholar
Jiefeng Li
View author publications
You can also search for this author in PubMed Google Scholar
Cewu Lu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Cewu Lu .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 242 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Xu, Y., Li, YL., Li, J., Lu, C. (2022). Constructing Balance from Imbalance for Long-Tailed Image Recognition. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_3

Download citation

DOI: https://doi.org/10.1007/978-3-031-20044-1_3
Published: 20 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20043-4
Online ISBN: 978-3-031-20044-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics