Abstract
Learning from online data with noisy web labels is gaining more attention due to the increasing cost of fully annotated datasets in large-scale multi-label classification tasks. Partial (positive) annotated data, as a particular case of data with noisy labels, are economically accessible. And they serve as benchmarks to evaluate the learning capacity of state-of-the-art methods in real scenarios, though they contain a large number of samples with false negative labels. Existing (partial) multi-label methods are usually studied in the Euclidean space, where the relationship between the label embeddings and image features is not symmetrical and thus can be challenging to learn. To alleviate this problem, we propose reformulating the task into a hyperspherical space, where an angular margin can be incorporated into a hyperspherical multi-label loss function. This margin allows us to effectively balance the impact of false negative and true positive labels. We further design a mechanism to tune the angular margin and scale adaptively. We investigate the effectiveness of our method under three multi-label scenarios (single positive labels, partial positive labels and full labels) on four datasets (VOC12, COCO, CUB-200 and NUS-WIDE). In the single and partial positive labels scenarios, our method achieves state-of-the-art performance. The robustness of our method is verified by comparing the performances at different proportions of partial positive labels in the datasets. Our method also obtains more than 1% improvement over the BCE loss even on the fully annotated scenario. Analysis shows that the learned label embeddings potentially correspond to actual label correlation, since in hyperspherical space label embeddings and image features are symmetrical and interchangeable. This further indicates the geometric interpretability of our method. Code is available at https://github.com/TencentYoutuResearch/MultiLabel-HML.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Akbarnejad, A.H., Baghshah, M.S.: An efficient semi-supervised multi-label classifier capable of handling missing labels. IEEE Trans. Knowl. Data Eng. 31(2), 229–242 (2018)
Bustos, A., Pertusa, A., Salinas, J.M., de la Iglesia-Vayá, M.: PadChest: a large chest x-ray image dataset with multi-label annotated reports. Med. Image Anal. 66, 101797 (2020)
Chaudhari, S., Mithal, V., Polatkan, G., Ramanath, R.: An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. (TIST) 12(5), 1–32 (2021)
Chen, Z.M., Wei, X.S., Wang, P., Guo, Y.: Multi-label image recognition with graph convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5177–5186 (2019)
Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: a real-world web image database from national university of Singapore. In: Proceedings of the ACM international conference on image and video retrieval, pp. 1–9 (2009)
Cid-Sueiro, J.: Proper losses for learning from partial labels. In: Advances in Neural Information Processing Systems, pp. 1565–1573. Citeseer (2012)
Cole, E., Mac Aodha, O., Lorieul, T., Perona, P., Morris, D., Jojic, N.: Multi-label learning from single positive labels. In: CVPR, pp. 933–942 (2021)
Cour, T., Sapp, B., Taskar, B.: Learning from partial labels. J. Mach. Learn. Res. 12, 1501–1536 (2011)
Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4690–4699 (2019)
Dong, H.C., Li, Y.F., Zhou, Z.H.: Learning from semi-supervised weak-label data. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)
Dosovitskiy, A., et al.: An image is worth 16 \(\times \) 16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2020)
Durand, T., Mehrasa, N., Mori, G.: Learning a deep convnet for multi-label classification with partial labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 647–657 (2019)
Elkan, C., Noto, K.: Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 213–220 (2008)
Everingham, M., Eslami, S.A., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)
Fan, X., Jiang, W., Luo, H., Fei, M.: SphereReID: deep hypersphere manifold embedding for person re-identification. J. Vis. Commun. Image Represent. 60, 51–58 (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778 (2016)
He, X., Zemel, R.: Learning hybrid models for image annotation with partially labeled data. Adv. Neural. Inf. Process. Syst. 21, 625–632 (2008)
Huang, S.J., Zhou, Z.H.: Multi-label learning by exploiting label correlations locally. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 26 (2012)
Huynh, D., Elhamifar, E.: Interactive multi-label CNN learning with partial labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9423–9432 (2020)
Jin, R., Ghahramani, Z.: Learning with multiple labels. In: NIPS, vol. 2, pp. 897–904. Citeseer (2002)
Kang, F., Jin, R., Sukthankar, R.: Correlated label propagation with application to multi-label learning. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2, pp. 1719–1726. IEEE (2006)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (Poster) (2015), http://arxiv.org/abs/1412.6980
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural. Inf. Process. Syst. 25, 1097–1105 (2012)
Lanchantin, J., Wang, T., Ordonez, V., Qi, Y.: General multi-label image classification with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16478–16488 (2021)
Li, Q., Peng, X., Qiao, Y., Peng, Q.: Learning label correlations for multi-label image recognition with graph networks. Pattern Recogn. Lett. 138, 378–384 (2020)
Li, W., et al.: WebVision challenge: visual learning and understanding with web data. ArXiv preprint arXiv:1705.05640 (2017)
Li, X., Liu, B.: Learning to classify texts using positive and unlabeled data. In: IJCAI, vol. 3, pp. 587–592. Citeseer (2003)
Lin, T., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., Song, L.: SphereFace: deep hypersphere embedding for face recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 212–220 (2017)
Liu, W., et al.: Deep hyperspherical learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 3953–3963 (2017)
Liu, Z., et al.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10012–10022 (October 2021)
Luo, H., Gu, Y., Liao, X., Lai, S., Jiang, W.: Bag of tricks and a strong baseline for deep person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Luo, H., et al.: A strong baseline and batch normalization neck for deep person re-identification. IEEE Trans. Multimedia 22(10), 2597–2609 (2019)
Meng, Q., Zhang, W.: Multi-label image classification with attention mechanism and graph convolutional networks. In: Proceedings of the ACM Multimedia Asia, pp. 1–6. ACM (2019)
Nguyen, N., Caruana, R.: Classification with partial labels. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 551–559 (2008)
Pham, H., Dai, Z., Xie, Q., Le, Q.V.: Meta pseudo labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11557–11568 (2021)
Ranftl, R., Bochkovskiy, A., Koltun, V.: Vision transformers for dense prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12179–12188 (2021)
Ridnik, T., et al.: Asymmetric loss for multi-label classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 82–91 (2021)
Sun, Y.Y., Zhang, Y., Zhou, Z.H.: Multi-label learning with weak label. In: Twenty-fourth AAAI conference on artificial intelligence (2010)
Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., Jégou, H.: Training data-efficient image transformers & distillation through attention. In: International Conference on Machine Learning, pp. 10347–10357. PMLR (2021)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-UCSD birds-200-2011 dataset. Journal (2011)
Wang, F., Cheng, J., Liu, W., Liu, H.: Additive margin softmax for face verification. IEEE Signal Process. Lett. 25(7), 926–930 (2018)
Wang, F., Xiang, X., Cheng, J., Yuille, A.L.: Normface: L2 hypersphere embedding for face verification. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 1041–1049 (2017)
Wang, H., et al.: CosFace: Large margin cosine loss for deep face recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5265–5274 (2018)
Wang, L., Liu, Y., Qin, C., Sun, G., Fu, Y.: Dual relation semi-supervised multi-label learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 6227–6234 (2020)
Wang, Y., He, D., Li, F., Long, X., Zhou, Z., Ma, J., Wen, S.: Multi-label classification with label graph superimposing. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12265–12272 (2020)
Wen*, Y., Liu*, W., Weller, A., Raj, B., Singh, R.: Sphereface2: binary classification is all you need for deep face recognition. In: 10th International Conference on Learning Representations (ICLR) (2022). https://openreview.net/forum?id=l3SDgUh7qZO, *equal contribution
Wojke, N., Bewley, A.: Deep cosine metric learning for person re-identification. In: 2018 IEEE winter conference on applications of computer vision (WACV), pp. 748–756. IEEE (2018)
Xie, M.K., Huang, S.J.: Partial multi-label learning with noisy label identification. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)
Ye, J., He, J., Peng, X., Wu, W., Qiao, Y.: Attention-driven dynamic graph convolutional network for multi-label image recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12366, pp. 649–665. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58589-1_39
Yu, F., Rawat, A.S., Menon, A., Kumar, S.: Federated learning with only positive labels. In: International Conference on Machine Learning, pp. 10946–10956. PMLR (2020)
Yu, Y., Pedrycz, W., Miao, D.: Multi-label classification by exploiting label correlations. Expert Syst. Appl. 41(6), 2989–3004 (2014)
Zhao, F., Huang, Y., Wang, L., Tan, T.: Deep semantic ranking based hashing for multi-label image retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1556–1564 (2015)
Zhao, J., Yan, K., Zhao, Y., Guo, X., Huang, F., Li, J.: Transformer-based dual relation graph for multi-label image recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 163–172 (2021)
Zhao, J., Zhao, Y., Li, J.: M3tr: Multi-modal multi-label recognition with transformer. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 469–477 (2021)
Zhu, J., Liao, S., Lei, Z., Yi, D., Li, S.: Pedestrian attribute classification in surveillance: Database and evaluation. In: Proceedings of the IEEE international conference on computer vision workshops, pp. 331–338 (2013)
Zhu, Y., Kwok, J.T., Zhou, Z.H.: Multi-label learning with global and local label correlation. IEEE Trans. Knowl. Data Eng. 30(6), 1081–1094 (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ke, B., Zhu, Y., Li, M., Shu, X., Qiao, R., Ren, B. (2022). Hyperspherical Learning in Multi-Label Classification. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_3
Download citation
DOI: https://doi.org/10.1007/978-3-031-19806-9_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19805-2
Online ISBN: 978-3-031-19806-9
eBook Packages: Computer ScienceComputer Science (R0)