Abstract
We present a label-based, semantic distance induced regularization learning method for Fine-grained image recognition (FGIR). In contrast to previous label-based methods that involve a nontrivial optimization in multi-task metric learning, our approach can be integrated into an end-to-end network without introducing any extra parameters, thus easy to be optimized. To this end, a category-level hierarchical distance matrix (HDM) that encodes semantic distance between subcategories through a tree-like label hierarchy is constructed. HDM is then incorporated into a DCNN to aggregate misclassified prediction probabilities for model learning, thus providing additional discriminative information for fine-grained feature learning. Experiments on three fine-grained benchmark datasets (Stanford Cars, FGVC-Aircraft, CUB-Birds) validate the effectiveness of our approach and demonstrate its improvements over previous methods.
The first author is a student.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Akata, Z., Reed, S., Walter, D., Lee, H., Schiele, B.: Evaluation of output embeddings for fine-grained image classification. In: CVPR (2015)
Chen, T., Wu, W., Gao, Y., Dong, L., Luo, X., Lin, L.: Fine-grained representation learning and recognition by exploiting hierarchical semantic embedding. In: ACM MM (2018)
Fong, R., Vedaldi, A.: Net2vec: quantifying and explaining how concepts are encoded by filters in deep neural networks. In: CVPR (2018)
Gebru, T., Hoffman, J., Li, F.F.: Fine-grained recognition in the wild: a multi-task domain adaptation approach. In: ICCV (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
He, X., Peng, Y.: Fine-grained image classification via combining vision and language. In: CVPR (2017)
He, X., Peng, Y.: Weakly supervised learning of part selection model with spatial constraints for fine-grained image classification. In: AAAI (2017)
He, X., Peng, Y., Zhao, J.: Which and how many regions to gaze: focus discriminative regions for fine-grained visual categorization. Int. J. Comput. Vis., 1–21 (2019)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: CVPR (2017)
Krause, J., Stark, M., Deng, J., Li, F.F.: 3D object representations for fine-grained categorization. In: 4th IEEE Workshop on 3D Representation and Recognition at ICCV (2013)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Li, Z., Yang, Y., Liu, X., Wen, S., Xu, W.: Dynamic computational time for visual attention. In: ICCV (2017)
Liu, X., Xia, T., Wang, J., Lin, Y.: Fully convolutional attention localization networks: efficient attention localization for fine-grained recognition. In: arXiv preprint arXiv:1603.06765 (2016)
Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: CVPR (2015)
Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. In: arXiv preprint arXiv:1306.5151 (2013)
Mostajabi, M., Maire, M., Shakhnarovich, G.: Regularizing deep networks by modeling and predicting label structure. In: CVPR (2018)
Dubey, A., Gupta, O., Raskar, R., Naik, N.: Maximum entropy fine-grained classification. In: NIPS (2018)
Peng, Y., He, X., Zhao, J.: Object-part attention model for fine-grained image classification. IEEE Trans. Image Process. 27(3), 1487–1500 (2017)
Simon, M., Rodner, E.: Neural activation constellations: unsupervised part model discovery with convolutional networks. In: ICCV (2015)
Szegedy, C., et al.: Going deeper with convolutions. In: CVPR (2015)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD birds-200-2011 dataset. Technical report, California Institute of Technology (2011)
Wang, D., Shen, Z., Shao, J., Zhang, W., Xue, X., Zhang, Z.: Multiple granularity descriptors for fine-grained categorization. In: ICCV (2015)
Xiao, T., Xu, Y., Yang, K., Zhang, J., Peng, Y., Zhang, Z.: The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In: CVPR (2015)
Xie, S., Yang, T., Wang, X., Lin, Y.: Hyper-class augmented and regularized deep learning for fine-grained image classification. In: CVPR (2015)
Yang, S., Bo, L., Wang, J., Shapiro, L.G.: Unsupervised template learning for fine-grained object recognition. In: NIPS (2012)
Yang, Z., Luo, T., Wang, D., Hu, Z., Gao, J., Wang, L.: Learning to navigate for fine-grained classification. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 438–454. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_26
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Zhang, H., et al.: SPDA-CNN: unifying semantic part detection and abstraction for fine-grained recognition. In: CVPR (2016)
Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_54
Zhang, N., Farrell, R., Darrell, T.: Pose pooling kernels for sub-category recognition. In: CVPR (2012)
Zhang, N., Farrell, R., Iandola, F., Darrell, T.: Deformable part descriptors for fine-grained recognition and attribute prediction. In: ICCV (2013)
Zhang, X., Zhou, F., Lin, Y., Zhang, S.: Embedding label structures for fine-grained feature representation. In: CVPR (2016)
Zhang, X., Xiong, H., Zhou, W., Lin, W., Tian, Q.: Picking deep filter responses for fine-grained image recognition. In: CVPR (2016)
Zhou, F., Lin, Y.: Fine-grained image classification by exploring bipartite-graph labels. In: CVPR (2016)
Zlateski, A., Jaroensri, R., Sharma, P., Durand, F.: On the importance of label quality for semantic segmentation. In: CVPR (2018)
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under Grant 61702197, in part by the Natural Science Foundation of Guangdong Province under Grant 2017A030310261, in part by the program of China Scholarship Council.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Mo, X., Zhu, J., Zhao, X., Liu, M., Wei, T., Luo, W. (2019). Exploiting Category-Level Semantic Relationships for Fine-Grained Image Recognition. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science(), vol 11857. Springer, Cham. https://doi.org/10.1007/978-3-030-31654-9_5
Download citation
DOI: https://doi.org/10.1007/978-3-030-31654-9_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31653-2
Online ISBN: 978-3-030-31654-9
eBook Packages: Computer ScienceComputer Science (R0)