Abstract
The capsule network (CapsNet) is a novel concept that demonstrates the importance of learning spatial hierarchical relationships between features for effective image recognition. However, the baseline capsule network performs poorly on complex images. This limitation can be attributed in part to the inability of CapsNets to extract important features from the input images, and in part to their attempt to account for every object in the image, including background objects. To address these problems, we propose a capsule network variant that is less complex yet robust, with strong feature extraction capabilities. The model combines a Gabor filter with a custom preprocessing block to learn the structural and semantic information in the image. This restricts extraction to the important features, resulting in improved activation diagrams that enable meaningful hierarchical information to be learned. Experimental results show that the proposed model achieves test accuracies of 85.24%, 68.17%, 94.78% and 91.50% on complex image datasets such as CIFAR-10, CIFAR-100, Fashion-MNIST and kvasir-dataset-v2, respectively. The performance of the proposed model is comparable to that of state-of-the-art models on the five datasets, with a relatively small number of parameters.
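As a rough illustration of the Gabor filtering mentioned above, the sketch below builds a small bank of real-valued Gabor kernels (a Gaussian envelope multiplied by a cosine carrier) at evenly spaced orientations. The abstract does not give the filter parameters or say how the filters feed into the capsule layers, so the function names `gabor_kernel` and `gabor_bank`, the kernel size, and all default values here are assumptions chosen only for illustration.

```python
import math


def gabor_kernel(size=7, sigma=2.0, theta=0.0, wavelength=4.0,
                 gamma=0.5, phase=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    Assumes `size` is odd so the kernel has a single centre pixel.
    All default values are illustrative, not taken from the paper.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            x_r = x * math.cos(theta) + y * math.sin(theta)
            y_r = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope; gamma sets the aspect ratio.
            envelope = math.exp(
                -(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2)
            )
            # Cosine carrier along the rotated x axis.
            carrier = math.cos(2.0 * math.pi * x_r / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(n_orientations=4, **kwargs):
    """Return kernels at n_orientations evenly spaced angles in [0, pi)."""
    return [
        gabor_kernel(theta=k * math.pi / n_orientations, **kwargs)
        for k in range(n_orientations)
    ]


bank = gabor_bank(4, size=7)
print(len(bank), len(bank[0]), len(bank[0][0]))
```

Convolving an image with each kernel in such a bank yields orientation-selective edge and texture responses. In a network, kernels like these would typically initialise or replace the first convolutional layer, though whether the proposed model does so is not stated here.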









Funding
This research was supported by the National Natural Science Foundation of China (NSFC Grant No. 61550110248); Research on Sino-Tibetan multi-source information acquisition, fusion, data mining and its application (Grant No. H04W170186) and Sichuan Science and Technology Program (Grant No. 2019YFG0190).
About this article
Cite this article
Abra Ayidzoe, M., Yu, Y., Mensah, P.K. et al. Gabor capsule network with preprocessing blocks for the recognition of complex images. Machine Vision and Applications 32, 91 (2021). https://doi.org/10.1007/s00138-021-01221-6
DOI: https://doi.org/10.1007/s00138-021-01221-6