Image2Height: Self-height Estimation from a Single-Shot Image

Kei Shimonishi¹²,
Tyler Fisher¹³,
Hiroaki Kawashima¹⁴ &
…
Kotaro Funakoshi¹²

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Included in the following conference series:

Asian Conference on Pattern Recognition

Abstract

This paper analyzes a method that uses a convolutional architecture to estimate self-height, that is, the height at which an image was captured, from a single-shot image. In contrast to SLAM methods, which rely on geometric calculation over sequential images, the method exploits the object-related scene structure contained in a single image. It can therefore be applied in a variety of domains, from wearable computing (e.g., estimating the wearer’s height) to the analysis of archived images. This paper shows that (1) fine-tuning from a pretrained object-recognition architecture also benefits self-height estimation and that (2) not only visual features but also their locations in the image are fundamental to the self-height estimation task. We verify these two points by comparing different learning conditions, such as preprocessing and initialization, and through visualization and sensitivity analysis, using a dataset obtained in indoor environments.
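To make the second point concrete, the following is a minimal sketch of one common form of sensitivity analysis: occlude one image patch at a time and record how much the predicted height changes. The abstract does not say which sensitivity procedure the authors used, so the occlusion scheme is an assumption, and `occlusion_sensitivity` and `toy_height_model` are hypothetical names. The toy model is only a stand-in for a trained regressor; it is built so that its output depends on where bright features sit vertically, not only on whether they are present.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def occlusion_sensitivity(
    model: Callable[[Image], float],
    image: Image,
    patch: int = 2,
    fill: float = 0.0,
) -> List[List[float]]:
    """Return a grid of |model(occluded) - model(image)|, one cell per patch."""
    base = model(image)
    height, width = len(image), len(image[0])
    grid = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            # Copy the image and overwrite one patch with the fill value.
            occluded = [r[:] for r in image]
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    occluded[y][x] = fill
            row.append(abs(model(occluded) - base))
        grid.append(row)
    return grid


def toy_height_model(image: Image) -> float:
    """Hypothetical stand-in for a trained regressor (NOT the paper's model).

    Returns the intensity-weighted mean row index, so moving a feature up or
    down in the image changes the output.
    """
    total = sum(sum(r) for r in image)
    if total == 0:
        return 0.0
    return sum(y * sum(r) for y, r in enumerate(image)) / total


# A 4x4 image with one bright 2x2 block near the top and one near the bottom.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
grid = occlusion_sensitivity(toy_height_model, image, patch=2)
print(grid)
```

In this toy case only the patches that contain a feature change the output when occluded, and they shift it in opposite directions depending on their vertical position. A real analysis would apply the same loop to the trained network and to real images.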

Notes

1. Currently, FLIR Systems.
2. https://github.com/utkuozbulak/pytorch-cnn-visualizations was modified and used for the implementation.

Acknowledgements

This work is supported by the Cooperative Intelligence Joint Research Chair with Honda Research Institute Japan Co., Ltd.

Author information

Authors and Affiliations

Kyoto University, Kyoto, Kyoto, Japan
Kei Shimonishi & Kotaro Funakoshi
University of Victoria, Victoria, BC, Canada
Tyler Fisher
University of Hyogo, Kobe, Hyogo, Japan
Hiroaki Kawashima

Corresponding author

Correspondence to Kei Shimonishi.

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
Consiglio Nazionale delle Ricerche, ICAR, Naples, Italy
Gabriella Sanniti di Baja
Chinese Academy of Sciences, Beijing, China
Liang Wang
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

About this paper

Cite this paper

Shimonishi, K., Fisher, T., Kawashima, H., Funakoshi, K. (2020). Image2Height: Self-height Estimation from a Single-Shot Image. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_61

DOI: https://doi.org/10.1007/978-3-030-41404-7_61
Published: 23 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer Science, Computer Science (R0)