Abstract
With the development of deep learning, the network architectures and accuracy of monocular depth estimation algorithms have improved greatly. However, these complex network structures make real-time processing difficult to achieve on embedded platforms. Consequently, this study proposes a lightweight encoder-decoder structure based on the U-Net model. Depthwise separable convolutions are introduced into both the encoder and the decoder to optimize the network structure, reduce computational complexity, and increase running speed, making the algorithm more suitable for embedded platforms. While producing depth images of comparable accuracy, the number of network parameters is reduced by up to eight times, and the running speed more than doubles. The results show the proposed method to be effective and of reference value for monocular depth estimation algorithms running on embedded platforms.
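The roughly eightfold parameter reduction reported above follows directly from the arithmetic of depthwise separable convolution, which factors a standard convolution into a per-channel (depthwise) filter followed by a 1×1 (pointwise) mixing convolution. The sketch below illustrates the count for one layer; the kernel size and channel counts are chosen for illustration and are not taken from the paper's architecture:

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.
# Channel counts and kernel size below are illustrative assumptions only.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    # A standard conv learns c_out filters, each of shape k x k x c_in.
    return k * k * c_in * c_out

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    # Depthwise step: one k x k filter per input channel.
    # Pointwise step: a 1 x 1 conv that mixes channels (c_in x c_out weights).
    return k * k * c_in + c_in * c_out

c_in, c_out, k = 128, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 147456
sep = depthwise_separable_params(c_in, c_out, k)  # 17536
print(f"reduction factor: {std / sep:.1f}x")      # about 8.4x
```

For a 3×3 kernel with equal input and output channel counts, the ratio approaches k² + a small pointwise term, which is consistent with the "up to eight times" reduction claimed in the abstract.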
Data Availability
Enquiries about data availability should be directed to the authors.
Acknowledgements
The authors thank Godard and his team for sharing their results.
Funding
This work was supported by the National Natural Science Foundation of China (NSFC Grant No. 61903124).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wei, F., Yin, X., Shen, J. et al. OptiDepthNet: A Real-Time Unsupervised Monocular Depth Estimation Network. Wireless Pers Commun 128, 2831–2846 (2023). https://doi.org/10.1007/s11277-022-10074-9