Abstract
Visual odometry can be augmented by depth information, such as that provided by RGB-D cameras or by lidars paired with cameras. However, such depth information is often limited by the sensors, leaving large areas in the visual images where depth is unavailable. Here, we propose a method that utilizes depth, even if only sparsely available, in the recovery of camera motion. In addition, the method recovers depth by structure from motion, using the previously estimated motion, and exploits salient visual features for which depth is unavailable. The method is therefore able to extend RGB-D visual odometry to large-scale, open environments where depth often cannot be sufficiently acquired. The core of our method is a bundle adjustment step that refines the motion estimates in parallel, processing a sequence of images in a batch optimization. We have evaluated our method in three sensor setups: one using an RGB-D camera, and two using combinations of a camera and a 3D lidar. Our method is ranked #4 on the KITTI odometry benchmark irrespective of sensing modality, in comparison with stereo visual odometry methods that retrieve depth by triangulation. The resulting average position error is 1.14% of the distance traveled.
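To make the core idea concrete, the sketch below is a minimal illustration under our own assumptions, not the authors' implementation: all names are hypothetical, the camera is reduced to normalized image coordinates (unit focal length, zero principal point), and a generic nonlinear least-squares solver stands in for the paper's pipeline. It shows how features with and without depth can jointly constrain frame-to-frame motion: features with known depth contribute 3D-2D reprojection residuals, while features without depth contribute 2D-2D epipolar residuals, and both are minimized together.

```python
# Hypothetical sketch (not the authors' implementation): jointly using
# features with and without depth to constrain frame-to-frame motion.
# Assumes normalized image coordinates (unit focal length, zero principal point).
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def residuals(x, pts3d, obs3d, rays_prev, obs_nodepth):
    """Stack two residual types for a motion guess x = (rotation vector, translation):
    3D-2D reprojection residuals for features with depth, and 2D-2D epipolar
    residuals for features without depth."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    t = x[3:]
    # Depth-available features: transform into the current frame and project.
    p = (R @ pts3d.T).T + t
    r1 = (p[:, :2] / p[:, 2:3] - obs3d).ravel()
    # Depth-unavailable features: epipolar constraint x_cur^T [t]_x R x_prev = 0.
    tx = np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])
    E = tx @ R
    x_cur = np.hstack([obs_nodepth, np.ones((len(obs_nodepth), 1))])
    r2 = np.einsum("ij,jk,ik->i", x_cur, E, rays_prev)
    return np.concatenate([r1, r2])


if __name__ == "__main__":
    # Synthetic example: recover a known small motion from noiseless matches.
    rng = np.random.default_rng(0)
    rv_true, t_true = np.array([0.02, -0.01, 0.03]), np.array([0.1, 0.0, 0.2])
    R_true = Rotation.from_rotvec(rv_true).as_matrix()
    P = rng.uniform([-1, -1, 4], [1, 1, 8], (40, 3))   # points in previous frame
    Q = (R_true @ P.T).T + t_true                      # same points, current frame
    obs = Q[:, :2] / Q[:, 2:3]                         # observed projections
    rays = np.hstack([P[20:, :2] / P[20:, 2:3], np.ones((20, 1))])
    fit = least_squares(residuals, np.zeros(6),
                        args=(P[:20], obs[:20], rays, obs[20:]))
    print(np.round(fit.x, 4))  # approximately (rv_true, t_true)
```

A useful property of this combination is that the 3D-2D terms anchor the translation scale, which the epipolar terms alone leave ambiguous; this is why even sparsely available depth suffices to recover metric motion.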
Acknowledgments
This paper is based upon work supported by the National Science Foundation under Grant No. IIS-1328930. Special thanks to S. Scherer, M. Bergerman, D. Huber, S. Nuske, Z. Fang, and D. Maturana for their insightful input.
Ethics declarations
Conflict of Interest
All authors are affiliated with Carnegie Mellon University. Michael Kaess was affiliated with the Massachusetts Institute of Technology within the last three years and serves as an associate editor of IEEE Transactions on Robotics. Sanjiv Singh is the Editor-in-Chief of the Journal of Field Robotics.
Funding
This study was funded by the National Science Foundation (Grant No. IIS-1328930).
Cite this article
Zhang, J., Kaess, M. & Singh, S. A real-time method for depth enhanced visual odometry. Auton Robot 41, 31–43 (2017). https://doi.org/10.1007/s10514-015-9525-1