Abstract
Recovering the 6D pose of an object has attracted considerable attention because of its applications, for example in intelligent robotic manipulation. This paper presents an approach for refining 6D object pose from noisy depth images acquired with a consumer depth sensor. Compared with state-of-the-art methods that target the same goal, the proposed method offers high precision, robustness to partial occlusion and noise, low computational cost, and fast convergence. This is achieved with an iterative scheme that uses only a random forest to minimize a cost function of the object pose, which quantifies the misalignment between the ground-truth pose and the estimated pose. The random forest is trained solely on synthetic depth images rendered from a 3D model of the object. Several experiments show that the proposed approach outperforms the ICP-based and optimization-based algorithms that are commonly used for 6D pose refinement in depth images. Moreover, the iterative process of our algorithm can run much faster than the state of the art while using only one CPU core.

Acknowledgments
This work has been supported by the National Natural Science Foundation of China (Grant No. 61673261).
Cite this article
Zhang, H., Cao, Q. Fast 6D object pose refinement in depth images. Appl Intell 49, 2287–2300 (2019). https://doi.org/10.1007/s10489-018-1376-y