Multi-sensor 3D object dataset for object recognition with full pose estimation

  • Computational Intelligence for Vision and Robotics
Neural Computing and Applications

Abstract

In this work, we propose a new dataset for 3D object recognition captured with the high-resolution Kinect V2 sensor and other popular low-cost devices such as the PrimeSense Carmine. Most existing datasets for 3D object recognition lack features such as 3D pose information for the objects in the scene, per-pixel segmentation, or the level of occlusion. We therefore propose a new dataset that combines all of this information and can be used to validate existing and new 3D object recognition algorithms. Moreover, with the advent of the Kinect V2 sensor, we are able to provide high-resolution RGB and depth data using a single sensor, whereas other datasets had to combine multiple sensors. In addition, we will also provide semiautomatic segmentation and semantic labels for the different parts of the objects, so that the dataset can be used for testing robot grasping and scene labeling systems as well as for object recognition.




Notes

  1. http://www.dtic.ua.es/~agarcia/dataset.

  2. http://www.danielgm.net/cc/.

  3. https://github.com/Blitzman/multisensor-dataset-tools.

Acknowledgments

This work was partially funded by the Spanish Government DPI2013-40534-R Grant. This work has also been funded by the grant “Ayudas para Estudios de Máster e Iniciación a la Investigación” from the University of Alicante.

Author information

Corresponding author

Correspondence to Alberto Garcia-Garcia.

Cite this article

Garcia-Garcia, A., Orts-Escolano, S., Oprea, S. et al. Multi-sensor 3D object dataset for object recognition with full pose estimation. Neural Comput & Applic 28, 941–952 (2017). https://doi.org/10.1007/s00521-016-2224-9

  • DOI: https://doi.org/10.1007/s00521-016-2224-9
