Computer Science > Computer Vision and Pattern Recognition

arXiv:2409.02581 (cs)

[Submitted on 4 Sep 2024]

Title:Object Gaussian for Monocular 6D Pose Estimation from Sparse Views

Authors:Luqing Luo, Shichu Sun, Jiangang Yang, Linfang Zheng, Jinwei Du, Jian Liu

Abstract:Monocular object pose estimation, as a pivotal task in computer vision and robotics, heavily depends on accurate 2D-3D correspondences, which often demand costly CAD models that may not be readily available. Object 3D reconstruction methods offer an alternative, among which recent advancements in 3D Gaussian Splatting (3DGS) afford a compelling potential. Yet its performance still suffers and tends to overfit with fewer input views. Embracing this challenge, we introduce SGPose, a novel framework for sparse view object pose estimation using Gaussian-based methods. Given as few as ten views, SGPose generates a geometric-aware representation by starting with a random cuboid initialization, eschewing reliance on Structure-from-Motion (SfM) pipeline-derived geometry as required by traditional 3DGS methods. SGPose removes the dependence on CAD models by regressing dense 2D-3D correspondences between images and the reconstructed model from sparse input and random initialization, while the geometric-consistent depth supervision and online synthetic view warping are key to the success. Experiments on typical benchmarks, especially on the Occlusion LM-O dataset, demonstrate that SGPose outperforms existing methods even under sparse view constraints, under-scoring its potential in real-world applications.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2409.02581 [cs.CV]
	(or arXiv:2409.02581v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.02581

Submission history

From: Luqing Luo [view email]
[v1] Wed, 4 Sep 2024 10:03:11 UTC (4,506 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Object Gaussian for Monocular 6D Pose Estimation from Sparse Views

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Object Gaussian for Monocular 6D Pose Estimation from Sparse Views

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators