Computer Science > Computer Vision and Pattern Recognition

arXiv:2312.08869 (cs)

[Submitted on 10 Dec 2023 (v1), last revised 30 Mar 2024 (this version, v2)]

Title: I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions

Authors: Chengfeng Zhao, Juze Zhang, Jiashen Du, Ziwei Shan, Junye Wang, Jingyi Yu, Jingya Wang, Lan Xu

Abstract: We are living in a world surrounded by diverse and "smart" devices with rich modalities of sensing ability. Conveniently capturing the interactions between us humans and these objects remains far-reaching. In this paper, we present I'm-HOI, a monocular scheme to faithfully capture the 3D motions of both the human and object in a novel setting: using a minimal amount of RGB camera and object-mounted Inertial Measurement Unit (IMU). It combines general motion inference and category-aware refinement. For the former, we introduce a holistic human-object tracking method to fuse the IMU signals and the RGB stream and progressively recover the human motions and subsequently the companion object motions. For the latter, we tailor a category-aware motion diffusion model, which is conditioned on both the raw IMU observations and the results from the previous stage under over-parameterization representation. It significantly refines the initial results and generates vivid body, hand, and object motions. Moreover, we contribute a large dataset with ground truth human and object motions, dense RGB inputs, and rich object-mounted IMU measurements. Extensive experiments demonstrate the effectiveness of I'm-HOI under a hybrid capture setting. Our dataset and code will be released to the community.

Comments:	Accepted to CVPR 2024. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.08869 [cs.CV]
	(or arXiv:2312.08869v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.08869

Submission history

From: Chengfeng Zhao
[v1] Sun, 10 Dec 2023 08:25:41 UTC (28,011 KB)
[v2] Sat, 30 Mar 2024 07:23:20 UTC (32,345 KB)
