[3DV24] Cas6D: Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot,
Generalizable Approach using RGB Images

Panwang Pan¹*, Zhiwen Fan²*, Brandon Y. Feng³*, Peihao Wang², Chenxin Li⁴, Zhangyang Wang²

¹ByteDance ²The University of Texas at Austin ³MIT ⁴The Chinese University of Hong Kong *denotes equal contribution

We present a new cascade framework named Cas6D for few-shot 6DoF pose estimation that is generalizable and uses only RGB images.

Training

1. Download the processed CO3D data (co3d.tar.gz), Google Scanned Objects data (google_scanned_objects.tar.gz), and ShapeNet renderings (shapenet.tar.gz) from here.
2. Download the COCO 2017 training set.
3. Organize the files as follows:

```
Gen6D
|-- data
    |-- GenMOP
        |-- chair
            ...
    |-- LINEMOD
        |-- cat
            ...
    |-- shapenet
        |-- shapenet_cache
        |-- shapenet_render
        |-- shapenet_render_v1.pkl
    |-- co3d_256_512
        |-- apple
            ...
    |-- google_scanned_objects
        |-- 06K3jXvzqIM
            ...
    |-- coco
        |-- train2017
```
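
Before training, it can help to confirm that this layout is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_paths` and the `EXPECTED` list are illustrative. The list mirrors only the entries shown in the tree; per-object folders such as `chair` or `cat` are left out because the tree elides their contents.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
# Per-object subfolders are omitted because the tree elides them with "...".
EXPECTED = [
    "data/GenMOP",
    "data/LINEMOD",
    "data/shapenet/shapenet_cache",
    "data/shapenet/shapenet_render",
    "data/shapenet/shapenet_render_v1.pkl",
    "data/co3d_256_512",
    "data/google_scanned_objects",
    "data/coco/train2017",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumption: the script is run from the project root directory.
    missing = missing_paths(".")
    for rel in missing:
        print(f"missing: {rel}")
    print("layout OK" if not missing else f"{len(missing)} entries missing")
```

An empty return value from `missing_paths` means every listed entry was found under the given root.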

4. Train the detector

```shell
python3 train_model.py --cfg configs/detector/detector_train.yaml
```

5. Train the selector

```shell
python3 train_model.py --cfg configs/selector/selector_train.yaml
```

6. Prepare the validation data for training the refiner

```shell
python3 prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database linemod/cat \
                  --que_split linemod_val \
                  --ref_database linemod/cat \
                  --ref_split linemod_val

python3 prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database genmop/tformer-test \
                  --que_split all \
                  --ref_database genmop/tformer-ref \
                  --ref_split all
```

These commands generate information in data/val, which is used to produce the validation data for the refiner.

7. Train the refiner

```shell
python3 train_model.py --cfg configs/refiner/refiner_train.yaml
```

Evaluate all components together.

```shell
# Evaluate on the object TFormer from the GenMOP/LINEMOD dataset
python3 eval.py --cfg configs/cas6d_train.yaml
```
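
The training and evaluation commands in this README run in a fixed order, so they can be chained from a single script. The following is a minimal sketch, assuming it is launched from the repository root. The names `gen_val_set_cmd`, `STAGES`, and `run_stages` are illustrative and not part of the repository; the script names, flags, and config paths are copied from the commands in this README.

```python
import subprocess
import sys


def gen_val_set_cmd(que_database, que_split, ref_database, ref_split):
    """Build the prepare.py call used in the validation-data step."""
    return [
        "prepare.py", "--action", "gen_val_set",
        "--estimator_cfg", "configs/gen6d_train.yaml",
        "--que_database", que_database, "--que_split", que_split,
        "--ref_database", ref_database, "--ref_split", ref_split,
    ]


# Stage order and arguments are copied from the commands in this README.
STAGES = [
    ("detector", ["train_model.py", "--cfg", "configs/detector/detector_train.yaml"]),
    ("selector", ["train_model.py", "--cfg", "configs/selector/selector_train.yaml"]),
    ("val set (LINEMOD)", gen_val_set_cmd("linemod/cat", "linemod_val", "linemod/cat", "linemod_val")),
    ("val set (GenMOP)", gen_val_set_cmd("genmop/tformer-test", "all", "genmop/tformer-ref", "all")),
    ("refiner", ["train_model.py", "--cfg", "configs/refiner/refiner_train.yaml"]),
    ("eval", ["eval.py", "--cfg", "configs/cas6d_train.yaml"]),
]


def run_stages(stages=STAGES, dry_run=True, python=sys.executable):
    """Run each stage in order and stop at the first failure.

    With dry_run=True the commands are only printed, nothing is executed.
    Returns the list of full command lines that were (or would be) run.
    """
    commands = []
    for name, args in stages:
        cmd = [python, *args]
        commands.append(cmd)
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_stages(dry_run=True)
```

By default the script only prints the commands. Calling `run_stages(dry_run=False)` executes them for real, one after another.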

Acknowledgement

We would like to thank the Gen6D authors for open-sourcing their implementation.

Citation

If you find this repo helpful, please consider citing:

```bibtex
@inproceedings{pan2024learning,
  title={Learning to estimate 6dof pose from limited data: A few-shot, generalizable approach using rgb images},
  author={Pan, Panwang and Fan, Zhiwen and Feng, Brandon Y and Wang, Peihao and Li, Chenxin and Wang, Zhangyang},
  booktitle={2024 International Conference on 3D Vision (3DV)},
  pages={1059--1071},
  year={2024},
  organization={IEEE}
}
```
