VLDet: Learning Object-Language Alignments for Open-Vocabulary Object Detection

Learning Object-Language Alignments for Open-Vocabulary Object Detection,
Chuang Lin, Peize Sun, Yi Jiang, Ping Luo, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan, Jianfei Cai,
ICLR 2023 (https://arxiv.org/abs/2211.14843)

Highlight

We are excited to announce that our paper was accepted to ICLR 2023! 🥳🥳🥳

A quick explanatory video demo of VLDet

vldet_demo.mp4

Performance

Open-Vocabulary on COCO

Open-Vocabulary on LVIS

Installation

Requirements

Linux or macOS with Python ≥ 3.7
PyTorch ≥ 1.9 and a torchvision version that matches it. Install them together from pytorch.org to make sure they are compatible. Also check that the PyTorch version matches the one required by Detectron2.
Detectron2: follow Detectron2 installation instructions.

Example conda environment setup

conda create --name VLDet python=3.7 -y
conda activate VLDet
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia

# under your working directory

git clone https://github.com/clin1223/VLDet.git
cd VLDet
cd detectron2
pip install -e .
cd ..
pip install -r requirements.txt

Features

Directly learn an open-vocabulary object detector from image-text pairs by formulating the task as a bipartite matching problem.
State-of-the-art results on Open-vocabulary LVIS and Open-vocabulary COCO.
Easily scale and extend the novel object vocabulary.
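
To make the bipartite matching idea above concrete, here is a minimal sketch, not the VLDet implementation. It assumes each caption word and each region proposal is already embedded in a shared space, scores pairs by cosine similarity, and finds the one-to-one assignment with the highest total similarity by exhaustive search. The function names and toy vectors are hypothetical; for realistic sizes a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would replace the exhaustive loop.

```python
from itertools import permutations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_words_to_regions(word_embs, region_feats):
    """Assign every word to a distinct region, maximizing total similarity.

    Exhaustive search over assignments: only suitable for tiny inputs.
    """
    if len(word_embs) > len(region_feats):
        raise ValueError("need at least as many regions as words")
    sim = [[cosine(w, r) for r in region_feats] for w in word_embs]
    best_score, best_assign = float("-inf"), ()
    # Each permutation picks one distinct region index per word.
    for perm in permutations(range(len(region_feats)), len(word_embs)):
        score = sum(sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assign = score, perm
    return list(best_assign), best_score


# Toy data (hypothetical): two caption words, three region proposals.
words = {"dog": [1.0, 0.0, 0.0], "frisbee": [0.0, 1.0, 0.0]}
regions = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.1], [0.0, 0.0, 1.0]]

assignment, score = match_words_to_regions(list(words.values()), regions)
pairs = dict(zip(words, assignment))
print(pairs)  # {'dog': 1, 'frisbee': 0}
```

Because the assignment is one-to-one, two words cannot claim the same region, which is what distinguishes matching from picking the best region for each word independently.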

Benchmark evaluation and training

Please first prepare the datasets (see prepare_datasets.md).

The VLDet models are finetuned from the corresponding Box-Supervised models (indicated by MODEL.WEIGHTS in the config files). Please train or download the Box-Supervised models and place them under VLDet_ROOT/models/ before training the VLDet models.
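
A small pre-flight check can catch a missing Box-Supervised checkpoint before a long training job starts. The sketch below is an illustration only: it assumes the config is YAML-style text with a `WEIGHTS:` key nested under `MODEL:`, and it scans the text with a regular expression rather than loading it through Detectron2. The helper names and the checkpoint file name are hypothetical. If a config inherits its weights from a base config, this simple scan of a single file would not see them.

```python
import re
from pathlib import Path


def find_model_weights(config_text):
    """Return the value of the first `WEIGHTS:` key in the text, or None."""
    match = re.search(
        r"^\s*WEIGHTS:\s*[\"']?([^\"'\s#]+)", config_text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def check_weights(config_text, root="."):
    """Return (path, exists) for the weights referenced by the config."""
    weights = find_model_weights(config_text)
    if weights is None:
        return None, False
    path = Path(root) / weights
    return path, path.is_file()


# Hypothetical config fragment; the file name is made up for illustration.
example_config = """
MODEL:
  WEIGHTS: "models/box_supervised_example.pth"
"""

path, exists = check_weights(example_config)
print(path, "found" if exists else "missing")
```

Pass the repository root (VLDet_ROOT) as `root` so that relative paths such as `models/...` resolve the same way they would when running from that directory.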

To train a model, run

python train_net.py --num-gpus 8 --config-file /path/to/config/name.yaml

To evaluate a trained or pretrained model, run

python train_net.py --num-gpus 8 --config-file /path/to/config/name.yaml --eval-only MODEL.WEIGHTS /path/to/weight.pth
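
When launching several runs from a script, the two commands above can be assembled programmatically. This is a minimal sketch that uses only the flags shown in this README (`--num-gpus`, `--config-file`, `--eval-only`, and the `MODEL.WEIGHTS` override); the `build_command` helper itself is hypothetical and not part of the repository.

```python
import shlex


def build_command(config_file, num_gpus=8, eval_weights=None):
    """Build the argument list for train_net.py.

    If eval_weights is given, the command runs evaluation only with
    MODEL.WEIGHTS overridden; otherwise it starts training.
    """
    cmd = [
        "python", "train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_weights is not None:
        cmd += ["--eval-only", "MODEL.WEIGHTS", eval_weights]
    return cmd


train_cmd = build_command("/path/to/config/name.yaml")
eval_cmd = build_command(
    "/path/to/config/name.yaml", eval_weights="/path/to/weight.pth"
)

# Print shell-ready strings for inspection.
print(" ".join(shlex.quote(arg) for arg in train_cmd))
print(" ".join(shlex.quote(arg) for arg in eval_cmd))
```

To actually launch a run, the list can be passed to `subprocess.run(cmd, check=True)` from the repository root.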

Download the trained network weights here.

OV_COCO	box mAP50	box mAP50_novel
config_RN50	45.8	32.0

OV_LVIS	mask mAP_all	mask mAP_novel
config_RN50	30.1	21.7
config_Swin-B	38.1	26.3

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@article{VLDet,
  title={Learning Object-Language Alignments for Open-Vocabulary Object Detection},
  author={Lin, Chuang and Sun, Peize and Jiang, Yi and Luo, Ping and Qu, Lizhen and Haffari, Gholamreza and Yuan, Zehuan and Cai, Jianfei},
  journal={arXiv preprint arXiv:2211.14843},
  year={2022}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Acknowledgement

This repository is built on top of Detectron2, Detic, RegionCLIP and OVR-CNN. We thank the authors for their hard work.
