[paper][project page][demo][code]
The official data curation code for CameraSettings20k (from "Camera Settings as Tokens: Modeling Photography on Latent Diffusion Models", SIGGRAPH Asia 2024).
We highly recommend using Conda to build the environment.
You can build and activate the environment with the following commands.
conda env create -f env.yml
conda activate CameraSettings20k
After activating the environment, you need to install OpenCV separately by running the following command, due to a dependency issue with LAVIS.
pip install opencv-python
Please download the RAISE, DDPD, and PPR10k datasets and place their raw images as follows.
CameraSettings20k_src ┬ RAISE_raw
├ DDPD_raw # Put indoor and outdoor CR2 images in the same folder
├ PPR10k_raw # Put raw images ("PPR10K-dataset/raw") in this folder
└ PPR10K_360_tif # Put 360p tif images ("PPR10K-dataset/train_val_images_tif_360p") in this folder
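The source layout above can be prepared with a short shell snippet (directory names follow the tree above; the dataset archives themselves must still be downloaded and extracted into these folders manually):

```shell
# Create the expected CameraSettings20k source layout.
SRC=CameraSettings20k_src
mkdir -p "$SRC/RAISE_raw" \
         "$SRC/DDPD_raw" \
         "$SRC/PPR10k_raw" \
         "$SRC/PPR10K_360_tif"
```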
After preparing the source datasets, you can curate CameraSettings20k (without image captions) by running the following command.
python data_curation.py --image_size <image_size> --source_dir <path to CameraSettings20k_src> --target_dir <path to save CameraSettings20k>
This code will generate the CameraSettings20k dataset with the following structure for training with diffusers and datasets.
CameraSettings20k - train ┬ metadata.jsonl
├ <image_id_0>.png
├ <image_id_1>.png
├ ...
└ <image_id_n>.png
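As a sketch of how this layout works: the "file_name" key in each metadata.jsonl line is what the Hugging Face datasets imagefolder loader uses to pair a record with its image; the remaining fields below (ISO, aperture, exposure time) are illustrative assumptions, not the exact schema produced by data_curation.py.

```python
import json

# Hypothetical metadata.jsonl records. "file_name" is the key expected
# by the datasets imagefolder loader; the camera-setting fields are
# placeholders for illustration only.
records = [
    {"file_name": "img_0000.png", "iso": 100, "aperture": "f/8", "exposure_time": "1/250"},
    {"file_name": "img_0001.png", "iso": 400, "aperture": "f/2.8", "exposure_time": "1/60"},
]

# JSON Lines format: one JSON object per line, no enclosing array.
jsonl = "\n".join(json.dumps(r) for r in records)

# Reading the file back is a line-by-line parse.
parsed = [json.loads(line) for line in jsonl.splitlines()]
print(parsed[0]["file_name"])  # img_0000.png
```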
After curating CameraSettings20k, you can generate the image captions by running the following command.
python image_caption.py --dataset_dir <path to CameraSettings20k>
This dataset is for research purposes only. We do not own the rights to the images in the source datasets. Please refer to the original source datasets for the usage of the images.
If you use CameraSettings20k, please cite our paper.
@inproceedings{fang2024camera,
title={Camera Settings as Tokens: Modeling Photography on Latent Diffusion Models},
author={I-Sheng Fang and Yue-Hua Han and Jun-Cheng Chen},
booktitle={SIGGRAPH Asia},
year={2024}
}