MAD Data documentation

This document contains the up-to-date list of data files released by MAD. If you do not have access to the data yet, please follow the steps here.

Data ZOO:

  • DataInspection.ipynb: A Jupyter notebook containing the data walkthrough. It shows how to navigate the content of all the files below.

  • annotations.tar.xz: Tar file containing the annotations for the dataset. It contains a single JSON file for each split.

  • CLIP_B32_frames_features_5fps.h5: H5 file containing the frame-level embeddings for the videos contained in MAD, computed with the CLIP B/32 model. Frames are extracted at 5 FPS. This file is used to compute the performance of the Zero-Shot CLIP and VLG-Net baselines.

  • CLIP_B32_language_tokens_features.h5: H5 file containing the token-level embeddings for the entire dataset's query annotations extracted with the CLIP B/32 model. This file is used to train and evaluate the VLG-Net baseline.

  • CLIP_B32_language_features_MAD_test.h5: H5 file containing the sentence-level embeddings for the test set query annotations extracted with the CLIP B/32 model. This file is only used to compute the Zero-Shot CLIP baseline performance.

  • CLIP_L14_frames_features_5fps.h5: H5 file containing the frame-level embeddings for the videos contained in MAD, computed with the CLIP L/14 model. Frames are extracted at 5 FPS.

  • CLIP_L14_language_tokens_features.h5: H5 file containing the token-level embeddings for the entire dataset's query annotations extracted with the CLIP L/14 model.
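
As a rough illustration of how the files above might be used together, the sketch below reads the per-split JSON files straight from annotations.tar.xz and converts a time window in seconds into a range of frame-feature rows at 5 FPS. It relies on assumptions that this README does not confirm: the function names `load_annotations` and `seconds_to_frame_range` are hypothetical, the JSON schema is not inspected (each file is simply parsed as-is), and row `i` of a movie's frame features is assumed to correspond to time `i / 5` seconds. DataInspection.ipynb remains the reference for the actual file layout.

```python
import json
import math
import tarfile

FPS = 5  # frame features are extracted at 5 FPS, per the list above


def load_annotations(archive_path):
    """Parse every .json member of the archive (hypothetical helper).

    Returns a dict mapping the member name inside the archive to the
    parsed JSON content. The schema of each file is left untouched.
    """
    splits = {}
    with tarfile.open(archive_path, mode="r:xz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".json"):
                with archive.extractfile(member) as handle:
                    splits[member.name] = json.load(handle)
    return splits


def seconds_to_frame_range(start_s, end_s, fps=FPS):
    """Map a time window in seconds to a half-open range of feature rows.

    Assumption: row i of the frame features covers time i / fps seconds.
    The range always contains at least one row.
    """
    first = int(math.floor(start_s * fps))
    last = max(first + 1, int(math.ceil(end_s * fps)))
    return first, last
```

For example, `seconds_to_frame_range(2.0, 4.5)` gives `(10, 23)`, which could then be used to slice a per-movie feature array read from one of the H5 files (for instance with h5py), provided the array layout matches the assumption above.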