
# Dataset Preparation

## Kinetics

The Kinetics dataset can be downloaded via the code released by ActivityNet:

1. Download the videos via the official scripts.

2. After all videos are downloaded, resize them so that the short edge is 256 pixels (see the sketch after the format example below), then prepare the csv files for the training, validation, and testing sets as `train.csv`, `val.csv`, and `test.csv`. The format of each csv file is:

```
path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
```
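As a reference, the resizing step can be done with ffmpeg's scale filter. The snippet below is a minimal sketch, not part of the official scripts: the input and output paths are placeholders you should adapt to your setup.

```bash
IN_DATA_DIR="../../data/kinetics/videos_raw"   # placeholder path
OUT_DATA_DIR="../../data/kinetics/videos_256"  # placeholder path

mkdir -p "${OUT_DATA_DIR}"

for video in "${IN_DATA_DIR}"/*
do
  out_name="${OUT_DATA_DIR}/${video##*/}"
  if [ ! -f "${out_name}" ]; then
    # Scale the short edge to 256 while keeping the aspect ratio;
    # -2 keeps the other dimension even, as most encoders require.
    ffmpeg -i "${video}" \
      -vf "scale='if(gt(iw,ih),-2,256)':'if(gt(iw,ih),256,-2)'" \
      "${out_name}"
  fi
done
```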

All Kinetics models in the Model Zoo are trained and tested with the same data as the Non-local Network. For dataset-specific issues, please reach out to the dataset provider.

## AVA

The AVA dataset can be downloaded from the official site.

We follow the same downloading and preprocessing procedure as Long-Term Feature Banks for Detailed Video Understanding.

You can follow these steps to download and preprocess the data:

1. Download the videos:

```bash
DATA_DIR="../../data/ava/videos"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it."
  mkdir -p "${DATA_DIR}"
fi

wget https://s3.amazonaws.com/ava-dataset/annotations/ava_file_names_trainval_v2.1.txt

# Fetch every video listed in the file-name manifest.
for line in $(cat ava_file_names_trainval_v2.1.txt)
do
  wget "https://s3.amazonaws.com/ava-dataset/trainval/${line}" -P "${DATA_DIR}"
done
```
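The full video set is large, so interrupted downloads are common. A resumable, mildly parallel variant along these lines may help; this is a sketch rather than part of the original scripts (it reuses `DATA_DIR` from above, `wget -c` resumes partial files, and `-P 4` runs four downloads at a time):

```bash
# Sketch: resumable, parallel download of the same manifest.
xargs -P 4 -I {} \
  wget -c "https://s3.amazonaws.com/ava-dataset/trainval/{}" -P "${DATA_DIR}" \
  < ava_file_names_trainval_v2.1.txt
```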
2. Cut each video from its 15th to 30th minute:

```bash
IN_DATA_DIR="../../data/ava/videos"
OUT_DATA_DIR="../../data/ava/videos_15min"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it."
  mkdir -p "${OUT_DATA_DIR}"
fi

for video in "${IN_DATA_DIR}"/*
do
  out_name="${OUT_DATA_DIR}/${video##*/}"
  if [ ! -f "${out_name}" ]; then
    # Keep the annotated 15:00-30:01 span:
    # -ss 900 seeks to 900 s, -t 901 keeps the next 901 s.
    ffmpeg -ss 900 -t 901 -i "${video}" "${out_name}"
  fi
done
```
3. Extract frames:

```bash
IN_DATA_DIR="../../data/ava/videos_15min"
OUT_DATA_DIR="../../data/ava/frames"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it."
  mkdir -p "${OUT_DATA_DIR}"
fi

for video in "${IN_DATA_DIR}"/*
do
  video_name=${video##*/}

  # Strip the extension: 5 characters for ".webm", 4 for ".mp4"/".mkv".
  if [[ $video_name = *".webm" ]]; then
    video_name=${video_name::-5}
  else
    video_name=${video_name::-4}
  fi

  out_video_dir=${OUT_DATA_DIR}/${video_name}/
  mkdir -p "${out_video_dir}"

  out_name="${out_video_dir}/${video_name}_%06d.jpg"

  # Decode at 30 fps into highest-quality JPEGs (-q:v 1).
  ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}"
done
```
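Frame extraction is the longest step, and partial failures are easy to miss. A quick sanity check along these lines (a sketch, not part of the original pipeline) prints the frame count per video so that empty or truncated directories stand out first:

```bash
# Sketch: list per-video frame counts, smallest first.
for d in ../../data/ava/frames/*/
do
  echo "$(ls "${d}" | wc -l) ${d}"
done | sort -n | head
```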
4. Download annotations:

```bash
DATA_DIR="../../data/ava/annotations"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it."
  mkdir -p "${DATA_DIR}"
fi

wget https://research.google.com/ava/download/ava_train_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_val_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_action_list_v2.1_for_activitynet_2018.pbtxt -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_train_excluded_timestamps_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_val_excluded_timestamps_v2.1.csv -P "${DATA_DIR}"
```
  1. Download "frame lists" (train, val) and put them in the frame_lists folder (see structure above).

  2. Download person boxes (train, val, test) and put them in the annotations folder (see structure above). If you prefer to use your own person detector, please see details in here.

Download the ava dataset with the following structure:

```
ava
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]_000001.jpg
|  |  |_ [video name 0]_000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]_000001.jpg
|     |_ [video name 1]_000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
   |_ [official AVA annotation files]
   |_ ava_train_predicted_boxes.csv
   |_ ava_val_predicted_boxes.csv
```
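The skeleton of this layout can be created up front; a minimal sketch (the frames themselves are produced by step 3):

```bash
# Create the expected AVA directory skeleton.
mkdir -p ava/frames ava/frame_lists ava/annotations
```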

You can also replace v2.1 with v2.2 if you need the AVA v2.2 annotations. Some pre-prepared annotations can also be downloaded from here.

## Something-Something V2

1. Please download the dataset and annotations from the dataset provider.

2. Download the frame lists from the following links: (train, val).

3. Extract the frames at 30 FPS. (We used ffmpeg-4.1.3 with the command `ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}"` in our experiments; a sketch covering the whole dataset follows this list.) Please put the frames in a structure consistent with the frame lists.
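As a reference, frame extraction over the whole dataset can follow the same pattern as the AVA script above. This is a sketch: the input and output paths are placeholders, and it assumes the SSV2 videos are `.webm` files:

```bash
IN_DATA_DIR="../../data/ssv2/videos"   # placeholder path
OUT_DATA_DIR="../../data/ssv2/frames"  # placeholder path

mkdir -p "${OUT_DATA_DIR}"

for video in "${IN_DATA_DIR}"/*.webm
do
  video_name=${video##*/}
  video_name=${video_name%.webm}   # strip the extension

  out_video_dir="${OUT_DATA_DIR}/${video_name}"
  mkdir -p "${out_video_dir}"

  # Same command as used in the experiments: 30 fps, top-quality JPEGs.
  ffmpeg -i "${video}" -r 30 -q:v 1 "${out_video_dir}/${video_name}_%06d.jpg"
done
```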

Please put all annotation json files and the frame lists in the same folder, and set `DATA.PATH_TO_DATA_DIR` to that path. Set `DATA.PATH_PREFIX` to the path of the folder containing the extracted frames.
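In SlowFast-style codebases these options can typically be set in the yaml config or overridden on the command line. The following is a sketch only: the `tools/run_net.py` entry point, the config path, and the data paths are assumptions, not confirmed by this document.

```bash
# Sketch: override the data paths at launch time (paths are hypothetical).
python tools/run_net.py \
  --cfg configs/SSv2/config.yaml \
  DATA.PATH_TO_DATA_DIR ../../data/ssv2/annotations \
  DATA.PATH_PREFIX ../../data/ssv2/frames
```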

The detected boxes are available here. To use them, simply extract the file to `SSV2.DATA_ROOT`.

To use the compositional split (the Something-Else dataset), download the dataset split lists and extract them to `SSV2.SPLITS_ROOT`. Alternatively, you can download these files from https://github.com/joaanna/something_else.

## EpicKitchens 100

Follow the instructions from the dataset provider. An h5 file containing the processed bounding boxes will be generated the first time the dataloader is created. Alternatively, you can download the processed boxes file from here and place it in `data_cache/linked_boxes/epickitchens.h5` (see the sketch below).
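A minimal sketch of the placement step, assuming the downloaded file is named `epickitchens.h5` and sits in the current directory:

```bash
# Put the pre-processed boxes where the dataloader expects them.
mkdir -p data_cache/linked_boxes
mv epickitchens.h5 data_cache/linked_boxes/epickitchens.h5
```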