To reproduce the results in the paper, the datasets must be ingested correctly. We distinguish two dataset modalities: video-based and image-based.
We use the GluonCVMotionDataset class to represent each video-based dataset, for example MOT17, TAO-person, and AOT.
To convert an original video dataset to the GluonCVMotionDataset format, we provide ingestion scripts in the data/ingestion folder. Please follow the examples there to ingest the video datasets.
To ingest your own dataset, organize it in the following structure:
|-- raw_data
All artifacts related to the raw dataset go in the raw_data folder. After ingestion, the dataset is expected to have the following structure:
|-- annotation
    |-- anno.json
    |-- splits.json
|-- cache
|-- raw_data
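As a quick sanity check after ingestion, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes that anno.json and splits.json sit inside the annotation folder and that the cache folder already exists after ingestion.

```python
from pathlib import Path

# Entries expected under dataset_root after ingestion.
# Assumption: anno.json and splits.json live inside annotation/.
EXPECTED = (
    "annotation/anno.json",
    "annotation/splits.json",
    "cache",
    "raw_data",
)


def missing_entries(dataset_root):
    """Return the expected relative paths that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("/path/to/dataset_root")` returns an empty list when every expected entry is present; otherwise it lists what still needs to be created or ingested.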
We also provide the following ingested datasets (the anno.json and splits.json files are provided). Please make sure that cfg.DATASETS.ROOT_DIR in the configuration points to dataset_root.
- MOT17: MOT17 videos with all 3 sets of detections (DPM, FRCNN, SDP). Ingested annotation
- MOT17_DPM: MOT17 videos with DPM detections. Ingested annotation
- TAO: TAO-person dataset
- CRP: Caltech Roadside Pedestrians dataset
- AOT: AOT dataset for airborne object detection and tracking
To train with the ingested datasets above, the raw videos need to be downloaded from the original data page and extracted into raw_data.
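Because the provided annotations do not include the videos themselves, it can help to confirm that each raw_data folder has actually been populated before launching training. The sketch below is illustrative only: the function name is hypothetical, and it assumes each dataset lives in its own folder directly beneath the directory that cfg.DATASETS.ROOT_DIR points to, which may not match your setup.

```python
from pathlib import Path


def datasets_missing_raw_data(root_dir, dataset_dirs):
    """Return the dataset folders whose raw_data directory is absent or empty.

    Assumption: each dataset is stored at <root_dir>/<dataset_dir>/raw_data.
    """
    root = Path(root_dir)
    missing = []
    for name in dataset_dirs:
        raw = root / name / "raw_data"
        # A dataset counts as missing if raw_data does not exist or has no entries.
        if not raw.is_dir() or not any(raw.iterdir()):
            missing.append(name)
    return missing
```

For example, `datasets_missing_raw_data("/path/to/root", ["MOT17", "TAO"])` returns the subset of those folder names that still need their videos downloaded and extracted; the folder names here are placeholders.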
We use the COCO class to represent each image-based dataset, for example COCO17 and CrowdHuman. Please follow the example to ingest the image-based datasets.
We provide the following ingested person detection datasets:
- COCO17_person_train Ingested annotation
- CrowdHuman_fbox_train Ingested annotation
- CrowdHuman_vbox_train Ingested annotation