Adding new tasks to ViLCo consists of four main components:
- Metadata: generate metadata that splits the data into continual learning sub-tasks, following the template in `scripts/split_mq.py`.
- Data processing: refactor `QILSetTask()` in `libs/datasets/cl_benchmark.py` for each task, including the task-specific dataset class in `libs/datasets/ego4d.py`.
- Task trainer: change the `train_one_epoch` function in `libs/utils/train_utils.py`, including the corresponding metric in `libs/utils/metric.py`.
- Configuration: add the task configuration and parameters to `configs/`.
i. In `libs/datasets/ego4d.py`, you have to create a task-specific dataset class, such as `Ego4dCLDataset`, focusing mainly on its `_load_json_db()` and `__getitem__()` methods.
`Ego4dCLDataset`: This is a `torch.utils.data.Dataset` class. The `__init__()` method can take whatever arguments you want, but should accept at least these three:
- `current_task_data`: the subset of data for the current task
- `feat_folder`: pre-extracted feature folder
- `text_feat_folder`: pre-extracted query feature folder
- Optional: `narration_feat_folder`, a pre-extracted narration feature folder for SSL.
The `Ego4dCLDataset` class must have a `__getitem__()` method that returns the text and video feature inputs and the output label in the form of a dictionary.
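The interface described above can be sketched as follows. This is a minimal illustration, not the ViLCo implementation: the class name `ToyCLDataset`, the per-sample keys (`video_id`, `query_id`, `label`), the output keys, and the one-JSON-file-per-item feature format are all assumptions made so the example is self-contained.

```python
import json
import os

try:
    from torch.utils.data import Dataset
except ImportError:  # lets the sketch run where PyTorch is not installed
    Dataset = object


class ToyCLDataset(Dataset):
    """Hypothetical stand-in for a task dataset such as Ego4dCLDataset."""

    def __init__(self, current_task_data, feat_folder, text_feat_folder,
                 narration_feat_folder=None):
        # current_task_data: samples belonging to the current sub-task.
        # The keys read from each sample below are assumptions.
        self.samples = list(current_task_data)
        self.feat_folder = feat_folder
        self.text_feat_folder = text_feat_folder
        self.narration_feat_folder = narration_feat_folder  # optional, for SSL

    @staticmethod
    def _load_feature(folder, name):
        # Assumed storage format: one JSON list of floats per item.
        with open(os.path.join(folder, name + ".json")) as f:
            return json.load(f)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        # Return inputs and label together as a dictionary.
        item = {
            "video_feats": self._load_feature(self.feat_folder, sample["video_id"]),
            "text_feats": self._load_feature(self.text_feat_folder, sample["query_id"]),
            "label": sample["label"],
        }
        if self.narration_feat_folder is not None:
            item["narration_feats"] = self._load_feature(
                self.narration_feat_folder, sample["video_id"])
        return item
```

A real dataset class would typically return tensors rather than Python lists and would build its sample list in `_load_json_db()`.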
ii. In `libs/datasets/cl_benchmark.py`, `QILSetTask()`: This is a class that assigns sub-tasks from the metadata. It holds all sub-tasks and assigns them to the model step by step.
The methods you should change:
- `__next__`: determines which sub-task is used in the next step.
- `get_valSet_by_taskNum`: selects the validation set by task number, so that metrics can be evaluated within continual learning.
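The two methods can be pictured with the sketch below. It is a simplified illustration under stated assumptions, not the actual `QILSetTask` code: the class name `ToyTaskStream`, the metadata format (one list of labels per sub-task), and the choice to validate on all sub-tasks seen so far are hypothetical.

```python
class ToyTaskStream:
    """Hypothetical iterator that hands out one sub-task at a time."""

    def __init__(self, task_splits, train_data, val_data):
        # task_splits: list of label lists, one per sub-task (assumed format).
        self.task_splits = [list(split) for split in task_splits]
        self.train_data = list(train_data)
        self.val_data = list(val_data)
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Decide which sub-task comes next and return its training subset.
        if self.current >= len(self.task_splits):
            raise StopIteration
        labels = set(self.task_splits[self.current])
        subset = [s for s in self.train_data if s["label"] in labels]
        task_id = self.current
        self.current += 1
        return task_id, subset

    def get_valSet_by_taskNum(self, num_tasks):
        # Assumption: evaluate on every sub-task among the first num_tasks.
        seen = set().union(*self.task_splits[:num_tasks])
        return [s for s in self.val_data if s["label"] in seen]
```

Evaluating on all sub-tasks seen so far is one common way to measure forgetting; a different benchmark might validate on the current sub-task only.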
In `libs/utils/train_utils.py`, you have to change the `train_one_epoch` function. The trainer controls the whole training process and the information printed during training within one epoch.
You should also create a new evaluation function, such as `final_validate`, to calculate metrics.
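As a rough picture of how these two functions divide the work, here is a framework-free sketch. The signatures are hypothetical and do not match those in `libs/utils/train_utils.py`: `step_fn` stands for whatever performs the forward pass, backward pass, and optimizer step and returns the loss, and plain accuracy is used only as a placeholder metric.

```python
def train_one_epoch(loader, step_fn, epoch, print_freq=10):
    """Run step_fn on every batch, log progress, and return the mean loss."""
    total, n = 0.0, 0
    for i, batch in enumerate(loader, start=1):
        loss = float(step_fn(batch))  # one optimization step (assumed callable)
        total += loss
        n += 1
        if i % print_freq == 0:
            print(f"epoch {epoch} iter {i}: loss {loss:.4f} (avg {total / n:.4f})")
    return total / max(n, 1)


def final_validate(loader, predict_fn):
    """Compute a placeholder accuracy metric over the validation batches."""
    correct = seen = 0
    for batch in loader:
        for pred, label in zip(predict_fn(batch), batch["label"]):
            correct += int(pred == label)
            seen += 1
    return {"accuracy": correct / max(seen, 1)}
```

For a new task, the body of `final_validate` is where the task's own metric from `libs/utils/metric.py` would be called instead of accuracy.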
In `configs/mq_vilco.yaml`, you need to create a dictionary containing the following keys:
- `dataset_name`: name of the dataloader
- `dataset`: all information about this task's data, including `json_file`, `feat_folder`, etc.
- `model`: the specific architecture for the visual/textual encoders and the cross-modal encoder
- `opt`: learning hyperparameters: `num_epochs`, `lr`, `weight_decay`
- `train_cfg`: training configuration
- `cl_cfg`: continual learning method configuration
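To make the expected layout concrete, the sketch below mirrors the keys listed above as a Python dictionary and checks that none is missing. Every value is a placeholder chosen for illustration, not a setting taken from `configs/mq_vilco.yaml`, and the helper `missing_keys` is hypothetical.

```python
# Top-level keys the configuration is expected to contain.
REQUIRED_KEYS = ("dataset_name", "dataset", "model", "opt", "train_cfg", "cl_cfg")

# Placeholder values only; real settings depend on the task.
example_cfg = {
    "dataset_name": "my_task",
    "dataset": {
        "json_file": "path/to/annotations.json",
        "feat_folder": "path/to/features",
    },
    "model": {},
    "opt": {"num_epochs": 10, "lr": 1e-4, "weight_decay": 0.05},
    "train_cfg": {},
    "cl_cfg": {},
}


def missing_keys(cfg):
    """Return the required top-level keys that cfg does not define."""
    return [key for key in REQUIRED_KEYS if key not in cfg]
```

Running such a check right after loading the YAML file catches a forgotten section before training starts.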