Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation (ICLR'24)
Use this repo to reproduce our results and customize your own multi-GPU continual learning algorithms.
Abstract
- We propose and study a realistic Continual Learning (CL) setting where learning algorithms are granted a restricted computational budget per time step while training.
- We apply this setting to large-scale semi-supervised Continual Learning scenarios with a sparse label rate. Previously proficient CL methods perform very poorly in this challenging setting. Overfitting to the sparse labeled data and an insufficient computational budget are the two main culprits for this poor performance.
- We propose a simple but highly effective baseline, DietCL, which utilizes both unlabeled and labeled data jointly. DietCL meticulously allocates its computational budget across both types of data.
- We validate our baseline, at scale, on several datasets, e.g., CLOC, ImageNet10K, and CGLM, under a constrained-budget setup. DietCL outperforms, by a large margin, all existing supervised CL algorithms as well as more recent continual semi-supervised methods. Our extensive analysis and ablations demonstrate that DietCL is stable across the full spectrum of label sparsity, computational budgets, and various other settings.
Create and activate the conda environment:
conda env create -f environment.yml
conda activate dietcl
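You can optionally verify that the environment sees your GPUs before launching the multi-GPU commands below. This assumes the environment installs PyTorch with CUDA support:

```python
# Optional sanity check for the multi-GPU training commands used later.
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Visible GPUs:", torch.cuda.device_count())
```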
If you run into a library import problem, follow this issue to fix it.
Follow this link to download the pre-trained model.
To avoid repeated long pre-processing, and for stable results and faster reads, we suggest pre-processing the dataset once and saving it as folders of symbolic links. Follow these steps to prepare the dataset; an optional sanity-check sketch follows the steps.
- Download the ImageNet-21K v2 dataset from the official ImageNet website. We use the Winter 2021 release, i.e., the processed version of ImageNet-21K built with the script from "ImageNet-21K Pretraining for the Masses".
- Run the following scripts to prepare the dataset. We use three separate scripts to build the general ImageNet10K set, the task sequence, and the task label splits, for flexible usage.
python pre_process/get_unique_set.py --root21k /path/to/your/imagenet21k/folder
python pre_process/build_cl_tasks.py
python pre_process/label_split.py
- Run the following command to reproduce the results.
N_GPU=4
python main.py trainer@_global_=diet dataset@_global_=imagenet10k \
n_gpu_per_node=${N_GPU} \
data_root=/path/to/your/imagenet21k/folder
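Before launching training, you can optionally sanity-check the prepared split. The sketch below assumes the pre-processing scripts write per-task folders of symbolic links under the data root; the task_* folder names are an assumption, so adjust them to the layout the scripts actually produce on your machine.

```python
# Hypothetical sanity check: count symlinked images per task folder.
# The task_* naming is an assumption; match it to the output of
# pre_process/build_cl_tasks.py and pre_process/label_split.py.
from pathlib import Path

root = Path("/path/to/your/imagenet21k/folder")
for task_dir in sorted(root.glob("task_*")):
    n_links = sum(1 for p in task_dir.rglob("*") if p.is_symlink())
    print(f"{task_dir.name}: {n_links} symlinked images")
```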
- Download the CGLM dataset using the following script.
bash pre_process/download_cglm.sh
- Download the CLOC dataset from the official CLOC repository.
- Run the following command to split the dataset (use cglm for CGLM or cloc for CLOC).
python pre_process/cglm|cloc.py --root /path/to/your/cglm|cloc/folder
- Run the following command to reproduce the results. Here, `data_path` refers to the path to the split files, and `data_root` refers to the path to the image files. A small driver for running both datasets is sketched after the command.
N_GPU=4
python main.py trainer@_global_=diet dataset@_global_=cglm|cloc \
n_gpu_per_node=${N_GPU} \
data_root=/path/to/your/cglm|cloc/folder data_path=/path/to/your/cglm|cloc/split/file/folder
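The cglm|cloc notation means you pick one of the two datasets per run. If you want to reproduce both, a small driver like the sketch below launches the same command for each dataset; the paths are placeholders for your local folders.

```python
# Hypothetical convenience driver that runs the CGLM and CLOC experiments back to back.
# Replace the placeholder paths with your local dataset locations.
import subprocess

N_GPU = 4
runs = {
    "cglm": ("/path/to/your/cglm/folder", "/path/to/your/cglm/split/file/folder"),
    "cloc": ("/path/to/your/cloc/folder", "/path/to/your/cloc/split/file/folder"),
}

for dataset, (data_root, data_path) in runs.items():
    subprocess.run(
        [
            "python", "main.py",
            "trainer@_global_=diet",
            f"dataset@_global_={dataset}",
            f"n_gpu_per_node={N_GPU}",
            f"data_root={data_root}",
            f"data_path={data_path}",
        ],
        check=True,
    )
```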
- Supervised continual learning with experience replay (a sketch of the two sampling modes follows the commands):
# Mix the current set and the buffer; sample uniformly from the buffer and the current task.
python main.py trainer@_global_=base sampling=uniform replay_before=True
# Balanced sampling from the buffer and the current labeled set in each batch.
python main.py trainer@_global_=base sampling=batchmix
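For intuition, the two options differ in how each training batch is drawn: sampling=uniform pools the buffer with the current labeled set and draws uniformly from the union, while sampling=batchmix draws half of every batch from the buffer and half from the current task. The snippet below is a simplified illustration of that difference, not the repo's actual implementation:

```python
# Simplified illustration of the two replay sampling strategies (not the repo's code).
import random

def uniform_batch(buffer, current, batch_size):
    # sampling=uniform: pool the buffer and the current task, draw uniformly.
    pool = buffer + current
    return random.sample(pool, batch_size)

def batchmix_batch(buffer, current, batch_size):
    # sampling=batchmix: half of each batch from the buffer, half from the current task.
    half = batch_size // 2
    return random.sample(buffer, half) + random.sample(current, batch_size - half)
```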
- Continual pre-training and finetuning: pre-train and finetune for each task (a conceptual sketch follows the command).
python main.py trainer@_global_=pretrain
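This baseline runs two phases for every task: a pre-training phase followed by a finetuning phase. The sketch below only illustrates that per-task control flow; the helper functions are hypothetical stubs, and the assumption that pre-training uses the unlabeled stream while finetuning uses the sparse labels follows the setting described in the abstract.

```python
# Conceptual control flow of the continual pre-train-then-finetune baseline.
# The helpers are hypothetical stubs, not the repo's trainer API.

def pretrain_on_unlabeled(model, unlabeled_data):
    pass  # placeholder: pre-train on the task's unlabeled data within the budget

def finetune_on_labeled(model, labeled_data):
    pass  # placeholder: finetune on the task's sparse labels

def continual_pretrain_finetune(model, tasks):
    for task in tasks:
        pretrain_on_unlabeled(model, task["unlabeled"])  # phase 1
        finetune_on_labeled(model, task["labeled"])      # phase 2
    return model
```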
Customize your own multi-GPU continual learning algorithms with this repo:
- Write your own dataset and put it in the `datasets` folder, following this template (a toy example follows it):
class YourDataset(Dataset):
    def __init__(self, args):
        pass

    def get_new_classes(self, task_id):
        # Needed so the trainer can adjust the model's classification head for new classes.
        pass

    def get_labeled_set(self, task):
        pass

    def get_eval_set(self, task, per_task_eval):
        # per_task_eval controls whether all tasks are evaluated at once
        # (for efficiency) or each task is evaluated separately.
        pass
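For illustration, here is a minimal toy implementation of the template that returns random tensors. It assumes the base class is torch.utils.data.Dataset; the class counts, shapes, and sizes are made up and only show where your own loading logic would go.

```python
# Toy example of the dataset template (hypothetical data, for illustration only).
import torch
from torch.utils.data import Dataset, TensorDataset

class ToyCLDataset(Dataset):
    def __init__(self, args):
        self.n_tasks = 5
        self.classes_per_task = 10

    def get_new_classes(self, task_id):
        # Class ids introduced at this task, used to grow the classification head.
        start = task_id * self.classes_per_task
        return list(range(start, start + self.classes_per_task))

    def get_labeled_set(self, task):
        # Sparse labeled subset of the current task (random tensors as stand-ins).
        x = torch.randn(64, 3, 224, 224)
        y = torch.randint(task * self.classes_per_task, (task + 1) * self.classes_per_task, (64,))
        return TensorDataset(x, y)

    def get_eval_set(self, task, per_task_eval):
        # Evaluation set of this task alone, or of all tasks seen so far.
        n = 32 if per_task_eval else 32 * (task + 1)
        x = torch.randn(n, 3, 224, 224)
        y = torch.randint(0, (task + 1) * self.classes_per_task, (n,))
        return TensorDataset(x, y)
```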
- Write your own trainer and put it in the `trainers` folder with the template (see the skeleton sketched after this list).
- Write your own model and put it in the `models` folder.
- If Slurm is used, please make sure to allocate enough CPUs and CPU memory.
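If you also write a trainer, a bare-bones skeleton might look like the sketch below. The constructor arguments and method names are hypothetical, so mirror the actual templates in the `trainers` and `models` folders.

```python
# Hypothetical trainer skeleton (the real interface is defined by the templates in trainers/).
class YourTrainer:
    def __init__(self, model, dataset, args):
        self.model = model
        self.dataset = dataset
        self.args = args

    def train_task(self, task_id):
        # Grow the classification head for the new classes, then run the
        # per-task optimization loop within the computational budget.
        new_classes = self.dataset.get_new_classes(task_id)
        labeled_set = self.dataset.get_labeled_set(task_id)
        raise NotImplementedError

    def evaluate(self, task_id, per_task_eval=True):
        eval_set = self.dataset.get_eval_set(task_id, per_task_eval)
        raise NotImplementedError
```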
We thank the authors of the following repositories for their great work:
If you find this work useful, please consider citing our paper:
@inproceedings{zhang2024continual,
title={Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation},
author={Zhang, Wenxuan and Mohamed, Youssef and Ghanem, Bernard and Torr, Philip and Bibi, Adel and Elhoseiny, Mohamed},
booktitle={International Conference on Learning Representations},
year={2024}
}