Can't Reproduce Zero-Shot Performance on MSRVTT and LSMDC with InternVid-10M-FLT Checkpoint #139

fmthoker · 2024-06-17T09:26:30Z

Dear Authors,
I am trying to reproduce the zero-shot performance with the ViCLIP-L-14 InternVid-10M-FLT checkpoint.
However, the results differ from the numbers reported in the paper. Here are the results I obtain:

MSRVTT:
                     txt_r1  txt_r5  txt_r10  txt_r_mean  img_r1  img_r5  img_r10  img_r_mean  r_mean
msrvtt_1k_test/      38.9    62.2    74.0     58.37       39.4    61.9    73.0     58.10       58.23
msrvtt_1k_test_emb/  39.0    62.2    73.3     58.17       39.1    63.2    73.9     58.73       58.45

LSMDC:

           txt_r1  txt_r5  txt_r10  txt_r_mean  img_r1  img_r5  img_r10  img_r_mean  r_mean
test/      15.2    29.0    35.6     26.6        17.8    32.1    40.1     30.00       28.30
test_emb/  15.8    29.1    36.7     27.2        18.5    32.7    40.8     30.67       28.93
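
For readers comparing these rows against the paper, it may help to see how the mean columns relate to the recall columns. The sketch below is not taken from the repository; it only assumes, based on the column names, that `txt_r_mean` and `img_r_mean` are the averages of R@1, R@5 and R@10 in each direction and that `r_mean` is the average of those two. The posted numbers are consistent with that assumption. The helper name `mean_recalls` is hypothetical.

```python
def mean_recalls(txt, img):
    """Average R@1/R@5/R@10 per direction, then average the two directions.

    Assumption: this mirrors how the *_r_mean and r_mean columns are derived;
    it is inferred from the posted tables, not from the evaluation code.
    """
    txt_mean = sum(txt) / len(txt)
    img_mean = sum(img) / len(img)
    overall = (txt_mean + img_mean) / 2
    return round(txt_mean, 2), round(img_mean, 2), round(overall, 2)


# msrvtt_1k_test/ row from the table above
msrvtt_test = mean_recalls([38.9, 62.2, 74.0], [39.4, 61.9, 73.0])
print(msrvtt_test)  # -> (58.37, 58.1, 58.23)
```

Only the R@1, R@5 and R@10 columns carry independent information, so those are the ones to compare against the paper.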

Here is the script I run to obtain these results:

source /ibex/user/thokerfm/anaconda3/bin/activate viclip
export PYTHONPATH=.

MASTER_NODE=$(scontrol show hostnames $SLURM_JOB_NODELIST | head -n 1)
MASTER_PORT=$((RANDOM % (65535 - 1024 + 1) + 1024))

echo $MASTER_NODE
echo $MASTER_PORT

OUTPUT_DIR='expirements_zero_shot/ViClip-InternVid-10M-FLT/lsmdc/'

OMP_NUM_THREADS=1 torchrun --rdzv_endpoint=${MASTER_NODE}:${MASTER_PORT} \
    --nnodes=1 \
    --nproc_per_node=4 \
    --rdzv_backend=c10d \
    tasks/retrieval.py \
    $(dirname $0)/config.py \
    wandb.enable False \
    train_corpus viclip \
    evaluate True \
    output_dir ${OUTPUT_DIR} \
    model.vision_encoder.pretrained 'CLIP-ViT-L/14' \
    model.text_encoder.pretrained 'CLIP-ViT-L/14' \
    pretrained_path pretrained_viclip_models/ViClip-InternVid-10M-FLT.pth

leexinhao · 2024-06-26T04:12:13Z

I guess you didn't turn on wise-ft. At test time, we average the weights trained on InternVid-10M-FLT (the filtered subset) with the original CLIP weights.
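
The weight averaging described in this reply can be sketched roughly as follows. This is a minimal illustration of weight-space interpolation between two checkpoints, not the repository's implementation. The function name `interpolate_weights`, the toy parameter names, and the mixing coefficient `alpha=0.5` (a plain average) are all assumptions; the actual coefficient and the set of parameters that get averaged are not stated in this thread.

```python
def interpolate_weights(zero_shot, fine_tuned, alpha=0.5):
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned for every parameter.

    Works for any values that support * and +, e.g. floats, NumPy arrays
    or torch tensors. alpha=0.5 is an assumed default (a plain average).
    """
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: (1.0 - alpha) * zero_shot[name] + alpha * fine_tuned[name]
        for name in zero_shot
    }


# Toy example with scalar "parameters" standing in for real tensors.
clip_weights = {"proj.weight": 1.0, "proj.bias": 0.0}    # hypothetical original CLIP
viclip_weights = {"proj.weight": 3.0, "proj.bias": 2.0}  # hypothetical fine-tuned
merged = interpolate_weights(clip_weights, viclip_weights)
print(merged)  # -> {'proj.weight': 2.0, 'proj.bias': 1.0}
```

With `alpha=1.0` the result equals the fine-tuned checkpoint, which corresponds to evaluating without any averaging.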

fmthoker · 2024-06-26T08:37:43Z

@leexinhao Thanks for the reply. After evaluating with wise ft = True, the results are indeed better:

MSRVTT:
                     txt_r1  txt_r5  txt_r10  txt_r_mean  img_r1  img_r5  img_r10  img_r_mean  r_mean
msrvtt_1k_test/      42.0    65.7    75.3     61.0        41.9    66.5    75.6     61.33       61.17
msrvtt_1k_test_emb/  42.8    66.8    75.5     61.7        42.8    67.2    75.5     61.83       61.77

LSMDC:
           txt_r1  txt_r5  txt_r10  txt_r_mean  img_r1  img_r5  img_r10  img_r_mean  r_mean
test/      16.4    32.0    39.3     29.23       18.8    36.1    43.5     32.80       31.02
test_emb/  17.9    33.2    40.8     30.63       18.7    36.9    44.5     33.37       32.00

Can you please confirm which numbers are reported in the paper (test or test_emb)?

fmthoker mentioned this issue Jun 17, 2024

LSMDC Annotation Files #137

Closed
