Reproducing no-filtering baseline commonpool_s_s13m_b4k #89

Open
dwahdany opened this issue Sep 17, 2024 · 0 comments

I'm trying to replicate the basic results without filtering for now. There are major differences between the models I trained locally and the ones on Hugging Face. The locally trained models have lower zero-shot accuracy, and with linear probing the gap is even larger (about 20% accuracy for the locally trained models vs. 44% for the Hugging Face model on CIFAR100).

| Dataset | Encoder | Zero-shot Test | Linear Probe Test |
| --- | --- | --- | --- |
| cifar10 | commonpool_s_s13m_b4k | 0.4077 | 0.685 ± 0.0014 |
| cifar10 | local_commonpool_s_s13m_b4k_0 | 0.3572 | 0.4694 ± 0.0106 |
| cifar10 | local_commonpool_s_s13m_b4k_1 | 0.3443 | 0.4565 ± 0.0143 |
| cifar10 | local_commonpool_s_s13m_b4k_3 | 0.3406 | 0.4609 ± 0.0126 |
| cifar10 | local_commonpool_s_s13m_b4k_4 | 0.3346 | 0.469 ± 0.0141 |
| cifar10 | local_commonpool_s_s13m_b4k_2 | 0.3323 | 0.4447 ± 0.0164 |
| vtab/cifar100 | commonpool_s_s13m_b4k | 0.1297 | 0.4355 ± 0.0025 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_1 | 0.1246 | 0.2024 ± 0.0035 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_0 | 0.1168 | 0.1997 ± 0.0085 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_3 | 0.1139 | 0.2004 ± 0.0066 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_2 | 0.1138 | 0.2002 ± 0.0043 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_4 | 0.1128 | 0.2047 ± 0.0044 |

  1. To my understanding, just calling `train.py --scale small` on the unmodified commonpool dataset should replicate the no-filter baseline commonpool_s_s13m_b4k. Is that right?
  2. I ran five different seeds for pretraining and, for each pretrained model, ten different seeds for linear probing. Why are the results so different from the online models?
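
To put a number on the discrepancy, here is a minimal sketch that averages the five local runs from the table above and subtracts them from the Hugging Face reference. The names `reference`, `local_runs`, and `summarize_gaps` are made up for illustration and are not part of the repository; the accuracies are copied from the table.

```python
from statistics import mean

# Accuracies copied from the table in this issue: (zero-shot, linear-probe mean).
reference = {
    "cifar10": (0.4077, 0.685),
    "vtab/cifar100": (0.1297, 0.4355),
}
local_runs = {
    "cifar10": [
        (0.3572, 0.4694), (0.3443, 0.4565), (0.3406, 0.4609),
        (0.3346, 0.469), (0.3323, 0.4447),
    ],
    "vtab/cifar100": [
        (0.1246, 0.2024), (0.1168, 0.1997), (0.1139, 0.2004),
        (0.1138, 0.2002), (0.1128, 0.2047),
    ],
}


def summarize_gaps(reference, local_runs):
    """Return, per dataset, the reference accuracy minus the mean local accuracy."""
    gaps = {}
    for dataset, (ref_zs, ref_lp) in reference.items():
        runs = local_runs[dataset]
        local_zs = mean([zs for zs, _ in runs])  # mean over pretraining seeds
        local_lp = mean([lp for _, lp in runs])
        gaps[dataset] = {
            "zero_shot_gap": ref_zs - local_zs,
            "linear_probe_gap": ref_lp - local_lp,
        }
    return gaps


gaps = summarize_gaps(reference, local_runs)
for dataset, gap in gaps.items():
    print(
        f"{dataset}: zero-shot gap {gap['zero_shot_gap']:.4f}, "
        f"linear-probe gap {gap['linear_probe_gap']:.4f}"
    )
```

Averaged over the five local seeds, the linear-probe gap is roughly 0.22 on cifar10 and 0.23 on vtab/cifar100, while the zero-shot gap is much smaller (about 0.07 and 0.01).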