
Commit bca0cb3

release
0 parents  commit bca0cb3

70 files changed

+237327
-0
lines changed

.gitignore

+249
@@ -0,0 +1,249 @@
**/wandb/
fine-tune/data/
*.ckpt
*.pdf
*.png
*.pyc

# Created by https://www.toptal.com/developers/gitignore/api/linux,macos,python,visualstudiocode
# Edit at https://www.toptal.com/developers/gitignore?templates=linux,macos,python,visualstudiocode

### Linux ###
*~

# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*

# KDE directory preferences
.directory

# Linux trash folder which might appear on any partition or disk
.Trash-*

# .nfs files are created when an open file is removed but is still being accessed
.nfs*

### macOS ###
# General
.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

### macOS Patch ###
# iCloud generated files
*.icloud

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

### Python Patch ###
# Poetry local configuration file - https://python-poetry.org/docs/configuration/#local-configuration
poetry.toml

# ruff
.ruff_cache/

# LSP config files
pyrightconfig.json

### VisualStudioCode ###
.vscode/*
!.vscode/settings.json
!.vscode/tasks.json
!.vscode/launch.json
!.vscode/extensions.json
!.vscode/*.code-snippets

# Local History for Visual Studio Code
.history/

# Built Visual Studio Code Extensions
*.vsix

### VisualStudioCode Patch ###
# Ignore all local history of files
.history
.ionide

# End of https://www.toptal.com/developers/gitignore/api/linux,macos,python,visualstudiocode

LICENSE

+16
@@ -0,0 +1,16 @@
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies
or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

README.md

+17
@@ -0,0 +1,17 @@
## Overview

<p align="center">
<img src="method.png"/>
</p>

This is the official repository for the paper:
> **[Diversified in-domain synthesis with efficient fine-tuning for few-shot classification](arxivlink)**<br>
> [Victor G. Turrisi da Costa*](https://scholar.google.com/citations?user=UQctXiEAAAAJ&hl=en&oi=ao), [Nicola Dall'Asen*](https://scholar.google.com/citations?user=e7lgiYYAAAAJ&hl), [Yiming Wang](https://scholar.google.co.uk/citations?user=KBZ3zrEAAAAJ&hl=en), [Nicu Sebe](https://scholar.google.com/citations?user=tNtjSewAAAAJ&hl=en) and [Elisa Ricci](https://scholar.google.com/citations?user=xf1T870AAAAJ&hl=en). <br>

The code is divided into two main parts: one for fine-tuning a pre-trained model
in the few-shot scenario (`fine-tune`) and one for generating synthetic data
to enhance fine-tuning (`generation`).

Each part has its own README, `fine-tune/README.md` and `generation/README.md`, with additional details about installation, code organization and execution.

## Citation

fine-tune/README.md

+62
@@ -0,0 +1,62 @@
# Overview
This is the source code for fine-tuning a VLM with DISEF.
It contains all the utility functions for data loading, training, logging, etc.
It also contains other baselines such as VPT, TPT, VPT + TPT and Classifier Tuning.

## Structure
- `artifacts` contains extra information about individual datasets.
- `configs` contains the base YAML configuration files for the experiments.
- `data` should contain all the downloaded datasets.
- `scripts` contains example bash scripts for running different experiments.
- `slurm_scripts` contains the scripts for launching many experiments with slurm.
- `src` contains the source code for:
    - DISEF, with and without synthetic data, VPT, TPT, VPT + TPT and Classifier Tuning.
    - All dataloaders for the different benchmark datasets.
    - Utility functions.
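
As a quick sanity check of the layout listed above, the sketch below reports which of those directories are present. It assumes it is run from inside `fine-tune`; the helper name `check_layout` is ours and is not part of the repository. Since `fine-tune/data/` is git-ignored, `data` will likely be reported as missing on a fresh clone until you create it.

```shell
#!/usr/bin/env bash
# Report which of the directories listed under "Structure" exist below a root.
# `check_layout` is a hypothetical helper, not part of the repository.
check_layout() {
    root="${1:-.}"
    missing=0
    for dir in artifacts configs data scripts slurm_scripts src; do
        if [ -d "$root/$dir" ]; then
            echo "ok       $dir"
        else
            echo "missing  $dir"
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status if any directory is missing.
    [ "$missing" -eq 0 ]
}

check_layout "${1:-.}" || echo "Some directories are missing (data must be created manually)."
```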


## Downloading the data

Download the individual datasets by following their respective instructions and move them to `data`.
Our dataloaders also use the extra dataset info files in `artifacts`, which are already provided.

- DTD: [here](https://www.robots.ox.ac.uk/~vgg/data/dtd/)
- EuroSAT: [here](https://github.com/phelber/EuroSAT)
- Oxford-IIIT Pet: [here](https://www.robots.ox.ac.uk/~vgg/data/pets/)
- Stanford Cars: [here](https://www.kaggle.com/datasets/jessicali9530/stanford-cars-dataset)
- FGVC Aircraft: [here](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/)
- Caltech101: [here](https://data.caltech.edu/records/mzrjq-6wc02)
- Food101: [here](https://www.kaggle.com/dansbecker/food-101)
- Flowers102: [here](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/)
- SUN397: [here](https://vision.princeton.edu/projects/2010/SUN/)
- ImageNet: [here](https://www.kaggle.com/c/imagenet-object-localization-challenge/data)


## Installation

For installation, we suggest using conda to keep a clean environment:
```
conda create --name disef python=3.9
conda activate disef
pip3 install -r requirements.txt
```
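
After activating the environment, it can be worth confirming that the `python3` on your `PATH` is the 3.9 interpreter that conda created. A minimal sketch follows; the helper name `is_python39` is ours and is not part of the repository.

```shell
# Succeeds if a `python3 --version` style string reports Python 3.9.x.
# `is_python39` is a hypothetical helper, not part of the repository.
is_python39() {
    case "$1" in
        "Python 3.9."*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v python3 >/dev/null 2>&1; then
    version="$(python3 --version 2>&1)"
    if is_python39 "$version"; then
        echo "Using $version"
    else
        echo "Expected Python 3.9.x but found: $version"
    fi
else
    echo "python3 not found on PATH"
fi
```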

## Running

Example scripts are available in `scripts`. Change the parameters there and run, for example:
```
bash scripts/disef.sh
```

## Running with Slurm

To launch more experiments, we suggest leveraging slurm.
For that, `slurm_scripts` provides all the scripts for running grid search on all datasets and methods.

To launch any experiment, first set the correct slurm options in the script header according to your cluster.
After that, set the correct paths for the synthetic data, if you are using it
(refer to `generation/README.md` for how to generate the data).
Then, launch as:
```
sbatch slurm_scripts/disef.sh
```
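
For readers new to slurm, the "options in the header" are `#SBATCH` directives at the top of the script. The sketch below shows what such a header might look like. Every value in it (job name, partition, resources, time limit) is a placeholder we made up for illustration and is not taken from the repository's scripts, so check your cluster's documentation for the right settings.

```shell
#!/bin/bash
# Illustrative header only: all values are placeholders to adapt to your cluster.
# The partition name "gpu" in particular is hypothetical.
#SBATCH --job-name=disef
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=24:00:00
# In the output file name, %x expands to the job name and %j to the job id.
#SBATCH --output=%x-%j.out

# SLURM_JOB_ID is set by slurm inside a job; fall back to "local" otherwise.
job_id="${SLURM_JOB_ID:-local}"
echo "Running job ${job_id} on $(hostname)"
```

Comments are kept on their own lines rather than after the directives, which avoids any ambiguity in how trailing text on a `#SBATCH` line is parsed.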
