Merged
Changes from 134 commits
Commits
218 commits
1acdd5c
Implement Trainer & TrainingArguments w. tests
tomaarsen Jan 11, 2023
89f4435
Readded support for hyperparameter tuning
tomaarsen Jan 11, 2023
5f2a6b3
Remove unused imports and reformat
tomaarsen Jan 11, 2023
622f33b
Preserve desired behaviour despite deprecation of keep_body_frozen pa…
tomaarsen Jan 11, 2023
ff59154
Ensure that DeprecationWarnings are displayed
tomaarsen Jan 11, 2023
3b4ef58
Set Trainer.freeze and Trainer.unfreeze methods normally
tomaarsen Jan 11, 2023
fd68274
Add TrainingArgument tests for num_epochs, batch_sizes, lr
tomaarsen Jan 11, 2023
14602ea
Convert trainer.train arguments into a softer deprecation
tomaarsen Jan 11, 2023
94106cc
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Jan 22, 2023
a39e772
Merge branch 'refactor_v2' of https://github.com/tomaarsen/setfit; br…
tomaarsen Jan 23, 2023
9fc55a6
Use body/head_learning_rate instead of classifier/embedding_learning_…
tomaarsen Jan 23, 2023
7d4ad00
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Jan 23, 2023
aab2377
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 6, 2023
dee70b1
Reformat according to the newest black version
tomaarsen Feb 6, 2023
fb6547d
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 6, 2023
abbbb03
Remove "classifier" from var names in SetFitHead
tomaarsen Feb 6, 2023
12d326e
Update DeprecationWarnings to include timeline
tomaarsen Feb 6, 2023
70c0295
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 6, 2023
fc246cc
Convert training_argument imports to relative imports
tomaarsen Feb 6, 2023
57aa54f
Make conditional explicit
tomaarsen Feb 6, 2023
7ebdf93
Make conditional explicit
tomaarsen Feb 6, 2023
4695293
Use assertEqual rather than assert
tomaarsen Feb 6, 2023
4c6d0fd
Remove training_arguments from test func names
tomaarsen Feb 6, 2023
5937ec2
Replace loss_class on Trainer with loss on TrainArgs
tomaarsen Feb 6, 2023
f1e3de9
Removed dead class argument
tomaarsen Feb 6, 2023
6051095
Move SupConLoss to losses.py
tomaarsen Feb 6, 2023
bddd46a
Add deprecation to Trainer.(un)freeze
tomaarsen Feb 7, 2023
fa8a077
Prevent warning from always triggering
tomaarsen Feb 7, 2023
85a3684
Export TrainingArguments in __init__
tomaarsen Feb 7, 2023
ca625a2
Update & add important missing docstrings
tomaarsen Feb 7, 2023
868d7b7
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 7, 2023
68e9094
Use standard dataclass initialization for SetFitModel
tomaarsen Feb 8, 2023
19a6fc8
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 15, 2023
0b2efa1
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Feb 15, 2023
ca87c42
Remove duplicate space in DeprecationWarning
tomaarsen Feb 16, 2023
cc5282f
No longer require labeled data for DistillationTrainer
tomaarsen Mar 3, 2023
c6f5782
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Mar 3, 2023
36cbbfe
Update docs for v1.0.0
tomaarsen Mar 6, 2023
deb57ff
Remove references of SetFitTrainer
tomaarsen Mar 6, 2023
46922d5
Update expected test output
tomaarsen Mar 6, 2023
f43d5b2
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Apr 19, 2023
b0f9f58
Remove unused pipeline
tomaarsen Apr 19, 2023
339f332
Execute deprecations
tomaarsen Apr 19, 2023
9e0bf78
Stop importing now-removed function
tomaarsen Apr 19, 2023
ecabbcf
Initial setup for logging & callbacks
tomaarsen Jul 6, 2023
6e6720b
Move sentence-transformer training into trainer.py
tomaarsen Jul 6, 2023
826eb53
Add checkpointing, support EarlyStoppingCallback
tomaarsen Jul 28, 2023
019a971
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Jul 29, 2023
1930973
Run formatting
tomaarsen Jul 29, 2023
e4f3f76
Merge branch 'refactor_v2' of https://github.com/tomaarsen/setfit int…
tomaarsen Jul 29, 2023
0f66109
Merge pull request #4 from tomaarsen/feat/logging_callbacks
tomaarsen Jul 29, 2023
a87cdc0
Add additional trainer tests
tomaarsen Jul 29, 2023
d418759
Use isinstance, required by flake8 release from 1hr ago
tomaarsen Jul 29, 2023
08892f6
sampler for refactor WIP
danstan5 Sep 14, 2023
0a2b664
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Oct 17, 2023
429de0f
Merge branch 'refactor_v2' of https://github.com/tomaarsen/setfit int…
tomaarsen Oct 17, 2023
173f084
Run formatters
tomaarsen Oct 17, 2023
c23959a
Remove tests from modeling.py
tomaarsen Oct 17, 2023
0fa3870
Add missing type hint
tomaarsen Oct 17, 2023
3969f38
Adjust test to still pass if W&B/Tensorboard are installed
tomaarsen Oct 17, 2023
567f1c9
Merge branch 'refactor_v2' of https://github.com/tomaarsen/setfit int…
tomaarsen Oct 17, 2023
851f0bb
The log/eval/save steps should be saved on the state instead
tomaarsen Oct 17, 2023
67ddedc
Merge branch 'refactor_v2' of https://github.com/tomaarsen/setfit int…
tomaarsen Oct 17, 2023
d37ee09
sampler logic fix "unique" strategy
danstan5 Oct 19, 2023
0ef8837
add sampler tests (not complete)
danstan5 Oct 19, 2023
131aa26
add sampling_strategy into TrainingArguments
danstan5 Oct 19, 2023
c6c6228
Merge branch 'refactor-sampling' of https://github.com/danstan5/setfi…
danstan5 Oct 19, 2023
7431005
num_iterations removed from TrainingArguments
danstan5 Oct 19, 2023
3bd2acc
run_fewshot compatible with <v.1.0.0
danstan5 Oct 20, 2023
3d07e6c
Run make style
tomaarsen Oct 25, 2023
978daee
Use "no" as the default evaluation_strategy
tomaarsen Oct 25, 2023
2802a3f
Move num_iterations back to TrainingArguments
tomaarsen Oct 25, 2023
391f991
Fix broken trainer tests due to new default sampling
tomaarsen Oct 25, 2023
f8b7253
Use the Contrastive Dataset for Distillation
tomaarsen Oct 25, 2023
38e9607
Set the default logging steps at 50
tomaarsen Oct 25, 2023
4ead15d
Add max_steps argument to TrainingArguments
tomaarsen Oct 25, 2023
eb70336
Change max_steps conditional
tomaarsen Oct 25, 2023
3478799
Merge pull request #5 from danstan5/refactor-sampling
tomaarsen Oct 27, 2023
d9c4a05
Merge branch 'main' of https://github.com/huggingface/setfit into ref…
tomaarsen Nov 9, 2023
5b39f06
Seeds are now correctly applied for reproducibility
tomaarsen Nov 9, 2023
d8177db
Add files via upload
MosheWasserb Nov 9, 2023
7c3feed
Don't scale gradients during evaluation
tomaarsen Nov 9, 2023
cdc8979
Use evaluation_strategy="steps" if eval_steps is set
tomaarsen Nov 9, 2023
e040167
Run formatting
tomaarsen Nov 9, 2023
d2f2489
Implement SetFit for ABSA from Intel Labs (#6)
tomaarsen Nov 9, 2023
5c4569d
Import optuna under TYPE_CHECKING
tomaarsen Nov 9, 2023
ceeb725
Remove unused import, reformat
tomaarsen Nov 9, 2023
5c669b5
Add MANIFEST.in with model_card_template
tomaarsen Nov 9, 2023
8e201e5
Don't require transformers TrainingArgs in tests
tomaarsen Nov 9, 2023
6ae5045
Update URLs in setup.py
tomaarsen Nov 9, 2023
ecaabb4
Increase min hf_hub version to 0.12.0 for SoftTemporaryDirectory
tomaarsen Nov 9, 2023
4e79397
Include MANIFEST.in data via `include_package_data=True`
tomaarsen Nov 9, 2023
65aff32
Use kwargs instead of args in super call
tomaarsen Nov 9, 2023
eeeac55
Use v0.13.0 as min. version as huggingface/huggingface_hub#1315
tomaarsen Nov 9, 2023
3214f1b
Use en_core_web_sm for tests
tomaarsen Nov 10, 2023
2b78bb0
Remove incorrect spacy_model from AspectModel/PolarityModel
tomaarsen Nov 10, 2023
b68f655
Rerun formatting
tomaarsen Nov 10, 2023
d85f0d9
Run CI on pre branch & workflow dispatch
tomaarsen Nov 10, 2023
b636cd7
Merge pull request #265 from tomaarsen/refactor_v2
tomaarsen Nov 10, 2023
81952bf
Set development version to 1.0.0.dev0
tomaarsen Nov 10, 2023
5b76361
Extend training argument tests
tomaarsen Nov 10, 2023
54b5d55
Only create evaluation dataloader if eval_strat is set
tomaarsen Nov 14, 2023
4788713
Run formatting
tomaarsen Nov 14, 2023
74a5b7c
max_steps isn't optional
tomaarsen Nov 14, 2023
7ef5bbc
Fix indentation of docstring
tomaarsen Nov 15, 2023
ca3030f
Apply fixes for HPO
tomaarsen Nov 15, 2023
f114572
Remove outdated tests
tomaarsen Nov 15, 2023
8d118d5
Use SetFitModel as the model in CallbackHandler
tomaarsen Nov 21, 2023
b964238
Correctly set the total training steps based on args.max_steps
tomaarsen Nov 21, 2023
2f06847
Add missing comma
tomaarsen Nov 21, 2023
fcb38fc
Capitalize first letter of sentence
tomaarsen Nov 21, 2023
9fe6f0d
Run formatting
tomaarsen Nov 21, 2023
da338ad
Remove unused arguments in tests
tomaarsen Nov 21, 2023
be4c900
Initial documentation for SetFit v1.0.0
tomaarsen Nov 21, 2023
fb42dd7
Update the documentation related workflows
tomaarsen Nov 21, 2023
04c45d7
Merge branch 'main' of https://github.com/huggingface/setfit into v1.…
tomaarsen Nov 21, 2023
bfe6ef6
Add figure to zero-shot how-to guide
tomaarsen Nov 21, 2023
773b860
Add docs notebook building support
tomaarsen Nov 21, 2023
883889c
Update broken, redirecting links
tomaarsen Nov 21, 2023
b4e5db0
polarity -> label
tomaarsen Nov 22, 2023
dbd707b
Mention extra download requirements for ABSA
tomaarsen Nov 22, 2023
552cecc
Merge branch 'main' of https://github.com/huggingface/setfit into v1.…
tomaarsen Nov 24, 2023
0d32dd1
Implement 'batch_size' on model.predict
tomaarsen Nov 24, 2023
392cf0d
Add batch sizes to toctree
tomaarsen Nov 24, 2023
ee00c40
Merge pull request #443 from tomaarsen/feat/expose_batch_size
tomaarsen Nov 24, 2023
17d6513
Save model head on CPU
tomaarsen Nov 24, 2023
dca6fd0
torch.Module -> torch.nn.Module
tomaarsen Nov 24, 2023
4123609
Merge pull request #444 from tomaarsen/feat/cpu_load_diff_head
tomaarsen Nov 24, 2023
b5a6361
Add new top-level header to docs reference
tomaarsen Nov 24, 2023
6ca989e
Update docs about return value of metric function
tomaarsen Nov 24, 2023
93c52dd
Add "use_auth_token" to migration guide
tomaarsen Nov 24, 2023
44daad4
Allow 'device' on SetFitModel.from_pretrained()
tomaarsen Nov 24, 2023
6f06204
Add tests for SetFitABSA as well
tomaarsen Nov 24, 2023
c41b7c3
Merge pull request #445 from tomaarsen/feat/load_on_device
tomaarsen Nov 24, 2023
b8da4a3
Update which trainer methods are documented
tomaarsen Nov 24, 2023
639750f
Link to the Hub in docstring
tomaarsen Nov 27, 2023
9ffc262
Add scikit-learn API version of SetFit to related work
tomaarsen Nov 27, 2023
2ef61bb
Batch Sizes + "for Inference"
tomaarsen Nov 27, 2023
b8b8417
Make first column bold in Sampling Strategy table
tomaarsen Nov 27, 2023
a2fa84f
Remove comment about Google Colab with Python 3.7
tomaarsen Nov 27, 2023
e2cf782
Rename file, remove distilBERT, fix typos
tomaarsen Nov 27, 2023
c5ea28d
Merge branch 'v1.0.0-pre' of https://github.com/huggingface/setfit in…
tomaarsen Nov 27, 2023
c7f49ad
Add ONNX tutorial to docs
tomaarsen Nov 27, 2023
193f83f
Merge pull request #435 from huggingface/moshe
tomaarsen Nov 27, 2023
8e0c55c
Update docstring of from_pretrained!
tomaarsen Nov 28, 2023
19d6d9d
Revert "Update docstring of from_pretrained!"
tomaarsen Nov 28, 2023
5058e31
Update docstring of from_pretrained!
tomaarsen Nov 28, 2023
3e829ba
Update docstring edits of from_pretrained
tomaarsen Nov 28, 2023
d476ce0
Correctly format docstrings for API reference
tomaarsen Nov 28, 2023
dac5221
Also maybe log, evaluate & save at epoch end
tomaarsen Nov 28, 2023
5edf540
Update README in preparation for documentation
tomaarsen Nov 28, 2023
c1b2f20
Link to scripts rather than scripts/setfit
tomaarsen Nov 28, 2023
70bd935
Ensure correct device of "best model at the end"
tomaarsen Nov 28, 2023
c93b55a
Add "labels" in a configuration file
tomaarsen Nov 28, 2023
4c0f152
Resolve flake issues
tomaarsen Nov 28, 2023
1af337f
Add labels to migration guide
tomaarsen Nov 29, 2023
3876d62
Update returns docstring for predict & __call__
tomaarsen Nov 29, 2023
71be7a5
Use ndim rather than "multi_target_strategy is None"
tomaarsen Nov 29, 2023
298fe39
Merge pull request #447 from tomaarsen/feat/configuration
tomaarsen Nov 29, 2023
cc97d10
Allow passing strings to model.predict
tomaarsen Nov 29, 2023
d85d537
Merge pull request #448 from tomaarsen/feat/predict_singular
tomaarsen Nov 29, 2023
62f7eea
Allow partial column mappings
tomaarsen Nov 29, 2023
6f226e5
Allow normalize_embeddings with diff head
tomaarsen Nov 29, 2023
f04e997
Merge pull request #449 from tomaarsen/feat/partial_col_mapping
tomaarsen Nov 29, 2023
f021e13
Merge pull request #450 from tomaarsen/fix/normalize_with_diff_head
tomaarsen Nov 29, 2023
313bffc
Update phrasing in SetFit intro
tomaarsen Dec 1, 2023
9976bb5
Heavily improve automatic model card generation
tomaarsen Nov 29, 2023
bbad20d
Rewrite first paragraph somewhat
tomaarsen Nov 29, 2023
6cd51ed
Resolve issue with multi-label
tomaarsen Nov 30, 2023
4a6852b
Set inference=False for multilabel models
tomaarsen Nov 30, 2023
671611e
Add model card tests
tomaarsen Nov 30, 2023
5f36d0e
Reformat
tomaarsen Nov 30, 2023
086ee02
Satisfy flake8
tomaarsen Nov 30, 2023
4990b09
Make model card generation more robust
tomaarsen Dec 1, 2023
58d5815
Allow compute_metric to return a non-dict
tomaarsen Dec 1, 2023
61cf947
Update tests as datasets are now column-mapped at init
tomaarsen Dec 1, 2023
751ba80
Avoid bare except
tomaarsen Dec 1, 2023
4a7255d
Avoid walrus operator for now for Python 3.7 compat
tomaarsen Dec 1, 2023
8032131
Increase minimal datasets version for dataset inferring
tomaarsen Dec 1, 2023
54d7127
Keep datasets version low, but skip test if < 2.14
tomaarsen Dec 1, 2023
87420b3
Add reason to skipif
tomaarsen Dec 1, 2023
a73cb69
Always return dicts in id2label/label2id
tomaarsen Dec 4, 2023
859691b
Introduce "no aspect", "aspect" labels for AspectModel
tomaarsen Dec 4, 2023
f9e6acb
Extend model card generation to ABSA + Tests
tomaarsen Dec 4, 2023
4c4a9aa
Correctly use create_model_card in ABSA test
tomaarsen Dec 4, 2023
0beedf2
Speed up model card tests for ABSA
tomaarsen Dec 4, 2023
55e9380
Set default W&B project as "setfit" if not set via ENV var yet
tomaarsen Dec 4, 2023
8ad41a8
Run formatting
tomaarsen Dec 4, 2023
a65a4e7
Remove the old ABSA model card template
tomaarsen Dec 4, 2023
d0cda23
Set fsspec<2023.12.0 due to breakages with older datasets
tomaarsen Dec 4, 2023
7dcc35e
Make some model_card_data modifications for ABSA only once
tomaarsen Dec 4, 2023
2cd004f
Reorder arguments
tomaarsen Dec 4, 2023
3a5356e
Update absa models in docs
tomaarsen Dec 4, 2023
d513064
Move import
tomaarsen Dec 4, 2023
1a60d09
Add model_card_data to from_pretrained
tomaarsen Dec 4, 2023
681f8db
Remove useless brackets
tomaarsen Dec 4, 2023
d61ec69
Correct model_card docstring
tomaarsen Dec 4, 2023
77aff7c
Only use the gold aspects/labels for training the polarity model
tomaarsen Dec 4, 2023
2c09cfb
Merge branch 'v1.0.0-pre' of https://github.com/huggingface/setfit in…
tomaarsen Dec 4, 2023
9dbca0b
Use text classification dataset examples
tomaarsen Dec 4, 2023
368155c
Add model card generation documentation
tomaarsen Dec 4, 2023
9c87685
Add spaCy version to ABSA model card
tomaarsen Dec 5, 2023
b257e82
Map to int to avoid potential warning
tomaarsen Dec 5, 2023
3a6e23a
Store used spaCy model configuration in aspect/polarity model
tomaarsen Dec 5, 2023
3bc0125
Correctly test against log
tomaarsen Dec 5, 2023
35c7461
Reformat test imports
tomaarsen Dec 5, 2023
5d04965
Try to resolve failing test on CI
tomaarsen Dec 5, 2023
815e45a
debugging: test against trainer dataset size
tomaarsen Dec 5, 2023
267e21d
Ignore log tests
tomaarsen Dec 5, 2023
243fcb2
Add 'eval_max_steps', reduce load time before train
tomaarsen Dec 5, 2023
8dd930c
Merge branch 'main' of https://github.com/huggingface/setfit into v1.…
tomaarsen Dec 5, 2023
c039e17
Also pass metric_kwargs to custom metric callable
tomaarsen Dec 5, 2023
8d5fc46
Merge pull request #456 from tomaarsen/feat/use_metric_kwargs_with_cu…
tomaarsen Dec 5, 2023
37592eb
Use gold aspects as True, and non-overlapping pred aspects as False
tomaarsen Dec 6, 2023
522a420
Add missing +1 on edge case in Aspect Extractor
tomaarsen Dec 6, 2023
ebaf5a2
Update ABSA documentation slightly
tomaarsen Dec 6, 2023
937c491
Specify AbsaTrainer methods
tomaarsen Dec 6, 2023
3152e49
Update v1.0.0 migration; expand changelog
tomaarsen Dec 6, 2023
1 change: 1 addition & 0 deletions .github/workflows/build_documentation.yml
@@ -13,6 +13,7 @@ jobs:
    with:
      commit_sha: ${{ github.sha }}
      package: setfit
      notebook_folder: setfit_doc
      languages: en
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}
3 changes: 3 additions & 0 deletions .github/workflows/quality.yml
@@ -5,9 +5,12 @@ on:
    branches:
      - main
      - v*-release
      - v*-pre
  pull_request:
    branches:
      - main
      - v*-pre
  workflow_dispatch:

jobs:

5 changes: 5 additions & 0 deletions .github/workflows/tests.yml
@@ -5,9 +5,12 @@ on:
    branches:
      - main
      - v*-release
      - v*-pre
  pull_request:
    branches:
      - main
      - v*-pre
  workflow_dispatch:

jobs:

@@ -40,6 +43,8 @@ jobs:
        run: |
          python -m pip install --no-cache-dir --upgrade pip
          python -m pip install --no-cache-dir ${{ matrix.requirements }}
          python -m spacy download en_core_web_lg
          python -m spacy download en_core_web_sm
        if: steps.restore-cache.outputs.cache-hit != 'true'

      - name: Install the checked-out setfit
4 changes: 4 additions & 0 deletions .gitignore
@@ -149,3 +149,7 @@ scripts/tfew/run_tmux.sh
# macOS
.DS_Store
.vscode/settings.json

# Common SetFit Trainer logging folders
wandb
runs/
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include src/setfit/span/model_card_template.md