WIP: Feature/refactor value #558

Draft
wants to merge 360 commits into
base: develop
Commits
e904874
Permutation samplers return full permutations
mdbenito Apr 27, 2024
7b9112d
Fix import
mdbenito Apr 27, 2024
7683c20
Return a copy of the values in Valuation
mdbenito Apr 27, 2024
0d39c5f
Rename n_iterations to max_samples.
janosg Apr 29, 2024
d07db78
Add exact tests for DeterministicUniformSampler.
janosg Apr 29, 2024
ec1602b
Add more tests for samplers and fix multiple bugs.
janosg Apr 29, 2024
9e3da5b
Add more tests for samplers.
janosg Apr 29, 2024
2bd6c26
Add a preliminary length method for samplers.
janosg Apr 30, 2024
b4447a0
Add better input processing.
janosg Apr 30, 2024
e5a3328
Re-add progress bars.
janosg Apr 30, 2024
ca6a4d4
Add docstrings.
janosg Apr 30, 2024
4ef1ad3
Add more samplers to test parametrization.
janosg Apr 30, 2024
0b5ebc1
Add typing extensions and use proper Self annotation for with_dataset.
janosg Apr 30, 2024
fc704ce
Implement with_dataset in base class instead of Utility.
janosg Apr 30, 2024
0e6356f
Run mypy and fix some typing problems.
janosg Apr 30, 2024
898d878
Fix permutation sampler to actually return full samples and some types ✈
mdbenito May 6, 2024
d2a2fff
Fix call to complement
mdbenito May 6, 2024
f0bbf57
Calls to utility with no sample return default value
mdbenito May 6, 2024
32b375c
Type fixes and some silly mistakes
mdbenito May 6, 2024
6855c5d
More specific imports
mdbenito May 6, 2024
0e78b9a
Fix merge conflicts.
janosg May 7, 2024
a63ae8e
Define length of IndexIteration and use it for length of sabplers.
janosg May 7, 2024
076c166
Redefine NoIndexIteration to always be finite.
janosg May 7, 2024
2b21dec
Only allow deterministic index iteration in deterministic samplers.
janosg May 7, 2024
b9f1384
Add seed handling for index iterators.
janosg May 7, 2024
5e16161
Add seed handling for index iterators.
janosg May 7, 2024
afa5d93
Only allow PowersetSamplers in least-core.
janosg May 7, 2024
597b698
Incorporate feedback.
janosg May 8, 2024
a6fb213
Make sure samplers behave reasonably with empty index sets.
janosg May 8, 2024
8e574eb
Write more tests.
janosg May 8, 2024
c612c39
Add tests for Stratified samplers.
janosg May 8, 2024
1bf9d1e
Add more tests for Stratified samplers.
janosg May 8, 2024
cc4f777
Run mypy and fix some typing problems.
janosg May 8, 2024
dcddaee
Adjust documentation notebook for least-core.
janosg May 8, 2024
644a9f0
Incorporate review comments.
janosg May 13, 2024
204ed9d
Create temp files.
janosg May 14, 2024
d01c7ce
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
b988fb1
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
4c7b002
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
4e3d550
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
10a5fdd
Split history tests/value/__init__.py to tests/valuation/__init__.py …
janosg May 14, 2024
78a5236
Split history tests/value/__init__.py to tests/valuation/__init__.py …
janosg May 14, 2024
f2033eb
Split history tests/value/__init__.py to tests/valuation/__init__.py …
janosg May 14, 2024
46d5e11
Split history tests/value/__init__.py to tests/valuation/__init__.py …
janosg May 14, 2024
4cca859
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 14, 2024
1292c40
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 14, 2024
6d31683
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 14, 2024
68c2ec2
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 14, 2024
c8168d2
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 14, 2024
951a421
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 14, 2024
bbb924e
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 14, 2024
f68520a
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 14, 2024
daf7c52
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 14, 2024
28644ac
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 14, 2024
bf834d8
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 14, 2024
d3104f5
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 14, 2024
802ed1a
Re-apply changes to split files.
janosg May 14, 2024
1fcc3cf
Remove temp files.
janosg May 14, 2024
75a15f5
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
b8cc727
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
c062fee
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
2cfc0a0
Split history src/pydvl/value/least_core/common.py to src/pydvl/valua…
janosg May 14, 2024
fd45c5b
Re-apply change to _solve_least_core_problems.py.
janosg May 14, 2024
ed02e5e
Adjust imports and delete old file.
janosg May 15, 2024
5ec7fc8
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 15, 2024
82e2028
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 15, 2024
dd4565e
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 15, 2024
1598aa3
Split history tests/value/least_core/test_common.py to tests/valuatio…
janosg May 15, 2024
dc7c38b
Re-apply changes to test_slove_least_core_problems.
janosg May 15, 2024
43fdb0f
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 15, 2024
8ac8d2f
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 15, 2024
d556248
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 15, 2024
d8c4946
Split history tests/value/test_sampler.py to tests/valuation/samplers…
janosg May 15, 2024
54830f7
Re-apply changes to test_sampler.
janosg May 15, 2024
68aa81a
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 15, 2024
c3fef15
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 15, 2024
1dd0f47
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 15, 2024
3ee7097
Split history tests/value/least_core/test_naive.py to tests/valuation…
janosg May 15, 2024
2ab1434
Re-apply changes to test_least_core_valuation.
janosg May 15, 2024
ab86696
Restructure code to prepare for parallelization.
janosg May 15, 2024
2d19cda
Add parallelization.
janosg May 15, 2024
0a5b94f
Simplify parallelization in lc_solve_problems.
janosg May 21, 2024
32cd45f
Address comments from review.
janosg May 24, 2024
3251f03
Remove IndexSampler.from_data and renamne .from_indices to generate_b…
janosg May 24, 2024
af7ac86
Rename IndexSampler.length to sample_limit.
janosg May 24, 2024
2ba5306
Add better docstring to LeastCoreProblem.
janosg May 24, 2024
4800845
Use NotFittedException.
janosg May 24, 2024
6f1367e
Add subclasses for known least-core methods and improve default handl…
janosg May 24, 2024
6bce08c
Clarify the empty generator.
janosg May 24, 2024
e4e90c9
Remove lc_solve_problems and improve documentation of solver options.
janosg May 24, 2024
1099cab
Polish docstrings.
janosg May 24, 2024
cda307c
Used subclasses of LeastCoreValuation in documentation notebook.
janosg May 24, 2024
58daf66
Replace take_n by more_itertools.chunked.
janosg May 24, 2024
c6c0c4d
Update changelog.
janosg May 24, 2024
4307246
Merge pull request #580 from aai-institute/refactor/least-core
janosg May 24, 2024
1e9e511
Add simple interface tests.
janosg May 27, 2024
085b460
Split history tests/value/loo/test_loo.py to tests/valuation/methods/…
janosg May 28, 2024
a927189
Split history tests/value/loo/test_loo.py to tests/valuation/methods/…
janosg May 28, 2024
a2bfc9b
Split history tests/value/loo/test_loo.py to tests/valuation/methods/…
janosg May 28, 2024
91ae55c
Split history tests/value/loo/test_loo.py to tests/valuation/methods/…
janosg May 28, 2024
b52f85a
Add test for loo.
janosg May 28, 2024
b26e38e
Add docstrings to tests.
janosg May 28, 2024
2cd9511
Split history tests/value/shapley/test_naive.py to tests/valuation/me…
janosg May 28, 2024
47e6b28
Split history tests/value/shapley/test_naive.py to tests/valuation/me…
janosg May 28, 2024
64a76a3
Split history tests/value/shapley/test_naive.py to tests/valuation/me…
janosg May 28, 2024
0ba9fe1
Split history tests/value/shapley/test_naive.py to tests/valuation/me…
janosg May 28, 2024
2f8aa06
Add tests for deterministic shapley methods and fix several bugs.
janosg Jun 5, 2024
5f42840
Split history tests/value/shapley/test_montecarlo.py to tests/valuati…
janosg Jun 5, 2024
bd5eb6a
Split history tests/value/shapley/test_montecarlo.py to tests/valuati…
janosg Jun 5, 2024
ead5eaf
Split history tests/value/shapley/test_montecarlo.py to tests/valuati…
janosg Jun 5, 2024
6075c84
Split history tests/value/shapley/test_montecarlo.py to tests/valuati…
janosg Jun 5, 2024
199ef20
Temp commit.
janosg Jun 6, 2024
6b44472
Split history tests/utils/test_score.py to tests/valuation/scorers/te…
janosg Jun 6, 2024
77406a8
Split history tests/utils/test_score.py to tests/valuation/scorers/te…
janosg Jun 6, 2024
4bf21e8
Split history tests/utils/test_score.py to tests/valuation/scorers/te…
janosg Jun 6, 2024
1fd4d84
Split history tests/utils/test_score.py to tests/valuation/scorers/te…
janosg Jun 6, 2024
7b8ef9a
Start to add tests for montecarlo shapley methods.
janosg Jun 6, 2024
eeeea12
Fix control flow in fit method of SemiValue that lead to wrong result…
janosg Jun 7, 2024
9782eae
Fix seed test.
janosg Jun 7, 2024
209e500
Split history tests/value/test_stopping.py to tests/valuation/test_st…
janosg Jun 7, 2024
fea6dd1
Split history tests/value/test_stopping.py to tests/valuation/test_st…
janosg Jun 7, 2024
e6b5926
Split history tests/value/test_stopping.py to tests/valuation/test_st…
janosg Jun 7, 2024
0066435
Split history tests/value/test_stopping.py to tests/valuation/test_st…
janosg Jun 7, 2024
d595ba7
Add tests for stopping criteria.
janosg Jun 7, 2024
07346fc
Split history tests/value/test_semivalues.py to tests/valuation/metho…
janosg Jun 10, 2024
5472038
Split history tests/value/test_semivalues.py to tests/valuation/metho…
janosg Jun 10, 2024
b8bf14c
Split history tests/value/test_semivalues.py to tests/valuation/metho…
janosg Jun 10, 2024
f16503c
Split history tests/value/test_semivalues.py to tests/valuation/metho…
janosg Jun 10, 2024
3f9f595
Split history tests/value/utils.py to tests/valuation/utils.py - rena…
janosg Jun 10, 2024
d1a0e53
Split history tests/value/utils.py to tests/valuation/utils.py - rena…
janosg Jun 10, 2024
3d0106d
Split history tests/value/utils.py to tests/valuation/utils.py - reso…
janosg Jun 10, 2024
2acaa4c
Split history tests/value/utils.py to tests/valuation/utils.py - rest…
janosg Jun 10, 2024
a43d4af
Add tests for semivalues.
janosg Jun 10, 2024
1d7febb
Remove tolerate fixture and use pytest-rerunfailures instead.
janosg Jun 10, 2024
53b1390
Add first draft of owen valuations.
janosg Jun 11, 2024
7c3197e
Refactor montecarlo shapley tests.
janosg Jun 12, 2024
291d3ec
Write tests for owen shapley.
janosg Jun 12, 2024
d42b030
Write docstrings.
janosg Jun 12, 2024
b6c4ba9
Update changelog.
janosg Jun 12, 2024
98d82f2
Incorporate review comments.
janosg Jun 12, 2024
5612983
Make type checker a bit happier.
janosg Jun 12, 2024
224527e
Merge pull request #597 from aai-institute/feature/refactor-owen
janosg Jun 13, 2024
7b52c0f
First draft for new group testing.
janosg Jun 14, 2024
fd829e3
Add more tests.
janosg Jun 14, 2024
c31e309
Refactoring.
janosg Jun 14, 2024
4343c75
Refactoring.
janosg Jun 14, 2024
535c05b
Refactoring.
janosg Jun 14, 2024
07b8295
Fix type hints.
janosg Jun 14, 2024
07a7ab9
Update src/pydvl/valuation/methods/_utility_values_and_sample_masks.py
janosg Jun 17, 2024
9972650
Fix.
janosg Jun 18, 2024
5540fd4
Fix merge conflicts.
janosg Jun 18, 2024
fba3c82
Apply suggestions from code review
janosg Jun 18, 2024
02afd73
Fix merge conflicts.
janosg Jun 18, 2024
4c34312
Merge pull request #602 from aai-institute/feature/refactor-gt
janosg Jun 18, 2024
fe23e79
First draft for new MSR Banzhaf.
janosg Jun 19, 2024
9900c60
Polishing and docstrings.
janosg Jun 20, 2024
675934f
Run mypy.
janosg Jun 20, 2024
e4145f8
Update changelog.
janosg Jun 20, 2024
204dc53
Clarify documentation of variances and stderr in ValuationResult.
janosg Jun 24, 2024
d860268
Fix variances for MSR.
janosg Jul 1, 2024
5bd3ef0
Split history tests/test_results.py to tests/valuation/test_result.py…
janosg Jul 1, 2024
795c2bf
Split history tests/test_results.py to tests/valuation/test_result.py…
janosg Jul 1, 2024
ee8c693
Split history tests/test_results.py to tests/valuation/test_result.py…
janosg Jul 1, 2024
fa6bbfe
Split history tests/test_results.py to tests/valuation/test_result.py…
janosg Jul 1, 2024
667136c
Add and adjust old tests for ValuationResult.
janosg Jul 1, 2024
31f0dd5
First draft of new knn shapley.
janosg Jul 7, 2024
52a260b
Fix import order.
janosg Jul 8, 2024
348fcc9
Apply suggestions from code review
janosg Jul 9, 2024
1876b50
Address comments from code review.
janosg Jul 9, 2024
735c908
Update src/pydvl/valuation/methods/msr_banzhaf.py
janosg Jul 9, 2024
1cff43b
Merge pull request #605 from aai-institute/feature/refactor-msr-banzhaf
janosg Jul 9, 2024
4107489
Merge branch 'feature/refactor-value' into feature/refactor-knn-shapley
janosg Jul 9, 2024
4424ec2
Formatting.
janosg Jul 9, 2024
ccadf89
Add parallelization and progress bars.
janosg Jul 9, 2024
9380ae2
Renaming.
janosg Jul 9, 2024
b5aa3e8
Add docstrings.
janosg Jul 9, 2024
75f54b3
Incorporate comments from review.
janosg Jul 9, 2024
7c1a948
Apply suggestions from code review
janosg Jul 9, 2024
81a3bca
Merge branch 'feature/refactor-knn-shapley' of https://github.com/aai…
janosg Jul 9, 2024
6cad56e
Fix.
janosg Jul 9, 2024
8d09b81
Fix some type hints.
janosg Jul 9, 2024
069674e
Fix identation of TODO
schroedk Jul 9, 2024
063f532
Update changelog.
janosg Jul 28, 2024
6353dc4
Merge pull request #610 from aai-institute/feature/refactor-knn-shapley
janosg Jul 28, 2024
e7400c0
Add methods for cleanly chaning Sample idx and subset field values
AnesBenmerzoug Aug 8, 2024
83f08ba
Move score computation in ModelUtility to _compute_score method
AnesBenmerzoug Aug 8, 2024
ed2d818
Add method to set label for classwise scorer and fix bug with in clas…
AnesBenmerzoug Aug 8, 2024
397ae55
Fix typo and bug in IndexSampler
AnesBenmerzoug Aug 8, 2024
33b9732
Use with_idx and with_subset to create new sample instead of creating…
AnesBenmerzoug Aug 8, 2024
7cf6e7b
Fix bug with datatype of subset in DeterministicUniformSampler
AnesBenmerzoug Aug 8, 2024
d11ffd4
Implement classwise sampler
AnesBenmerzoug Aug 8, 2024
368b4fc
Implement classwise utility
AnesBenmerzoug Aug 8, 2024
9919344
Implement classwise shapley valuation
AnesBenmerzoug Aug 8, 2024
99303da
Put pymemcache imports inside related fixtures
AnesBenmerzoug Aug 8, 2024
9692287
Add classwise shapley tests
AnesBenmerzoug Aug 8, 2024
7e27908
Fix bug in classwise shapley method
AnesBenmerzoug Aug 8, 2024
1f5c93c
Add code to avoid an inifite loop inside classwise sampler
AnesBenmerzoug Aug 8, 2024
73c1b71
Cleanup imports in classwise shapley test module
AnesBenmerzoug Aug 8, 2024
5b3be04
Add test for classwise scorer
AnesBenmerzoug Aug 12, 2024
3296a60
Make sure label field is always set
AnesBenmerzoug Aug 15, 2024
0e6fb57
Fix classwise sampler
AnesBenmerzoug Aug 15, 2024
c65773d
Add tests for classwise sampler
AnesBenmerzoug Aug 15, 2024
e803216
Simplify classwise utility
AnesBenmerzoug Aug 15, 2024
5e6c89f
Rename method class to ClasswiseShapleyValuation
AnesBenmerzoug Aug 15, 2024
c82f921
Improve docstring of roundrobin function
AnesBenmerzoug Aug 15, 2024
1aca6fb
Rename CSSample to ClasswiseSample
AnesBenmerzoug Aug 15, 2024
beb3c3e
Add descriptions for Sample attributes
AnesBenmerzoug Aug 15, 2024
d55c9fa
Simplify setting scorer's label
AnesBenmerzoug Aug 15, 2024
60954c9
Apply suggestions from code review
AnesBenmerzoug Aug 15, 2024
e3a28c4
Docstrings
AnesBenmerzoug Aug 15, 2024
1a01333
Mark failing classwise test as xfail
AnesBenmerzoug Aug 15, 2024
830ffea
More comments
AnesBenmerzoug Aug 15, 2024
2fa3578
Add docstring to ClasswiseModelUtility
AnesBenmerzoug Aug 21, 2024
37acff9
Update changelog
AnesBenmerzoug Aug 21, 2024
85025c0
Merge pull request #616 from aai-institute/feature/refactor-classwise…
schroedk Aug 21, 2024
0d8dc4c
Merge remote-tracking branch 'origin/develop' into feature/refactor-v…
schroedk Aug 28, 2024
5ae4d4f
Extend test for random_powerset to include different values for the s…
schroedk Aug 30, 2024
c02b6ad
[skip ci] Add flaky decorator to test, which fail at random
schroedk Aug 30, 2024
fa1a027
Fix linting
schroedk Sep 2, 2024
e46120d
Revert "Fix linting"
schroedk Sep 2, 2024
217da00
Fix linting
schroedk Sep 2, 2024
048f243
Fix type annotations
schroedk Sep 2, 2024
34a9949
Add explicit type alias
schroedk Sep 2, 2024
2b3f548
Import TypeAlias from typing_extensions instead of typing
schroedk Sep 2, 2024
9716af9
Add further explicit type alias statements
schroedk Sep 2, 2024
066d900
Add explicit type hint to variable
schroedk Sep 2, 2024
784a40e
Use type cast instead of type annotation
schroedk Sep 2, 2024
b236c86
Add additional variable to pass mypy
schroedk Sep 2, 2024
459abcf
Use type:ignore as last resort
schroedk Sep 2, 2024
6b5a4dd
Fix wrong value check in owen method
schroedk Sep 2, 2024
e4b5c33
Increase rerun number for test
schroedk Sep 2, 2024
e2c0f5d
Remove unstable runtime duration comparison
schroedk Sep 2, 2024
db5bd76
Reduce test split from 4 to 2, due to memory issues with github workers
schroedk Sep 2, 2024
b20791b
Increase timeout for notebook tests
schroedk Sep 2, 2024
3df302e
Update test durations for notebook tests
schroedk Sep 2, 2024
ff1c5ef
Reduce notebook tests from 4 splits to 2 splits
schroedk Sep 2, 2024
e3bff6d
Merge pull request #618 from aai-institute/fix/valuation-owen-antithetic
schroedk Sep 2, 2024
d601e52
Consistently use seed in test
schroedk Sep 2, 2024
bc6aba8
Use only one group for testing
schroedk Sep 4, 2024
bdeae9b
Fix split size
schroedk Sep 4, 2024
b0a354e
Merge remote-tracking branch 'origin/develop' into feature/refactor-v…
schroedk Sep 4, 2024
77cfb39
Increase rerun tries for tests
schroedk Sep 4, 2024
ee1df38
Seperate legacy tests from test env
schroedk Sep 4, 2024
b536011
Add additional legacy test step
schroedk Sep 4, 2024
0a59946
Fix name of legacy test job
schroedk Sep 4, 2024
c2a1b17
Add xfail mark to test
schroedk Sep 4, 2024
48bc064
Exclude multi job test from CI
schroedk Sep 5, 2024
70fc2da
Fix name of legacy test step in CI
schroedk Sep 5, 2024
2741013
Add rerun decorator to randomly failing test
schroedk Sep 5, 2024
085f328
Use random seed to create samplers in test
schroedk Sep 5, 2024
16 changes: 15 additions & 1 deletion .github/workflows/main.yaml
@@ -96,7 +96,21 @@ jobs:
      python_version: ${{ matrix.python_version }}
      split_size: 4
      group_number: ${{ matrix.group_number }}
    needs: [code-quality]
    needs: [code-quality, group-tests]

  legacy-tests:
    strategy:
      fail-fast: false
      matrix:
        python_version: [ "3.11" ]
        group_number: [ 1, 2, 3, 4 ]
    name: Run Legacy tests - Python ${{ matrix.python_version }} - Group ${{ matrix.group_number }}
    uses: ./.github/workflows/run-legacy-tests-workflow.yaml
    with:
      python_version: ${{ matrix.python_version }}
      split_size: 4
      group_number: ${{ matrix.group_number }}
    needs: [ code-quality, group-tests, notebook-tests ]

  push-docs-and-release-testpypi:
    name: Push Docs and maybe release Package to TestPyPI
54 changes: 54 additions & 0 deletions .github/workflows/run-legacy-tests-workflow.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
name: Run Legacy Tests

on:
workflow_call:
inputs:
split_size:
description: "Determines the number of groups into which the tests should be split"
type: string
default: 4
group_number:
description: "Determines which which group of tests to run. Can be 1, 2, ..., split_size"
type: string
required: true
python_version:
description: "Determines which Python version to use"
type: string
required: true


env:
PY_COLORS: 1

jobs:
run-legacy-tests:
runs-on: ubuntu-latest
steps:
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
large-packages: false
docker-images: false
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python ${{ inputs.python_version }}
uses: ./.github/actions/python
with:
python_version: ${{ inputs.python_version }}
- name: Cache Tox Directory for Tests
uses: actions/cache@v4
with:
key: tox-${{ runner.os }}-${{ github.ref }}-${{ hashFiles('tox.ini', 'requirements.txt') }}-${{ inputs.python_version }}
path: .tox
- name: Set up memcached
uses: niden/actions-memcached@v7
- name: Test Group ${{ inputs.group_number }}
run: tox -e legacy-tests -- --slow-tests --splits ${{ inputs.split_size }} --group ${{ inputs.group_number }}
- name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v4
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: ./coverage.xml
env_vars: OS,PYTHON
verbose: false
23 changes: 12 additions & 11 deletions .notebook_test_durations
@@ -1,12 +1,13 @@
{
"notebooks/data_oob.ipynb::": 14.514983271001256,
"notebooks/influence_imagenet.ipynb::": 15.937124550999215,
"notebooks/influence_sentiment_analysis.ipynb::": 26.479645616000198,
"notebooks/influence_synthetic.ipynb::": 6.61773010700017,
"notebooks/influence_wine.ipynb::": 16.312171267998565,
"notebooks/least_core_basic.ipynb::": 14.375480750999486,
"notebooks/msr_banzhaf_digits.ipynb::": 106.6507187110019,
"notebooks/shapley_basic_spotify.ipynb::": 15.657225806997303,
"notebooks/shapley_knn_flowers.ipynb::": 3.9943819290019746,
"notebooks/shapley_utility_learning.ipynb::": 25.939783253001224
}
"notebooks/data_oob.ipynb::": 13.150942041,
"notebooks/influence_imagenet.ipynb::": 17.281671249999995,
"notebooks/influence_sentiment_analysis.ipynb::": 19.578478917000005,
"notebooks/influence_synthetic.ipynb::": 7.191153166999996,
"notebooks/influence_wine.ipynb::": 11.610076332999995,
"notebooks/least_core_basic.ipynb::": 14.069404709000011,
"notebooks/least_core_basic_new.ipynb::": 24.492538208000013,
"notebooks/msr_banzhaf_digits.ipynb::": 86.62082037599998,
"notebooks/shapley_basic_spotify.ipynb::": 15.088616748999982,
"notebooks/shapley_knn_flowers.ipynb::": 6.810235208000023,
"notebooks/shapley_utility_learning.ipynb::": 24.370409832999997
}
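
Recorded durations like the ones above are the kind of input a duration-aware test splitter can use to balance CI groups. As a rough sketch only (this is not necessarily the algorithm the splitting plugin used in CI applies), the hypothetical helper `split_by_duration` below assigns the longest-running notebooks first, always to the group with the smallest running total. The durations are rounded from the updated entries in the file; the function and variable names are illustrative assumptions.

```python
import heapq

# Rounded from the updated .notebook_test_durations entries (seconds).
durations = {
    "notebooks/data_oob.ipynb::": 13.15,
    "notebooks/influence_imagenet.ipynb::": 17.28,
    "notebooks/influence_sentiment_analysis.ipynb::": 19.58,
    "notebooks/influence_synthetic.ipynb::": 7.19,
    "notebooks/influence_wine.ipynb::": 11.61,
    "notebooks/least_core_basic.ipynb::": 14.07,
    "notebooks/least_core_basic_new.ipynb::": 24.49,
    "notebooks/msr_banzhaf_digits.ipynb::": 86.62,
    "notebooks/shapley_basic_spotify.ipynb::": 15.09,
    "notebooks/shapley_knn_flowers.ipynb::": 6.81,
    "notebooks/shapley_utility_learning.ipynb::": 24.37,
}


def split_by_duration(durations, n_groups):
    """Greedy longest-first partition (hypothetical helper, for illustration).

    Returns the list of groups and the total duration of each group.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    # Min-heap of (accumulated seconds, group index).
    heap = [(0.0, i) for i in range(n_groups)]
    groups = [[] for _ in range(n_groups)]
    by_length = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in by_length:
        total, idx = heapq.heappop(heap)  # currently lightest group
        groups[idx].append(name)
        heapq.heappush(heap, (total + seconds, idx))
    totals = [0.0] * n_groups
    for total, idx in heap:
        totals[idx] = total
    return groups, totals


if __name__ == "__main__":
    groups, totals = split_by_duration(durations, n_groups=2)
    for i, (group, total) in enumerate(zip(groups, totals), start=1):
        print(f"group {i}: {total:.2f}s, {len(group)} notebooks")
```

With a single notebook dominating the runtime, as `msr_banzhaf_digits.ipynb` does here, a finer split cannot push the slowest group below that notebook's own duration, so adding groups yields diminishing returns.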
2,206 changes: 1,581 additions & 625 deletions .test_durations

Large diffs are not rendered by default.

82 changes: 52 additions & 30 deletions CHANGELOG.md
@@ -4,8 +4,19 @@

### Added

- New method `InverseHarmonicMeanInfluence`, implementation for the paper
`DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and
- Refactor Classwise Shapley valuation with the interfaces and sampler architecture [PR #616](https://github.com/aai-institute/pyDVL/pull/616).
- Refactoring KNN Shapley values with the new sampler architecture [PR #610](https://github.com/aai-institute/pyDVL/pull/610).
- Refactoring MSR Banzhaf semivalues with the new sampler architecture.
[PR #605](https://github.com/aai-institute/pyDVL/pull/605)
- Refactoring group-testing shapley values with new sampler architecture
[PR #602](https://github.com/aai-institute/pyDVL/pull/602)
- Refactoring of least-core data valuation methods with more supported sampling methods
and consistent interface.
[PR #580](https://github.com/aai-institute/pyDVL/pull/580)
- Refactoring of owen shapley valuation with new sampler architecture
[PR #597](https://github.com/aai-institute/pyDVL/pull/597)
- New method `InverseHarmonicMeanInfluence`, implementation for the paper
`DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and
Diffusion Models`
[PR #582](https://github.com/aai-institute/pyDVL/pull/582)
- Add new backend implementations for influence computation
@@ -16,7 +27,7 @@
[PR #591](https://github.com/aai-institute/pyDVL/pull/591)
- Extend `LissaInfluence` with block-diagonal and Gauss-Newton approximation
[PR #593](https://github.com/aai-institute/pyDVL/pull/593)
- Extend `NystroemSketchInfluence` with block-diagonal and Gauss-Newton
- Extend `NystroemSketchInfluence` with block-diagonal and Gauss-Newton
approximation
[PR #596](https://github.com/aai-institute/pyDVL/pull/596)
- Extend `ArnoldiInfluence` with block-diagonal and Gauss-Newton
@@ -30,9 +41,20 @@
- Replace `np.float_` with `np.float64` and `np.alltrue` with `np.all`,
as the old aliases are removed in NumPy 2.0
[PR #604](https://github.com/aai-institute/pyDVL/pull/604)

### Fixed

- Fix a bug in pydvl.utils.numeric.random_subset where 1 - q was used instead of q
as the probability of an element being sampled
[PR #597](https://github.com/aai-institute/pyDVL/pull/597)
- Fix a bug in the calculation of variance estimates for MSR Banzhaf
[PR #605](https://github.com/aai-institute/pyDVL/pull/605)
- Fix a bug in KNN Shapley values. See [Issue 613](https://github.com/aai-institute/pyDVL/issues/613) for details.
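
The `random_subset` entry above is about the inclusion probability used when drawing a random subset. A minimal sketch of the intended behavior, assuming each element is kept independently with probability `q`, is shown below. It uses only the standard library and is not pyDVL's implementation; the name `random_subset_sketch` and its signature are made up for illustration.

```python
import random


def random_subset_sketch(items, q, seed=None):
    """Keep each element independently with probability q (illustrative only)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so `< q` keeps an element with
    # probability q. Testing `> q` instead would keep it with probability
    # 1 - q, which is the sort of mix-up the changelog entry describes.
    return [x for x in items if rng.random() < q]


if __name__ == "__main__":
    n = 10_000
    subset = random_subset_sketch(range(n), q=0.2, seed=0)
    # The kept fraction should be close to q for large n.
    print(len(subset) / n)
```

The two variants are easy to confuse because they agree at `q = 0.5`; checking the edge cases `q = 0` (empty subset) and `q = 1` (full set) tells them apart immediately.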


### Changed

- Use tighter bounds for the calculation of the minimal sample size that guarantees
an epsilon-delta approximation in group testing (Jia et al. 2023)
[PR #602](https://github.com/aai-institute/pyDVL/pull/602)
- **Breaking Changes**
- Rename parameter `hessian_regularization` of `DirectInfluence`
to `regularization` and change the type annotation to allow
@@ -42,7 +64,7 @@
to `regularization` and change the type annotation to allow
for block-wise regularization parameters
[PR #593](https://github.com/aai-institute/pyDVL/pull/593)
- Remove parameter `h0` from init of `LissaInfluence`
- Remove parameter `h0` from init of `LissaInfluence`
[PR #593](https://github.com/aai-institute/pyDVL/pull/593)
- Rename parameter `hessian_regularization` of `NystroemSketchInfluence`
to `regularization` and change the type annotation to allow
@@ -77,25 +99,25 @@
### Added

- Add progress bars to the computation of `LazyChunkSequence` and
`NestedLazyChunkSequence`
`NestedLazyChunkSequence`
[PR #567](https://github.com/aai-institute/pyDVL/pull/567)
- Add a device fixture for `pytest`, which depending on the availability and
- Add a device fixture for `pytest`, which depending on the availability and
user input (`pytest --with-cuda`) resolves to cuda device
[PR #574](https://github.com/aai-institute/pyDVL/pull/574)

### Fixed

- Fixed logging issue in decorator `log_duration`
[PR #567](https://github.com/aai-institute/pyDVL/pull/567)
- Fixed missing move of tensors to model device in `EkfacInfluence`
- Fixed missing move of tensors to model device in `EkfacInfluence`
implementation [PR #570](https://github.com/aai-institute/pyDVL/pull/570)
- Missing move to device of `preconditioner` in `CgInfluence` implementation
[PR #572](https://github.com/aai-institute/pyDVL/pull/572)
- Raise a more specific error message when a `RuntimeError` occurs in
- Raise a more specific error message when a `RuntimeError` occurs in
`torch.linalg.eigh`, so the user can check if it is related to a known
issue
[PR #578](https://github.com/aai-institute/pyDVL/pull/578)
- Fix an edge case (empty train data) in the test
- Fix an edge case (empty train data) in the test
`test_classwise_scorer_accuracies_manual_derivation`, which resulted
in undefined behavior (`np.nan` to `int` conversion with different results
depending on OS)
@@ -113,7 +135,7 @@

### Fixed

- `FutureWarning` for `ParallelConfig` constantly raised without actually
- `FutureWarning` for `ParallelConfig` constantly raised without actually
instantiating the object
[PR #562](https://github.com/aai-institute/pyDVL/pull/562)

@@ -129,7 +151,7 @@
- New preconditioned block variant of conjugate gradient
[PR #507](https://github.com/aai-institute/pyDVL/pull/507)
- Improvements to documentation: fixes, links, text, example gallery, LFS and
more [PR #532](https://github.com/aai-institute/pyDVL/pull/532),
more [PR #532](https://github.com/aai-institute/pyDVL/pull/532),
[PR #543](https://github.com/aai-institute/pyDVL/pull/543)
- Glossary of data valuation and influence terms in the documentation
[PR #537](https://github.com/aai-institute/pyDVL/pull/537)
@@ -142,11 +164,11 @@
[PR #495](https://github.com/aai-institute/pyDVL/pull/495)
- Memory issue with `CgInfluence` and `ArnoldiInfluence`
[PR #498](https://github.com/aai-institute/pyDVL/pull/498)
- Raising specific error message with install instruction, when trying to load
- Raising specific error message with install instruction, when trying to load
`pydvl.utils.cache.memcached` without `pymemcache` installed.
If `pymemcache` is available, all symbols from `pydvl.utils.cache.memcached`
If `pymemcache` is available, all symbols from `pydvl.utils.cache.memcached`
are available through `pydvl.utils.cache`
[PR #509](https://github.com/aai-institute/pyDVL/pull/509)
[PR #509](https://github.com/aai-institute/pyDVL/pull/509)

### Changed

@@ -175,9 +197,9 @@
### Fixed

- Bug in using `DaskInfluenceCalculator` with `TorchnumpyConverter`
for single dimensional arrays
for single dimensional arrays
[PR #485](https://github.com/aai-institute/pyDVL/pull/485)
- Fix implementations of `to` methods of `TorchInfluenceFunctionModel`
- Fix implementations of `to` methods of `TorchInfluenceFunctionModel`
implementations [PR #487](https://github.com/aai-institute/pyDVL/pull/487)
- Fixed bug with checking for converged values in semivalues
[PR #341](https://github.com/appliedAI-Initiative/pyDVL/pull/341)
@@ -197,15 +219,15 @@
- New influence function interface `InfluenceFunctionModel`
- Data parallel computation with `DaskInfluenceCalculator`
[PR #26](https://github.com/aai-institute/pyDVL/issues/26)
- Sequential batch-wise computation and write to disk with
`SequentialInfluenceCalculator`
- Sequential batch-wise computation and write to disk with
`SequentialInfluenceCalculator`
[PR #377](https://github.com/aai-institute/pyDVL/issues/377)
- Adapt notebooks to new influence abstractions
[PR #430](https://github.com/aai-institute/pyDVL/issues/430)

### Changed

- Refactor and simplify caching implementation
- Refactor and simplify caching implementation
[PR #458](https://github.com/aai-institute/pyDVL/pull/458)
- Simplify display of computation progress
[PR #466](https://github.com/aai-institute/pyDVL/pull/466)
@@ -230,8 +252,8 @@

- New method: Class-wise Shapley values
[PR #338](https://github.com/aai-institute/pyDVL/pull/338)
- New method: Data-OOB by @BastienZim
[PR #426](https://github.com/aai-institute/pyDVL/pull/426),
- New method: Data-OOB by @BastienZim
[PR #426](https://github.com/aai-institute/pyDVL/pull/426),
[PR #431](https://github.com/aai-institute/pyDVL/pull/431)
- Added `AntitheticPermutationSampler`
[PR #439](https://github.com/aai-institute/pyDVL/pull/439)
@@ -270,7 +292,7 @@ randomness.
- Added more abbreviations to documentation
[PR #415](https://github.com/aai-institute/pyDVL/pull/415)
- Added seed to functions from `pydvl.utils.numeric`, `pydvl.value.shapley` and
`pydvl.value.semivalues`. Introduced new type `Seed` and conversion function
`pydvl.value.semivalues`. Introduced new type `Seed` and conversion function
`ensure_seed_sequence`.
[PR #396](https://github.com/aai-institute/pyDVL/pull/396)
- Added `batch_size` parameter to `compute_banzhaf_semivalues`,
@@ -287,7 +309,7 @@ randomness.
[PR #352](https://github.com/aai-institute/pyDVL/pull/352)
- Made ray an optional dependency, relying on joblib as default parallel backend
[PR #408](https://github.com/aai-institute/pyDVL/pull/408)
- Decoupled `ray.init` from `ParallelConfig`
- Decoupled `ray.init` from `ParallelConfig`
[PR #373](https://github.com/aai-institute/pyDVL/pull/383)
- **Breaking Changes**
- Signature change: return information about Hessian inversion from
@@ -329,7 +351,7 @@ randomness.
(TMCS) starting too many processes and dying, plus other small changes
[PR #329](https://github.com/aai-institute/pyDVL/pull/329)
- Fix creation of GroupedDataset objects using the `from_arrays`
and `from_sklearn` class methods
and `from_sklearn` class methods
[PR #324](https://github.com/aai-institute/pyDVL/pull/334)
- Fix release job not triggering on CI when a new tag is pushed
[PR #331](https://github.com/aai-institute/pyDVL/pull/331)
@@ -386,13 +408,13 @@ randomness.
[PR #268](https://github.com/aai-institute/pyDVL/pull/268)
- Splitting of problem preparation and solution in Least-Core computation.
Umbrella function for LC methods.
[PR #257](https://github.com/aai-institute/pyDVL/pull/257)
[PR #257](https://github.com/aai-institute/pyDVL/pull/257)
- Operations on `ValuationResult` and `Status` and some cleanup
[PR #248](https://github.com/aai-institute/pyDVL/pull/248)
- **Bug fix and minor improvements**: Fixes bug in TMCS with remote Ray cluster,
raises an error for dummy sequential parallel backend with TMCS, clones model
inside `Utility` before fitting by default, with flag `clone_before_fit`
to disable it, catches all warnings in `Utility` when `show_warnings` is
inside `Utility` before fitting by default, with flag `clone_before_fit`
to disable it, catches all warnings in `Utility` when `show_warnings` is
`False`. Adds Miner and Gloves toy games utilities
[PR #247](https://github.com/aai-institute/pyDVL/pull/247)

@@ -402,7 +424,7 @@ randomness.
[PR #201](https://github.com/aai-institute/pyDVL/pull/201)
- Disabled caching of Utility values as well as repeated evaluations by default
[PR #211](https://github.com/aai-institute/pyDVL/pull/211)
- Test and officially support Python version 3.9 and 3.10
- Test and officially support Python version 3.9 and 3.10
[PR #208](https://github.com/aai-institute/pyDVL/pull/208)
- **Breaking change:** Introduces a class ValuationResult to gather and inspect
results from all valuation algorithms