Bringing laplace-torch to foundation-model era #144

wiseodd · 2024-02-24T13:00:23Z

Main features of this pull request:

- Support doing Laplace only on parameters that require grad. Use case: PEFT (such as LoRA) on top of a frozen foundation model. This is more efficient than SubnetLaplace, since the latter still computes the full Jacobians.
- Add support for multiple leading dims in the classification likelihood, e.g. logits of shape (batch_size, seq_len, n_classes). Useful for language modeling and reward modeling.
- This PR also contains the integrate-latest-asdl changes. I tested it with my ASDL fork, which has only a couple of light changes to support the weight-sharing dim and ignore_index (crucial in language modeling): https://github.com/wiseodd/asdl/commits/dev/. Please also check this and let me know what the most elegant way to handle it is.
- Support Huggingface datasets. The assumption is that x is a UserDict containing input_ids, attention_mask, etc., i.e. what an HF dataloader produces.
- Add a new likelihood called reward_modeling, where the classification likelihood is used during training and the regression likelihood is used during prediction.
- Add support for torchmetrics in gridsearch. The benefit is that it supports running metrics, which means less memory overhead than gathering all the logits first.
- Add Jacobian computation with torch.func (functorch, really) as a general Jacobian computation for the GLM predictive. Useful for Bayesian optimization and invariance learning, where you need to backprop through the variance. Much more elegant than changing ASDL.
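
To illustrate the multiple-leading-dims item in the list above, here is a minimal plain-Python sketch, not the code in this PR. It assumes logits of shape (batch_size, seq_len, n_classes) given as nested lists, collapses the leading dims, and skips positions whose target equals an ignore index. The helper names and the sentinel value -100 are assumptions chosen for illustration.

```python
import math

IGNORE_INDEX = -100  # assumed sentinel marking positions to skip (e.g. padding)


def log_softmax(row):
    # Numerically stable log-softmax over one (n_classes,) row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(z - m) for z in row))
    return [z - lse for z in row]


def flatten_leading_dims(logits, targets):
    # Collapse (batch_size, seq_len, n_classes) logits and (batch_size, seq_len)
    # targets into a flat list of rows and a flat list of class ids.
    flat_logits = [row for seq in logits for row in seq]
    flat_targets = [t for seq in targets for t in seq]
    return flat_logits, flat_targets


def mean_nll(logits, targets, ignore_index=IGNORE_INDEX):
    # Mean negative log-likelihood over all positions that are not ignored.
    flat_logits, flat_targets = flatten_leading_dims(logits, targets)
    kept = [(row, t) for row, t in zip(flat_logits, flat_targets) if t != ignore_index]
    total = -sum(log_softmax(row)[t] for row, t in kept)
    return total / len(kept)


# batch_size=2, seq_len=2, n_classes=3; the last position is ignored.
logits = [
    [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    [[0.0, 0.0, 2.0], [1.0, 1.0, 1.0]],
]
targets = [[0, 1], [2, IGNORE_INDEX]]
loss = mean_nll(logits, targets)
```

The sketch handles exactly two leading dims. With tensors, the same idea is usually expressed by reshaping the logits to (-1, n_classes) and the targets to (-1,) before computing the loss.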

Relevant unit tests are provided. All tests pass except for the old LowRankLaplace issues.

wiseodd · 2024-04-25T22:57:06Z

I replaced get_nll in crossval with RunningNLLMetric(). So this PR will close #160.
(I previously thought that RunningNLLMetric hadn't been implemented, so this would have involved major work.)
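
For readers unfamiliar with running metrics, the idea can be sketched in plain Python as follows. This is not the library's RunningNLLMetric; the class name `RunningNLL` and its methods are hypothetical stand-ins. The point is that only a running sum and a count are kept between batches, so the predictions never have to be gathered in memory.

```python
import math


class RunningNLL:
    """Hypothetical stand-in for a running NLL metric (not the library class)."""

    def __init__(self):
        self.total = 0.0  # summed negative log-likelihood so far
        self.count = 0    # number of examples seen so far

    def update(self, probs, targets):
        # probs: list of per-class probability lists; targets: list of class ids.
        for p, t in zip(probs, targets):
            self.total -= math.log(p[t])
            self.count += 1

    def compute(self):
        # Mean NLL over everything seen so far.
        return self.total / self.count


# Two mini-batches of (predicted probabilities, targets).
batches = [
    ([[0.7, 0.3], [0.2, 0.8]], [0, 1]),
    ([[0.5, 0.5]], [1]),
]

metric = RunningNLL()
for probs, targets in batches:
    metric.update(probs, targets)
running_nll = metric.compute()

# Reference: gather every prediction first, then average once.
all_probs = [p for probs, _ in batches for p in probs]
all_targets = [t for _, targets in batches for t in targets]
gathered_nll = -sum(math.log(p[t]) for p, t in zip(all_probs, all_targets)) / len(all_targets)
```

Both routes give the same mean NLL; the running version only differs in how much state it holds between batches.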

wiseodd · 2024-04-25T23:35:29Z

Might as well fix #156 while we're at it.

wiseodd · 2024-04-26T16:43:24Z

All tasks finished!

wiseodd · 2024-04-27T18:53:10Z

Double-checked and everything looks fine! Merging this.

aleximmer and others added 30 commits August 9, 2022 13:25

Enable regression with new ASDL version; update ASDL interface for new version

c604e75

Preliminary tests for asdl curvature backend

f7d952f

Account for latest asdl-0.1 code base

210547d

Set asdl as default backend

36954e0

Minor change to asdl imports

2795163

Merge branch 'main' into integrate-latest-asdl

44fc1c2

Integrate dev-grad-maker instead of 0.1

a5d5465

Update asdl conv

7f7fcd1

Detach function in asdl

ef21a9a

ASDL hotfix

332a2af

Add option for doing Laplace on a subset of parameters. This is not an arbitrary parameter indices as in SubnetLaplace, but per-parameter (i.e. weight, bias) subsets.

7b39043

Switch off gradient for params outside of .

c1d95e0

Unit tests for subset_params feature

efd856b

Add tests for backends

56ac6e2

Infer subset_params from requires_grad

0c7a230

Revert comments in test_baselaplace

af71cfd

Update test_matrix

4a86109

Revert back KronLaplace initialization

239bea6

1. Handle multi-dim targets in fit(); 2. Add temperature option (for softmax) in NN predictive; 3. Pass model kwargs in NN predictive.

f332ad3

Remove flatten_y

062d5ec

Add option for showing progress bar when fitting & optimizing prior prec; add support for softmax temp for classification predictive

42276f1

Support Huggingface dataset for fitting LA

fa9ce9b

Use UserDict instead of dict to check Huggingface dataset

8d1cbde

Enable layerwise prior only for grad-enabled parameters

2f18949

Remove temp scaling

99278be

Add support for CE loss inputs with multiple leading dims

d846f61

Cleanup

309981e

Merge pull request #132 from AlexImmer/fix-losses

6aa68c2

Add support for cross entropy loss inputs with multiple leading dimensions

Computing the classification BMA, i.e. average of softmaxes, online.

eed0795

Rename function

fb83637

This was referenced Apr 23, 2024

Add example & doc for reward modeling functionality #166

Closed

Doc & example for reward modeling #167

Merged

Make the dict keys for models with dict-like inputs general #168

Merged

wiseodd linked an issue Apr 24, 2024 that may be closed by this pull request

Feature Request - Implementation for BERT #114

Closed

This was referenced Apr 25, 2024

Feature caching mechanism in LLLA #170

Merged

Add an option to reduce LLM features in LLLaplace #172

Merged

Prevent computing posterior precision in KronLaplace when it's not fitted #173

Merged

Replace get_nll with RunningNLLMetric() as the default metric for cross val

5bc24ac

wiseodd force-pushed the mc-subset2 branch from f6a50f1 to 5bc24ac on April 25, 2024 22:54

wiseodd linked an issue Apr 25, 2024 that may be closed by this pull request

Make torchmetrics metrics the default in cross validation #160

Closed

Fixes #156: Make the default cross validation (running) loss dependant on the likelihood

5d043f9

wiseodd linked an issue Apr 25, 2024 that may be closed by this pull request

Remove default loss for optimize_prior_precision #156

Closed

aleximmer reviewed Apr 26, 2024

README.md (resolved)

aleximmer reviewed Apr 26, 2024

examples/huggingface_example.md (outdated, resolved)

laplace/laplace.py (resolved)

setup.cfg (resolved)

This was referenced Apr 26, 2024

[WIP] Proposal for black code style #63

Closed

[WIP] Integration of the latest asdl package #47

Closed

wiseodd added 3 commits April 26, 2024 11:59

Revert "Make Kron init agnostic to the shape of the Kronecker factors"

22449ef

This reverts commit 98a9800.

Typo in examples md

d37cdc9

Add back asdfghjkl backend alongside asdl

4a9c3fa

Remove eig_lowrank from ASDL

a4d3ed6

This was referenced Apr 26, 2024

Replace hardcoded class index with logit_class_dim argument #177

Open

Fix all linting issues and format all files with ruff #179

Closed

Typehinting #180

Merged

wiseodd merged commit 76a04eb into main Apr 27, 2024

wiseodd deleted the mc-subset2 branch April 27, 2024 18:53

wiseodd mentioned this pull request May 31, 2024

subnetwork with Kronecker covariance #189

Closed
