
[Serialization] support loading torch state dict from disk #2687

Merged: 18 commits into main, Dec 13, 2024

Conversation

@hanouticelina (Contributor) commented Dec 2, 2024

Implement helpers to load a torch state dict from disk. The implementation is mostly imported from the transformers and diffusers implementations, with additional error handling and some refactoring. Loading can be done from a single file or from shards, and both safetensors and pickle files are handled. Saving a torch state dict was previously added in #2314.


This PR:

  • adds load_torch_model(), a helper function that takes an nn.Module and a checkpoint path (either a single file or a directory) as input and loads the weights into the model.

usage example:

from huggingface_hub import load_torch_model
model = ... # A PyTorch model

# load the weights into model
load_torch_model(model, "path/to/checkpoint")
  • adds a low-level helper that can be used directly by transformers, diffusers and accelerate:
    • load_state_dict_from_file(): loads a single checkpoint file.
  • tests have been added and the documentation has also been updated.
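For illustration, the behavior described above (a single checkpoint file or a sharded directory, with safetensors or pickle files handled per extension) can be sketched as below. This is a hypothetical, framework-agnostic sketch, not the PR's actual implementation: the function and parameter names are illustrative, and loader callables are injected, whereas the real helpers call safetensors.torch.load_file and torch.load under the hood.

```python
import os

def resolve_checkpoint_files(checkpoint_path):
    """Return the list of weight files for a single file or a sharded directory."""
    if os.path.isfile(checkpoint_path):
        return [checkpoint_path]
    # A directory is assumed to hold shards (plus e.g. an index file, skipped here).
    shards = [
        os.path.join(checkpoint_path, name)
        for name in sorted(os.listdir(checkpoint_path))
        if name.endswith((".safetensors", ".bin", ".pt", ".pth"))
    ]
    if not shards:
        raise FileNotFoundError(f"No checkpoint files found in {checkpoint_path!r}")
    return shards

def load_state_dict(checkpoint_path, safetensors_loader, pickle_loader):
    """Merge every shard into one flat state dict, choosing a loader per file."""
    state_dict = {}
    for path in resolve_checkpoint_files(checkpoint_path):
        loader = safetensors_loader if path.endswith(".safetensors") else pickle_loader
        state_dict.update(loader(path))
    return state_dict
```

In the real helpers the loaders are fixed rather than injected; the injection here just keeps the sketch runnable without torch or safetensors installed.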

Note: PRs will be opened in transformers, diffusers and accelerate to integrate these helpers once huggingface_hub v0.27.0 is released.
Here is an example of an integration in transformers: hanouticelina/transformers#1

cc @sayakpaul for diffusers, @muellerzr and @SunMarc for accelerate and @ArthurZucker for transformers. Happy to get any feedback on this. The goal is the same as for the saving helpers: standardize things across our libraries and establish consistent conventions!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor @Wauplin left a comment


Nice PR @hanouticelina! That's promising for our other libs :) I've made a first pass on the PR and left a few comments. Overall it looks good, though I think we can update a few things to expose only one or two methods in the library.

Also, would it be possible to open a PR on transformers to showcase how it would be used? No need to make updates everywhere in the lib, just one example is enough for now.

@hanouticelina (Contributor, Author)
@Wauplin thanks for the review! I've addressed the comments and created a draft PR, hanouticelina/transformers#1, to illustrate the integration (I've opened it on my personal fork for now while this PR is still WIP). I'd love to have your feedback on that!

@sayakpaul (Member)

Sorry for the policing, but does a similar PR need to be opened in diffusers too? 👀

@hanouticelina (Contributor, Author)

Sorry for the policing, but does a similar PR need to be opened in diffusers too? 👀

@sayakpaul, yes! We plan to open a PR in diffusers once these helpers are released. For now, I've created hanouticelina/transformers#1 just as an example integration.

Contributor @Wauplin left a comment


Great job @hanouticelina! This is a super well documented and tested PR, which is much appreciated for such a key part of the library! I tested it locally and it seems to work as I'd expect ^^

@LysandreJik @ArthurZucker, would it be possible to take a closer look at this PR, and especially hanouticelina/transformers#1, to confirm everything's fine for you as well (worst case, we ship and make hot-fixes if necessary)? We'd like to ship this quickly :)

Quoted code (the safetensors loading path):

    f"The safetensors archive passed at {checkpoint_file} does not contain the valid metadata. Make sure "
    "you save your model with the `save_torch_model` method."
)
return load_file(checkpoint_file)
Contributor:

safetensors supports loading directly to a device: https://huggingface.co/docs/safetensors/api/torch#safetensors.torch.load_file

@hanouticelina (Contributor, Author)

Indeed! Thanks for pointing this out.
Note that in transformers, we always load the state dict on CPU or on the meta device, and the weights are then moved to their respective devices during model dispatch in the from_pretrained() method.

Also, the meta device is not supported by safetensors. I guess we can fall back to CPU with a warning when the meta device is specified in this case.
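The fallback discussed here could look roughly like this. This is a hedged sketch, not the helper's actual API: resolve_safetensors_device and map_location are illustrative names, and only the device-resolution step is shown.

```python
import logging

# Sketch of the device handling discussed above: safetensors'
# load_file(filename, device=...) can place tensors directly on a device,
# but the "meta" device is not supported, so we fall back to "cpu" with a
# warning. `resolve_safetensors_device` is a hypothetical name.

def resolve_safetensors_device(map_location: str) -> str:
    if map_location == "meta":
        logging.warning(
            "Loading a safetensors checkpoint on the 'meta' device is not "
            "supported by safetensors; falling back to 'cpu'."
        )
        return "cpu"
    return map_location

# Inside the loading helper, this would be used as (sketch):
# from safetensors.torch import load_file
# state_dict = load_file(checkpoint_file, device=resolve_safetensors_device(map_location))
```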

Member @LysandreJik left a comment


From a quick look, these two helpers could indeed replace a few code paths in transformers. There is quite a bit of complexity there, so I would recommend merging an initial version, setting up a branch in transformers that switches to these helpers, and running the CI on it.

Thanks for your work @hanouticelina! Any work that makes our from_pretrained method and our modeling_utils module simpler is welcome.

@hanouticelina (Contributor, Author)

Thanks @LysandreJik! Indeed, the best way to test these helpers is to run the transformers CI on them. I will merge this and then create the branch directly in transformers.

@hanouticelina hanouticelina merged commit b75f8d9 into main Dec 13, 2024
17 checks passed