Making wrapper tensor subclass to work in serialization #2440
Merged
Commits (18)
- a9911b8 Making wrapper tensor subclass to work in huggingface_hub serializati… (jerryzh168)
- 4ebe8aa add tests (jerryzh168)
- 0a1cc96 update signature to include new changes for tensor subclass (jerryzh168)
- 8fd26b7 add torch version checks and move around import (jerryzh168)
- b889019 more fixes (jerryzh168)
- c91fe18 tested with torch 2.0.0 and 2.5.0 (jerryzh168)
- 83f6884 Merge branch 'main' into non-safetensor-ser (jerryzh168)
- a02657d Merge branch 'main' into non-safetensor-ser (Wauplin)
- 94b3885 remove torch_version_at_least from _torch.py (jerryzh168)
- 1b25148 simplify code for checking if tensor subclass is available or not (jerryzh168)
- ca3c228 minor fix (jerryzh168)
- 122a68a addressing comments and run tests with torch 2.4.0 (jerryzh168)
- 3c79607 Merge branch 'main' into non-safetensor-ser (Wauplin)
- 7fb689b some linting (Wauplin)
- 45cb98a add test_split_torch_state_dict_into_shards for tensor subclass state… (jerryzh168)
- 1486253 lint (Wauplin)
- 7850025 style (Wauplin)
- 473c317 quality (Wauplin)
Conversations
If I understand correctly, two "meta" tensors can have the exact same `_get_unique_id(tensor)` and the exact same `tensor.device`, but still be different, correct? If so, how can we be sure their storage size distinguishes them? Can it happen that two different tensors randomly have the same storage size?
Yeah, it just means the current approach does not generalize to meta tensors. Did it work previously?

I think we'd need to reimplement the higher-level sharding logic in PyTorch in the end. I added a PoC in Slack; let me give a quick intro there.
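The collision concern raised above can be illustrated with a short sketch using plain torch calls (not the huggingface_hub helpers): two distinct meta tensors can share a device and a storage size, so a dedup key built only from those fields could fail to tell them apart.

```python
import torch

# Two distinct meta tensors: same device and, by construction, the same
# storage size in bytes. A (device, storage size) key cannot distinguish
# them, even though their shapes differ.
a = torch.empty(2, 8, device="meta")
b = torch.empty(4, 4, device="meta")

assert a.device == b.device                                       # both on "meta"
assert a.nelement() * a.element_size() == b.nelement() * b.element_size()
assert a.shape != b.shape                                         # yet different tensors
print("storage sizes collide:", a.nelement() * a.element_size(), "bytes each")
```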
I don't think so, since we never had to serialize meta tensors. The only use case that could benefit from it is in accelerate (finding tied parameters from the meta model). Right now, this is how we handle meta tensors: https://github.com/huggingface/accelerate/blob/726140cad2f2361d79da7786a7b96d0bee591c48/src/accelerate/utils/modeling.py#L677
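A minimal sketch of the tied-parameter idea mentioned above (not accelerate's actual implementation; see the linked modeling.py for that): on a meta-device model, tied parameters are the very same `Parameter` object, so grouping by Python object identity finds them without touching any tensor data. The module names here are made up for illustration.

```python
import torch
import torch.nn as nn
from collections import defaultdict

# Build two hypothetical modules on the meta device and tie one weight.
with torch.device("meta"):
    shared = nn.Linear(4, 4)
    head = nn.Linear(4, 4)
head.weight = shared.weight  # hypothetical weight tying

# Collect (name, parameter) pairs across both modules.
params = [("shared." + n, p) for n, p in shared.named_parameters()]
params += [("head." + n, p) for n, p in head.named_parameters()]

# Group names by object identity: tied parameters share one object.
groups = defaultdict(list)
for name, param in params:
    groups[id(param)].append(name)

tied = [names for names in groups.values() if len(names) > 1]
print(tied)  # [['shared.weight', 'head.weight']]
```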