Detect and fix most _init_weights() issues - make it work for composite models
#37070
Merged
40 commits
47c8e54 Update test_modeling_common.py (Cyrilvallez)
abe0488 Fix Llama and its modular children (Cyrilvallez)
4749c0d Update test_modeling_common.py (Cyrilvallez)
49d7625 qwen3 (Cyrilvallez)
80e52d7 first try at prioritizing models (Cyrilvallez)
155969d Update test_modeling_common.py (Cyrilvallez)
9a15450 Update test_modeling_common.py (Cyrilvallez)
62725b4 Update test_modeling_common.py (Cyrilvallez)
b5eb6cd test (Cyrilvallez)
d0a016a fix (Cyrilvallez)
dceab88 fix (Cyrilvallez)
3251852 more models (Cyrilvallez)
15bcf97 more (Cyrilvallez)
997ba7e more (Cyrilvallez)
0551911 more (Cyrilvallez)
eb4bfd3 smarter init for composite models! (Cyrilvallez)
2082647 fix post rebase (Cyrilvallez)
9a83073 smol (Cyrilvallez)
4fc6898 fix missing args (Cyrilvallez)
fa736c9 more (Cyrilvallez)
7001d13 typo (Cyrilvallez)
b083965 Super elegant and efficient init for submodels (Cyrilvallez)
7adb98c Update modeling_utils.py (Cyrilvallez)
bf9b49f style (Cyrilvallez)
e4141c0 last fixes (Cyrilvallez)
a04f7d5 cleanup (Cyrilvallez)
8ed50ab finalize cleanup (Cyrilvallez)
a9c303e CIs (Cyrilvallez)
23eb8c1 improve docstring (Cyrilvallez)
1aa8914 Update modeling_utils.py (Cyrilvallez)
28f8657 llama4 (Cyrilvallez)
ce281b8 style (Cyrilvallez)
c135488 CIs (Cyrilvallez)
f41f9cc style (Cyrilvallez)
6f6364c add dpt (Cyrilvallez)
ce665b8 granite speech (Cyrilvallez)
e3ccb5f qwen 2.5 omni (Cyrilvallez)
1034c8c better fix (Cyrilvallez)
25d47e4 Parse the config file instead (Cyrilvallez)
4400c52 CIs (Cyrilvallez)
This is the most important change to review @ArthurZucker. It's the most efficient and elegant way to handle it, as we only need to traverse the modules once. However, it requires hot-patching torch.nn.Module, which is a bummer but fine IMO. The other options that avoid the patch all require traversing the modules at least twice, which is less efficient.
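To make the trade-off concrete, here is a minimal sketch of the single-traversal idea, not the code from this PR. It uses a tiny stand-in `Module` class instead of `torch.nn.Module` so it runs without PyTorch. The names `smart_apply`, `SubModel`, `tag`, `initialized_by` and `visited` are all assumptions made for illustration. The point is that one recursive pass can switch to a sub-model's own `_init_weights` as soon as it meets one, which only works if every module (not just the model classes) has the recursive method, hence the patch on the base class.

```python
class Module:
    """Stand-in for torch.nn.Module (assumption: only children() matters here)."""

    def __init__(self, name, *children):
        self.name = name
        self._children = list(children)

    def children(self):
        return iter(self._children)


visited = []  # records every init call so we can check each module is hit once


class SubModel(Module):
    """Hypothetical sub-model of a composite model, with its own init rule."""

    def __init__(self, name, tag, *children):
        super().__init__(name, *children)
        self.tag = tag

    def _init_weights(self, module):
        # A real implementation would fill module weights; here we only tag it.
        module.initialized_by = self.tag
        visited.append(module.name)


def smart_apply(self, fn):
    """Apply fn depth-first, switching to a child's own init when it has one."""
    for child in self.children():
        if hasattr(child, "_init_weights"):
            # Sub-model found: everything below it uses its own init function.
            child.smart_apply(child._init_weights)
        else:
            child.smart_apply(fn)
    fn(self)
    return self


# The "hot-patch": attach the method to the base class so plain modules,
# which know nothing about sub-models, can still take part in the recursion.
Module.smart_apply = smart_apply

vision_leaf = Module("vision.linear")
vision = SubModel("vision", "vision", vision_leaf)
text_leaf = Module("text.linear")
text = SubModel("text", "text", text_leaf)
head = Module("head")
top = SubModel("top", "top", vision, text, head)

top.smart_apply(top._init_weights)
```

After the call, `vision_leaf.initialized_by` is `"vision"` while `head.initialized_by` is `"top"`, and `visited` contains each module name exactly once. A version without the patch would need a first pass to find the sub-models and a second to initialize them.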