Add warning and info message for beta and gamma parameters #33192
base: main
Conversation
cc @amyeroberts
amyeroberts left a comment
Thanks for adding this @zly-idleness!
At the moment, this involves iterating over the loaded keys twice - once on L4116 and then on L4146. It also involves a redefinition of original_loaded_keys. We should rework the code to remove the duplicated logic.
Thank you for pointing that out. I'll refactor the code to eliminate the redundancy.
src/transformers/modeling_utils.py
Outdated
This doesn't work - this will modify both loaded_keys and original_loaded_keys:

```python
In [1]: li_0 = list(range(10))

In [2]: li_1 = li_0

In [3]: for i in range(10):
   ...:     if i % 2:
   ...:         li_0[i] = -1
   ...:

In [4]: li_0
Out[4]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]

In [5]: li_1
Out[5]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]
```
Thank you, sir! You are totally right, I forgot to add a copy() to ensure that original_loaded_keys remains unaltered.
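A minimal sketch of the aliasing fix under discussion (the key names here are made up for illustration and are not the actual checkpoint keys in modeling_utils.py):

```python
# Hypothetical checkpoint key names, for illustration only.
loaded_keys = ["ln.gamma", "ln.beta", "dense.weight"]

# Without .copy(), both names would refer to the same list object and
# the in-place renaming below would alter original_loaded_keys too.
original_loaded_keys = loaded_keys.copy()

# Rename legacy TF-style names in place.
for i, key in enumerate(loaded_keys):
    loaded_keys[i] = key.replace("gamma", "weight").replace("beta", "bias")
```

After the loop, original_loaded_keys still holds the pre-rename names, which is what the warning message needs to report.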
amyeroberts left a comment
Thanks for iterating!
At the moment, the logic is being forced around the old renaming code. There are also changes to the olmoe model in this PR which should be removed.
There shouldn't be any changes to this file in the PR
src/transformers/modeling_utils.py
Outdated
This message isn't consistent - at the moment all of the renamed keys will be listed. Like in the other places where this logic is added, let's just report the first rename.
src/transformers/modeling_utils.py
Outdated
Rather than try and use the existing _fix_key logic, it would be better to rework this to:
- Not use copy
- Only add the first renaming case for gamma and beta respectively
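One way the suggested rework could look - a sketch only, with a hypothetical helper name and an assumed warning format, not the code that was actually merged. It makes a single pass, builds a new list instead of copying the input, and records only the first gamma and first beta rename for the warning:

```python
import logging

logger = logging.getLogger(__name__)

def rename_keys_with_warning(loaded_keys):
    # Single pass over the keys: no copy of the input list is needed,
    # and only the FIRST gamma and FIRST beta rename are kept for the warning.
    first_renames = {}
    new_keys = []
    for key in loaded_keys:
        new_key = key
        if "gamma" in new_key:
            new_key = new_key.replace("gamma", "weight")
            first_renames.setdefault("gamma", (key, new_key))
        if "beta" in new_key:
            new_key = new_key.replace("beta", "bias")
            first_renames.setdefault("beta", (key, new_key))
        new_keys.append(new_key)
    for old, new in first_renames.values():
        logger.warning("Parameter `%s` was renamed to `%s`.", old, new)
    return new_keys
```

Using setdefault means later gamma/beta keys are still renamed but do not add to the warning, keeping the message short regardless of how many keys are affected.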
Ok, I will do it.
What does this PR do?
This adds a warning message to notify about the renaming of gamma and beta parameters during initialisation and also during loading.
Before/after screenshots: (not captured in this export)
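As a rough illustration of the behaviour this PR adds - the helper name and message wording below are assumed for the sketch, not the exact text in transformers:

```python
import warnings

def warn_on_gamma_beta(state_dict_keys):
    # Hypothetical helper: emit one warning if the checkpoint still uses
    # legacy TF-style `gamma`/`beta` names, which transformers maps to
    # `weight`/`bias` when loading.
    legacy = [k for k in state_dict_keys if "gamma" in k or "beta" in k]
    if legacy:
        warnings.warn(
            f"Found legacy parameter names (e.g. `{legacy[0]}`): "
            "`gamma` will be renamed to `weight` and `beta` to `bias`."
        )

warn_on_gamma_beta(["encoder.ln.gamma", "encoder.ln.beta"])
```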
Fixes #29554 and #33190
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.