Use config.layer_norm_eps in some nn.LayerNorm.
#20699
        """Construct the overlapping patch embeddings."""

    -   def __init__(self, patch_size, stride, num_channels, hidden_size):
    +   def __init__(self, config, patch_size, stride, num_channels, hidden_size):
We need this new argument so that we can use `eps=config.layer_norm_eps`. As this is an internal class, the signature change should be fine.
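For illustration, here is a minimal sketch of how the extra `config` argument could be threaded into the layer norm. The class name `OverlapPatchEmbeddings`, the convolution layout, and the `SimpleNamespace` stand-in for the config are assumptions made for this example; this is not the exact code of the PR.

```python
from types import SimpleNamespace

import torch
from torch import nn


class OverlapPatchEmbeddings(nn.Module):
    """Construct the overlapping patch embeddings (illustrative sketch)."""

    def __init__(self, config, patch_size, stride, num_channels, hidden_size):
        super().__init__()
        self.proj = nn.Conv2d(
            num_channels,
            hidden_size,
            kernel_size=patch_size,
            stride=stride,
            padding=patch_size // 2,
        )
        # Without the eps argument, nn.LayerNorm uses its default eps=1e-5.
        self.layer_norm = nn.LayerNorm(hidden_size, eps=config.layer_norm_eps)

    def forward(self, pixel_values):
        embeddings = self.proj(pixel_values)
        _, _, height, width = embeddings.shape
        # (batch, channels, H, W) -> (batch, H * W, channels) for LayerNorm
        embeddings = embeddings.flatten(2).transpose(1, 2)
        return self.layer_norm(embeddings), height, width


# Stand-in for a model config; only the attribute used here is defined.
config = SimpleNamespace(layer_norm_eps=1e-12)
module = OverlapPatchEmbeddings(
    config, patch_size=7, stride=4, num_channels=3, hidden_size=32
)
output, height, width = module(torch.randn(1, 3, 64, 64))
```

Passing the whole `config` rather than a bare `eps` float keeps the constructor in line with how the other arguments are sourced, at the cost of a signature change for any caller of the class.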
Just to confirm:
It has an impact even on integration tests: the change is the constant
I agree, but I am not sure whether, for recent models, all these attributes are set according to the papers, or whether people just used the add-new-model-like templates ...
This is too breaking, I think. We need to be more careful that this attribute is used consistently in newly added models, but I don't think we should touch old models like this, as it will change the results of the forward pass.
OK! I will keep this list of models to skip in the WIP PR where we add a test for checking unused config attributes.
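A check for unused config attributes could be sketched roughly as follows. The helper name `find_unused_config_attributes` and the sample source string are hypothetical and only show the idea; the actual test in the WIP PR may work quite differently.

```python
import re


def find_unused_config_attributes(config_attributes, modeling_source):
    """Return the config attribute names never referenced as `config.<name>`."""
    used = set(re.findall(r"config\.(\w+)", modeling_source))
    return sorted(set(config_attributes) - used)


# Hypothetical modeling source that defines layer_norm_eps but never reads it.
modeling_source = '''
self.dense = nn.Linear(config.hidden_size, config.hidden_size)
self.layer_norm = nn.LayerNorm(config.hidden_size)
'''
unused = find_unused_config_attributes(
    ["hidden_size", "layer_norm_eps"], modeling_source
)
print(unused)  # ['layer_norm_eps']
```

A skip list like the one mentioned above would then simply exclude known models from this check.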
Closing as it is too breaking!
What does this PR do?
Similar to #20554, but this time, instead of removing the attribute from the config, we use `config.layer_norm_eps` in some `nn.LayerNorm`.

This changes the `eps` of those LayerNorm layers from (the default) `1e-5` to `1e-12`, so the outputs will differ slightly before and after this PR.
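To give a feel for the size of the difference, here is a small dependency-free sketch of layer normalization without affine parameters, evaluated with both `eps` values. The input vector is made up for illustration; real activations may behave differently, and inputs with very small variance are affected much more strongly.

```python
import math


def layer_norm(values, eps):
    """Normalize a vector to zero mean and unit variance (no affine parameters)."""
    mean = sum(values) / len(values)
    # Biased variance (divide by N), as used by layer normalization.
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(variance + eps) for v in values]


# Made-up activations with a variance of roughly 1.
activations = [0.5, -1.2, 0.3, 2.0]
out_default = layer_norm(activations, eps=1e-5)
out_config = layer_norm(activations, eps=1e-12)
max_diff = max(abs(a - b) for a, b in zip(out_default, out_config))
print(max_diff)
```

The difference is tiny for this input, but it is nonzero, which is enough to shift the expected values hard-coded in integration tests.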