Add log_grad_norm hook to LightningModule #7873

Conversation
Codecov Report

@@            Coverage Diff            @@
##           master   #7873     +/-   ##
=========================================
- Coverage      93%     92%       -0%
=========================================
  Files         202     202
  Lines       13123   13114        -9
=========================================
- Hits        12156   12111       -45
- Misses        967    1003       +36
Great, already integrated in the new loop!
607d61c to db7864a
@@ -440,6 +440,20 @@ def __sync(
    def __check_none(name: str, value: Any, _) -> Any:
        raise ValueError(f'`self.log({name}, {value})` was called, but `None` values cannot be logged')

    def log_grad_norm(self, grad_norm_dict: Dict[str, torch.Tensor]) -> None:
Is adding a new hook the only option here? This feels very strange: it only controls the logging behavior, not the calculation. What's the relationship between this and `on_after_backward`?
Hey @ananthsub. Users wanted to customise how they aggregate the grad_norm values and log them.
Here, they are free to:
- Change reduce_fx
- Compute extra values such as the standard deviation or the total mean norm.
- Perform custom aggregation.

`on_after_backward` is independent of this function.
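To illustrate, here is a minimal sketch of such an override, assuming the hook lands as shown in the diff above. The class name, the `torch.max` reduction, and the extra `grad_norm/mean` and `grad_norm/std` keys are illustrative, not part of this PR:

```python
import torch
from pytorch_lightning import LightningModule


class GradNormLoggingModel(LightningModule):
    def log_grad_norm(self, grad_norm_dict):
        # Custom reduce_fx: aggregate each norm with max over the epoch
        # instead of the default mean, and keep it off the progress bar.
        self.log_dict(
            grad_norm_dict,
            on_step=False,
            on_epoch=True,
            prog_bar=False,
            logger=True,
            reduce_fx=torch.max,
        )
        # Extra values computed from the per-parameter norms.
        norms = torch.stack([torch.as_tensor(v).float() for v in grad_norm_dict.values()])
        self.log("grad_norm/mean", norms.mean())
        self.log("grad_norm/std", norms.std())
```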
            def log_grad_norm(self, grad_norm_dict):
                self.log_dict(grad_norm_dict, on_step=False, on_epoch=True, prog_bar=False, logger=True)
        """
        self.log_dict(grad_norm_dict, on_step=True, on_epoch=True, prog_bar=True, logger=True)
While not BC, is `prog_bar=False` a better default?
@carmocca Any thoughts on this one? I see value in both.
Pros for true:
- Directly visible
- Quick experimentation

Cons for true:
- Usually too cluttered

I think I prefer to have it as `True`, which seems to me like the most convenient option when manually debugging your model.
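For users who do find the progress bar too cluttered, opting out is a small override in their own module, mirroring the docstring example in the diff above (the class name here is hypothetical):

```python
from pytorch_lightning import LightningModule


class QuietGradNormModel(LightningModule):
    def log_grad_norm(self, grad_norm_dict):
        # Aggregate per epoch and send to the logger only,
        # keeping the progress bar uncluttered.
        self.log_dict(grad_norm_dict, on_step=False, on_epoch=True, prog_bar=False, logger=True)
```

Note that grad norms are only computed, and hence this hook only called, when norm tracking is enabled on the Trainer, e.g. via `Trainer(track_grad_norm=2)`.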
LGTM!
What does this PR do?
Part of #7631