GPU Usage Logger #2932
Hello @groadabike! Thanks for updating this PR.
Comment last updated at 2020-08-12 14:17:24 UTC
How is this different from the flag to log GPU usage?
The log_gpu_memory flag in the Trainer only logs memory.used, or the min and max memory used.
Codecov Report
@@           Coverage Diff           @@
##           master   #2932    +/-   ##
=======================================
+ Coverage      86%     90%    +3%
=======================================
  Files          80      81     +1
  Lines        7449    7542    +93
=======================================
+ Hits         6430    6755   +325
+ Misses       1019     787   -232
ok cool. makes sense! let's clean up the docs for this
* GPU utilisation Callback
* GPU utilisation Callback
* Fixing style
* Fixing style
* Fixing CodeFactor: partial executable path
* Fix a misspelling in the Class name
What does this PR do?
This PR adds a Callback that logs GPU utilisation during the training stage. It gives a sense of how the resources have been used and helps to measure the effect of any implementation improvement.
The Callback queries nvidia-smi for some GPU stats and logs them.
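As a rough illustration of this kind of query (not the code in this PR), the sketch below calls nvidia-smi through its CSV query interface and parses the result. The field list `QUERY_FIELDS` and the helper names `parse_gpu_stats` and `query_gpu_stats` are assumptions made for the example; the stats the Callback actually logs may differ.

```python
import shutil
import subprocess

# Assumed fields, chosen for illustration only; the PR's actual list may differ.
QUERY_FIELDS = ["utilization.gpu", "memory.used"]


def parse_gpu_stats(csv_text, fields=QUERY_FIELDS):
    """Parse `--format=csv,nounits,noheader` output into one dict per GPU.

    Assumes every value is numeric; entries such as "[N/A]" are not handled.
    """
    stats = []
    for line in csv_text.strip().splitlines():
        values = [float(v.strip()) for v in line.split(",")]
        stats.append(dict(zip(fields, values)))
    return stats


def query_gpu_stats(fields=QUERY_FIELDS):
    """Run nvidia-smi and return parsed stats, or [] if it is not installed."""
    # Resolve the full executable path instead of relying on a partial one.
    nvidia_smi = shutil.which("nvidia-smi")
    if nvidia_smi is None:
        return []
    result = subprocess.run(
        [nvidia_smi, f"--query-gpu={','.join(fields)}", "--format=csv,nounits,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout, fields)
```

Keeping the parsing separate from the subprocess call means the parser can be tested on a machine without a GPU.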
It also measures the time between batches (inter_step_time) and within batches (intra_step_time).
These two measurements give a sense of how the "total batch time" is distributed (loading the batch + batch pass).
For example, a large inter_step_time could indicate slow data loading, and any improvement in the Dataset class would be reflected in a reduction of this value.
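A minimal sketch of how the two timings could be taken, assuming hooks that fire at the start and end of each batch. The `StepTimer` class and its method names are hypothetical and are not the implementation in this PR.

```python
import time


class StepTimer:
    """Tracks time spent between batches and within batches (illustrative only)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock        # injectable so the timer can be tested
        self._batch_start = None
        self._batch_end = None

    def on_batch_start(self):
        """Return inter_step_time in seconds, or None before the first batch ends."""
        now = self._clock()
        inter = None if self._batch_end is None else now - self._batch_end
        self._batch_start = now
        return inter

    def on_batch_end(self):
        """Return intra_step_time in seconds for the batch that just finished."""
        now = self._clock()
        self._batch_end = now
        return now - self._batch_start
```

Here inter_step_time covers everything between the end of one batch and the start of the next, which in practice is dominated by data loading, while intra_step_time covers the batch pass itself.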
Fixes #2074
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃