Don't log variables (un)set + raw output of module command when cleaning up fake module#4942
Conversation
In some cases the output might be shown by other means so this parameter can be used to not show it in the log (at INFO level) which can reduce the amount of information logged to the necessary amount.
Unloading a module (and its dependencies) changes a massive amount of variables. As the changes done by *unloading* the fake module aren't really relevant for the remaining build (it is usually done at the end) do not log them. This avoids filling up the part of the log uploaded with test reports with irrelevant information.
lexming
left a comment
There was a problem hiding this comment.
I like this change.
The failed test looks a fluke failure, can you re-trigger it?
And I just have another minor comment,
|
|
||
|
|
||
| def setvar(key, value, verbose=True): | ||
| def setvar(key, value, verbose=True, log_changes=True): |
There was a problem hiding this comment.
can we use the existing verbose for this? something like
if verbose:
<print log>
if verbose and dry_run:
<do the existing dry run prints>
There was a problem hiding this comment.
verbose is currently solely used for the extended-dry-run and I wasn't sure of the impact.
My new idea would be:
if verbose or debug-option: Print setting
Set verbose=False for all module command uses and the fake module unset.
Also only log output of module commands (lmod calls) when debug-option is set
The changes by the module commands are usually not interesting and just make the log harder to read. So suppressing them by default but with an option to show them sounds like the most sensible approach.
There was a problem hiding this comment.
Actually at least the reports from boegelbot runs include the debug flag so that wouldn't make those reports more readable :-/
There was a problem hiding this comment.
Given the limited length of the failed report, we can very well disable debug in the bot. Most of the time debug logs are useless when a build fails anyhow, in that case you just need the output of the failing command/test/sanity check. We can discuss about this in the next call.
There was a problem hiding this comment.
So shall we go forward with my proposal above or wait for the call too?
There was a problem hiding this comment.
I strongly prefer keeping debug enabled in the bot, since it does result in a better log (though less easy to parse).
I think selectively making setvar log less as is being done here makes sense.
There was a problem hiding this comment.
I'd want to make a more general decision here. Basically what we usually don't want is output from a) module commands (proposed, not yet implemented) b) variables set by loading modules c) variables changed by unloading modules, especially the fake/real-module in the sanity check
That information might still be useful in limited contexts, e.g. during debugging strange failures, so logging it with --debug sounds reasonable as otherwise there won't be any way to show those.
But if those information is shown with --debug and the test reports use --debug we haven't gained anything for that common use case.
How do we address this?
We could introduce debug levels, similar to how -vv is more verbose than -v in other programs.
Or we could use --debug --trace as "log more debug info"
There was a problem hiding this comment.
We have an extra level of logging (--devel), we could only log those details when devel log level is used?
There was a problem hiding this comment.
BTW, we're sort of discussing this in the wrong place.
I'm about to merge this PR, and discussing stuff in closed PRs is something we should definitely avoid...
@Flamefire Can you open an issue to follow-up on this, beyond the changes made in this PR?
boegel
left a comment
There was a problem hiding this comment.
@Flamefire Looks OK, just some suggestion w.r.t. naming of the extra options, to make it clear those impact (only) logging (and not trace output, for example).
I've implemented the suggestions myself in:
module command when cleaning up fake module
Unloading a module (and its dependencies) changes a massive amount of variables.
As the changes done by unloading the fake module aren't really relevant for the remaining build (it is usually done at the end) do not log them.
This avoids filling up the part of the log uploaded with test reports with irrelevant information.
Example: https://gist.github.com/Flamefire/afae1a8e2b18a4bc2786f539d5f439d3
This change silences the output of the
module unload <fake module>command by introducing an additional parameter torun_shell_commandthat suppresses output for successful commands.Another parameter for the
environmentmodule allows to suppress output for each set variable, e.g.Closes #4937 (mostly, we might want to suppress output of all module commands or print them only in debug logs)
I think that unloading the fake module and removing the module path is not useful. The succeeding
restore_envalready undoes all changes done by loading the module.Note that e.g. loading the fake module unsets many environment variables. E.g.
CUDA_CACHE_DISABLEwhich therestore_envthen sets again. So directly callingrestore_envwould already reduce the amount of changes done and hence reduce the number of log lines.Why don't we do that?