[develop] Fix for error at the end of monitor_jobs.py, other minor improvements #600
…t skip checking DEAD, ERROR, or COMPLETE status jobs. This will allow the use of this script to, for example, make changes to a failed experiment and re-run it after the appropriate rocotorewind command(s) have been run.
Machine: hera
Machine: jet
MichaelLueken
left a comment
@mkavulich Thank you for quickly addressing the issue encountered at the end of monitor_jobs.py! I was able to successfully test this on Jet as well. Approving this work.
@MichaelLueken Do you know what it means when it says the CI build was "aborted"? I didn't make any changes to the build system, so I'm assuming it's not related to this change?
@mkavulich Even though the |
If you have the time, please review this PR and @danielabdi-noaa's PR #612. These two bug fixes are important to get into develop as soon as possible. Thank you very much for your time.
DESCRIPTION OF CHANGES:
This is another round of improvements for the pythonized WE2E test scripts. I was originally going to wait until the script was ready to replace, but I accidentally forgot to make a change to a final log message in monitor_jobs.py; this results in a scary-looking error message even though all experiments may have completed successfully.
The solution was simply to replace the undefined variable with a correct one.
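For readers unfamiliar with this class of bug, the pattern can be sketched roughly as follows. This is not the actual code from monitor_jobs.py: the function `monitor`, the dictionaries `running` and `finished`, and the undefined name `expts` are all hypothetical, chosen only to show how a final log message that references an undefined variable fails after every experiment has already finished.

```python
import logging


def monitor(experiments):
    """Drain a dict of hypothetical experiments and report a final summary."""
    running = dict(experiments)  # experiments still being tracked
    finished = {}                # experiments that have reached a final state

    while running:
        for name in list(running):
            # A real monitor would poll the workflow here; this sketch
            # simply treats every experiment as done on the first pass.
            finished[name] = running.pop(name)

    # Bug pattern (illustrative only): the summary references a name that
    # was never defined in this scope, so Python raises NameError at the
    # very end, after all the real work has succeeded:
    #     logging.info(f"All {len(expts)} experiments finished")
    #
    # Fix: reference the variable that actually holds the results.
    message = f"All {len(finished)} experiments finished"
    logging.info(message)
    return message
```

Because the failing line runs only after the loop exits, the error shows up at the very end of the run, which fits the description above of an error message appearing even though the experiments themselves completed.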
This PR comes from an in-progress branch for improvements to the python scripts, so a few other improvements are coming along with this bug fix:
run_envir, and add all needed variables for run_envir=nco mode.
Type of change
TESTS CONDUCTED:
This change mostly impacts the new python-based tests, which are not yet used for official testing. Ran some tests on Hera to ensure the problem with the WE2E testing scripts was fixed. Also running fundamental tests on Hera and Jet due to the change in setup.py.
DEPENDENCIES:
None
DOCUMENTATION:
None
ISSUE:
Related to #586, more work needed.
CHECKLIST
CONTRIBUTORS (optional):
Thanks to @MichaelLueken for pointing out the error.