[develop] Improvements for WE2E tests: script features, additional tests, remove unsupported domains#871
Conversation
…etatasks that do not exist or have been removed by other checks.
- Remove extraneous empty metatask dependencies from MET_verification_only_vx test now that these are handled in setup.py - Add ability to specify a subdirectory of test_configs to run all tests in it - Move custom grid yamls to separate directory
non-config files are skipped in 'all' and directory tests
…is most expensive at ~60 core hours, rest are 30 or less.
…t can take the values "python", "cron", or "none" - Change names in argparse section to shorten lines
- Update custom grids for more appropriate physics suites - Speed up Peru 12km case with more nodes and shorter forecast time - Distribute new custom domains to comprehensive and coverage tests
symlink - Unless running in debug mode, call "set_template" with "quiet" flag
for more verbose output - Also add debug flag for set_FV3nml_sfc_climo_filenames.py - Convert most print messages to debug-only for cleaner output - Do not read VERBOSE flag for generate script; this flag should only affect experiments, not initial scripts
…oes not have HPSS access)
…that will only run rocotorun once for each experiment rather than continuously monitoring
20fdb32 to
2b9ac58
Compare
567c5ca to
7567188
Compare
| help='Explicitly set EXPT_BASEDIR for all experiments') | ||
| ap.add_argument('--exec_subdir', type=str, | ||
| help='Explicitly set EXEC_SUBDIR for all experiments') | ||
| ap.add_argument('--use_cron_to_relaunch', action='store_true', |
There was a problem hiding this comment.
@mkavulich Seeing the cron option reminded me of a bug that I've run into with the run_WE2E_tests.py script, which is that when the user's cron table contains a commented line (job) that is exactly identical to the line that this script would otherwise insert into the table, the script doesn't insert the line (so the experiment doesn't get (re)launched). I guess it's because the script doesn't check that the existing line is commented out and so thinks the line is already in the table. Can you run a quick test to see if that is still happening? Thanks.
There was a problem hiding this comment.
@mkavulich Just realized the bug (if it still exists) might be in the script get_crontab_contents.py.
There was a problem hiding this comment.
@gsketefian Just included a fix for that bug as well, feel free to test it out.
There was a problem hiding this comment.
@mkavulich Ok, thanks. I'll trust you that it works!
| @@ -0,0 +1,26 @@ | |||
| metadata: | |||
| description: |- | |||
| This test checks the capability of the workflow to retrieve from NOAA | |||
There was a problem hiding this comment.
@mkavulich Please add in the description that this test is also to check tha the weather model can perform a relatively long forecast.
…if the command already exists but is commented, we still want to add it
|
@mkavulich - The Jenkins tests failed on August 26 due to an issue with Orion and the
I'm fairly certain that a rerun would allow the test to successfully run without issue. Before rerunning, I've been told that the EPIC role account now has rstprod permission. Would you be okay with uncommenting the @mkavulich and @gsketefian - I agree with @mkavulich, it is certainly a good idea to trim/remove the NCO WE2E tests. However, I don't feel that this should be done in this PR. |
|
@mkavulich - It also sounds like the various RDHPCS machines are all using the EPIC role accounts. Would you be able to change the |
|
@MichaelLueken I have made those suggested changes, but they are not appearing here. According to https://www.githubstatus.com/, PRs are currently "degraded" so it seems like there's a problem on Github's end that's preventing the PR from being updated. Hopefully it's resolved soon and you can re-run the tests. |
|
@mkavulich - The changes look good and I have resubmitted the Jenkins tests. Once the retests are complete, this work should be ready to be merged. Thanks! |
|
@mkavulich - The reruns of the Hera Intel tests are still seeing failures in the new The good news is that the tests successfully passed on Gaea and Orion. Additionally, outside of the tests noted above, all other tests have successfully passed. |
|
@MichaelLueken That is so strange, I've run the |
|
@mkavulich - Unfortunately, I'm not aware of the reason for the last failures of the test (though, it is likely due once again to the weird I've also cancelled the Jet testing, since HPSS is down for maintenance as well and the |
|
@MichaelLueken Ahh, I somehow forgot that the Intel tests are the ones run in NCO mode. So then this is another fleeting symptom of #652? |
|
@mkavulich - Yes, I suspect that this is a continuation of what was being encountered with Issue #652. Now, however, instead of mislinking COM directories, it is now leading to these weird NetCDF failures. |
|
I was able to successfully run the Hera Intel tests manually this morning: I had to use rocotorewind/boot on the I'm now waiting to see if the |
|
The The error is: Please comment out the two verification tests until this issue can be sorted. Thanks! |
|
@mkavulich - I have sent a request to the RDHPCS HPSS Helpdesk to ask that the EPIC role account be granted rstprod permissions on HPSS. Hopefully they are able to add this permission quickly, then I can submit the Hera GNU tests to make sure that the tests that require HPSS data are able to successfully pull the data from HPSS. |
|
@mkavulich - Good news! Skylar has added the EPIC role account to the rstprod group on HPSS. Once the current Jet tests finish, then the Hera GNU tests will run. If the |
There was a problem hiding this comment.
Why the file name has .com extension?.. Was the intent to have *.nco file with the list of tests?
There was a problem hiding this comment.
@natalie-perlin - The .com extension is to ensure that all tests in the file run in the community environment. Within this file, there is an nco test, nco_grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16. When this test is launched, the nco environment is overwritten so that it will run in community mode.
This is similar to the coverage.hera.intel.nco file. All of the tests in this file are normally run in community mode, but the .nco extension ensures that the tests use the nco environment instead.
There are no plans to include either a coverage.hera.intel.com or coverage.hera.gnu.nco file.
|
The Jenkins automated
Now moving forward with merging this work. |
DESCRIPTION OF CHANGES:
This represents the final round of WE2E test improvements related to Issue #587, including new tests, additional features, script improvements, and removal of unsupported domains.
Script improvements
run_WE2E_tests.py--use_cron_to_relaunchwith a--launchargument that can take the values "python", "cron", or "none" (the last of which will create the experiments but not run them)monitor_jobs.py--modeflag, which if specified as "advance" will only runrocotorunonce for each experiment, then quit.generate_FV3LAM_wflow.pyandset_FV3nml_sfc_climo_filenames.py--debugflag that will give more verbose output. No longer read "VERBOSE" variable from config.yaml for this script.get_crontab_contents.pyNew tests
custom_gridsdirectory. The new custom domains were chosen to span a variety of locations, terrain types, and dates. Aside fromcustom_ESGgrid_Great_Lakes_snow_8km, these are basic tests that don't use many non-default settings, and so are good candidates for testing new SRW capabilities in the future. See the "Documentation" section for more details.Additional changes
custom_ESGgrid_Great_Lakes_snow_8km(introduced in [develop] Consolidate verification tasks using retrieve_data.py #864) moved from verification to custom_tests, with a symbolic link (test_configs/verification/config.MET_verification_winter_wx.yaml) left behind. Also the ICs/LBCs are now from RAP output, retrieved from HPSS (since the observation data requires HPSS access anyway this is fine)Type of change
--use_cron_to_relaunchargument torun_WE2E_tests.py(gives user a verbose message about replacement flag)TESTS CONDUCTED:
The new script behavior was tested for each of the new flags, ensured that they behaved as expected.
DEPENDENCIES:
This branch currently contains changes from #859 and #864, and so should not be merged until those PRs are approved.Merged!DOCUMENTATION:
Updated command-line arguments were documented within the scripts. Additional documentation for the updates was provided as part of PR #880.
Here are some details/sample plots for the new custom domains:
custom_ESGgrid_Central_Asia_3km: A 6-hour forecast starting 2019070100 over central Asia, including much of the Tibetan Plateau and Tarim Basin; this area was chosen for its extreme topography and climate range, with no ocean points. Uses staged FV3GFS nemsio data.custom_ESGgrid_Great_Lakes_snow_8km: A 6-hour forecast starting 2023021700 over the North American Great Lakes region. Uses RAP netCDF data retrieved from HPSS, and tests a non-standard resolution (8 km). Tests deterministic verification for snow accumulation (as well as other variables), retrieving observations from HPSS (therefore can only be run on Jet and Hera).custom_ESGgrid_IndianOcean_6km: A 12-hour forecast starting 2019061518 over the central Indian Ocean; this area was chosen for its lack of land points and as a southern hemisphere test. Uses staged FV3GFS grib2 data, and tests a non-standard resolution (6 km).custom_ESGgrid_NewZealand_3km: A 6-hour forecast starting 2019061518 over New Zealand; this area was chosen for its extreme topography and mix of land/ocean points, as well as spanning the international date line. Uses staged FV3GFS grib2 data.custom_ESGgrid_Peru_12km: A 12-hour forecast starting 2019061500 over western South America, centered over Peru; this area was chosen for its extreme topography and mix of land/ocean/lake points, as well as spanning the equator. Uses staged FV3GFS grib2 data, and tests a non-standard resolution (12 km).custom_ESGgrid_SF_1p1km: A 6-hour forecast starting 2019061500 over central coastal California, centered over the San Francisco Bay area. Uses staged FV3GFS grib2 data, and tests a non-standard resolution (1.1 km).ISSUE:
CHECKLIST
CONTRIBUTORS:
Thanks to @gsketefian for providing some custom grids for new tests