[develop] Enable deterministic verification to be run from staged forecast files#566
Conversation
…orming the name of the Lua files.
…isting MET vx tests to it; add new vx test that only calls the vx tasks.
… in dependencies only if it is activated in the workflow.
…UT_BASEDIR. Details below: 1) Rename MODEL to VX_FCST_MODEL_NAME to clarify that this is the name of the forecast model in the context of verification. 2) Create the new variable VX_FCST_INPUT_BASEDIR to allow the user to specify a directory in which to look for staged forecast output (instead of running a forecast). 3) Place both VX_FCST_MODEL_NAME and VX_FCST_INPUT_BASEDIR in a new mapping named "verification" in config_defaults.yaml. Other vx-related workflow variables will be placed under "verification" in later PRs.
…ll tasks so that new external model data is not needed.
MichaelLueken
left a comment
There was a problem hiding this comment.
@gsketefian Overall, the changes in this PR look good! I have provided some minor feedback for some of the changes. I was also able to run the new MET_verification_only_vx test and it successfully ran.
| # the GET_OBS_... tasks and instead specify the obs staging directories. | ||
| # | ||
| RUN_TASK_GET_OBS_CCPA: false | ||
| CCPA_OBS_DIR: '/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/DTC_ensemble_task/staged/obs/ccpa/proc' |
There was a problem hiding this comment.
Are there plans to make the staged obs available on other machines? Or is it planned that this test will only be able to be run on Hera?
There was a problem hiding this comment.
@MichaelLueken I was working under the impression that yes, the obs (and staged forecasts) need to be made available on other machines. I think at least Jet and Cheyenne but likely also Orion and NOAAcloud. This work is partly for the development of a community framework for testing RRFS prototypes that needs to be available to the general user, so I would think Cheyenne would be especially important. I'll also let @JeffBeck-NOAA, @michelleharrold, @willmayfield, and @mkavulich chime in.
Btw, in my next vx PR, I'm going to need to stage forecast data for an ensemble. It currently has 9 members, and I don't think it's a lot of data, but we can reduce the number of members if that becomes an issue.
There was a problem hiding this comment.
Thank you very much for the explanation, @gsketefian. Shall I add this as a topic for an upcoming SRW App CM meeting, so that AUS can assist with getting this data to the NOAAcloud and other machines?
There was a problem hiding this comment.
@MichaelLueken Yes, good idea! I don't know what their procedure is, but if it's a significant effort, it might be worthwhile to first get all the data on Hera (including the ensemble vx data) and then do the transfer to other platforms in one go. That would mean I first have to get a PR into develop for ensemble vx. We can discuss on Thursday.
There was a problem hiding this comment.
Seems like it would be easy enough to include obs data in the static data that's already staged. There are already observations staged under UFS_SRW_App/develop/obs_data/ on all platforms.
There was a problem hiding this comment.
@mkavulich Yeah as you say there's already a place for obs in the data directories. We need to decide on where to put staged forecast data from the SRW itself (not from other external models), including ensemble output (which will be needed for a future PR). I'm open to suggestions.
There was a problem hiding this comment.
I am curious why staged forecast data would be needed. Why can't we have a test that runs a forecast and then runs VX on that forecast?
There was a problem hiding this comment.
@MichaelLueken Sorry just saw this. The reason is that someone else may have run the forecasts beforehand, and the user needs to only run vx on that output. For example, for the DTC Ensemble Design task that I'm part of, the RRFS development group at GSL already ran a bunch of ensemble forecasts, and we want to only run vx on it. In addition, one may want to run vx on forecasts not generated by the SRW, e.g. there's another (I think non-DTC) task that is using the version of the SRW App I've put together in my fork (that I'm now bringing into the develop branch) to run vx on HREF output.
MichaelLueken
left a comment
There was a problem hiding this comment.
@gsketefian Thank you very much for addressing my concerns! I will go ahead and approve these changes now and submit the Jenkins tests.
|
@gsketefian If possible, could you try running the
Interestingly, the Hera Intel Are you aware of why |
|
@gsketefian I'm also seeing similar errors:
On Cheyenne with GNU compilers while running the |
|
@MichaelLueken Ok, I'll give it a shot. Some questions:
Thanks. |
Yes, the develop branch was succeeding with the gnu compiler. The Jenkins tests have all passed up to this point.
Correct, the only difference is adding --compiler=gnu (-c=gnu) to the ./devbuild.sh call.
If you add the test to
Unfortunately, I made changes to my working copy that has made the tests pass (I added back in the changes that you had removed in For my manual tests that currently pass, you can see -
Please let me know if there is any additional information you need. Thanks! |
|
@MichaelLueken Thanks for the answers. I will add Related question: |
…WE2E tests that gets run on Jenkins for PR testing purposes.
@gsketefian Yes, @danielabdi-noaa added logic to the |
@MichaelLueken Ok, thanks, I hadn't noticed that new code. |
|
@MichaelLueken I think I found the problem. In modifying the task module file Turns out I should have kept this line from the original file: Now the vx tasks can find that
Once I verify that they are successful, I'll push my latest changes. Question: |
|
@gsketefian I think the issue with GNU is that metplus was likely built on the various machines using Intel compilers, so I think this is an issue (at least currently) only for Hera and Cheyenne. As for checking the "Logging error" messages, I can certainly check the log files for those. Having said that, |
@MichaelLueken Yes, good point about METplus being build only with intel, so we have to keep specifying the path to the intel library in the task module file. Since the vx tests are currently only run on Hera, if it's ok with you, to simplify/modularize the work, I'd like to get all the vx changes working on Hera and worry about porting to other platforms in a separate set of PRs (since that will likely require installation or updating of MET/METplus and thus admin help). |
|
@gsketefian That sounds good to me. I think it would be fine to add: Once your tests on Hera with Intel and GNU compilers are complete and you have pushed your changes, I will resubmit the Jenkins Hera tests to make sure that those pass as well. |
… running with GNU compiler, the intel libraries are still needed because MET/METplus is built with intel only).
|
@MichaelLueken I just made the change to the Hera module file for vx tasks. All 3 vx tasks were successful (on Hera) with both intel and gnu. So ready for the Jenkins tests now. Thanks. |
|
Relaunching the Jenkins test on Hera, it looks like all of the Hera GNU tests have failed with the following: The experiment directories for the Hera GNU tests can be found: It is unclear to me why |
|
@MichaelLueken Thanks for looking into it. Hopefully the |
mkavulich
left a comment
There was a problem hiding this comment.
Looks good to me, just a couple requested changes.
| <metataskdep metatask="&RUN_POST_TN;{{ uscore_ensmem_name }}"/> | ||
| {%- endif %} | ||
| {%- endif %} | ||
| </and> |
There was a problem hiding this comment.
It seems like this logic can be simplified to the following:
<and>
{#- Redundant dependency to simplify jinja code. #}
<streq><left>TRUE</left><right>TRUE</right></streq>
{%- if run_task_get_obs_ccpa %}
<taskdep task="&GET_OBS_CCPA_TN;"/>
{%- endif %}
{%- if write_dopost %}
<taskdep task="&RUN_FCST_TN;{{ uscore_ensmem_name }}"/>
{%- elif run_task_run_post %}
<metataskdep metatask="&RUN_POST_TN;{{ uscore_ensmem_name }}"/>
{%- endif %}
</and>
The nested "and"s are superfluous, and if run_task_get_obs_ccpa does not require an "else" statement.
The same logic should apply to the subsequent changes for run_task_get_obs_mrms and run_task_get_obs_ndas as well.
There was a problem hiding this comment.
@mkavulich Yes I noticed that too but had decided to make only the minimal change necessary (in terms of when you do a diff) since all this will get rewritten anyway pretty soon. I'll put in your suggestion and rerun.
There was a problem hiding this comment.
@mkavulich I put in the new simpler logic and reran the vx tests (with intel compiler only) and all 3 succeeded. The other (fundamental) tests aren't affected by this, so I didn't rerun them. I'll let the Jenkins testing handle those.
| # the GET_OBS_... tasks and instead specify the obs staging directories. | ||
| # | ||
| RUN_TASK_GET_OBS_CCPA: false | ||
| CCPA_OBS_DIR: '/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/DTC_ensemble_task/staged/obs/ccpa/proc' |
There was a problem hiding this comment.
Seems like it would be easy enough to include obs data in the static data that's already staged. There are already observations staged under UFS_SRW_App/develop/obs_data/ on all platforms.
| #---------------------------- | ||
| # verification parameters | ||
| # | ||
| # VX_FCST_MODEL_NAME: |
There was a problem hiding this comment.
Glad to see a more descriptive variable name here 👍
mkavulich
left a comment
There was a problem hiding this comment.
Thanks for the changes, looks good
PR #566 changed the variable "MODEL" to a more descriptive name, but failed to make this change in config.community.yaml. The unit tests for generate_FV3LAM_wflow.py make use of this file as an input config.yaml, so they are now failing due to this incorrect variable name. This wasn't caught because prior to #558 the unit tests were broken for a different reason. This change simply makes the appropriate rename, which should fix the failing unit test. Also created an f-string that was missed in a setup.py error message.
|
@MichaelLueken Thanks for merging this PR. One thing that still needs to be done is to move the staged forecast data that's in my directory (on Hera) to the official SRW data directory. I don't think we decided an exact location within where to put it. We can do that (along with @mkavulich) after Hera is back up this afternoon or tomorrow. For other platforms, I'd like to wait until a couple more of the vx PRs are in since they will have more data to stage. Thanks. |
DESCRIPTION OF CHANGES:
This PR enables running of only the SRW App's deterministic verification (vx) tasks on staged forecast files from previous runs of the App. It partially resolves Issue #565 (it resolves the issue for deterministic vx but not ensemble vx).
Specific changes:
MODELtoVX_FCST_MODEL_NAMEto clarify that this is the name of the forecast model in the context of verification (and which will be used in the vx output files). This requires updates to most (all?) of the METplus configuration files and the verification ex-scripts.VX_FCST_INPUT_BASEDIRto allow the user to specify a directory in which to look for staged forecast output (instead of running a forecast).FV3LAM_wflow.xml) to make dependencies of vx tasks on post-processing tasks appear only when the post tasks are enabled.verificationin which to group all vx tests (since more vx tasks will be coming in future PRs). Move the two existing testsMET_verificationandMET_ensemble_verificationfromwflow_featurestoverification, and add a new test namedMET_verification_only_vxto test the capability that this PR introduces.Note: The new WE2E test
MET_verification_only_vxrequires new data, specifically post-processed forecast output from the SRW App. This data needs to be staged on each platform; currently, it is located in a personal directory on Hera.Type of change
TESTS CONDUCTED:
All tests were run on hera.intel only.
The following fundamental WE2E tests (not involving vx) were run:
In addition, the following WE2E verification tests were run:
All tests were successful.
DOCUMENTATION:
This PR does require some documentation changes, but I would prefer to do that after also merging a follow-up PR for enabling vx from staged files for ensemble verification.
ISSUE:
Partially fixes Issue #565. It enables deterministic verification from staged forecast files but not yet ensemble verification.
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@michelleharrold, @willmayfield, @JeffBeck-NOAA, @mkavulich