[develop] Add workflow for radar reflectivity processing.#807
Conversation
- "archive_internal_dir" must be an empty string for top-level archive files - Rename "file_type" argument to "file_fmt" - Rename "external_model" argument to more generic/correct "data_type" - If only some files can not be un-zipped, do not fail. We'll handle this more gracefully later if it needs to be an error - Remove default location logic for now; couldn't get it working - Re-add explicit disk path requirement; may re-visit later - For fatal error, do not imply that script continues - Remove debug prints
"rap_e" files in addition to "rap".
specifically HPSS fails with some kind of logging error. Needs more investigation. This commit removes some un-needed, un-initialized variables, adds modulefiles for the new task on each platform, and adds some unit tests for retrieve_data.py to test new capabilities. The new task entry has been added to parm/wflow/coldstart.yaml for testing, but it should be removed prior to merging this PR since the task is not integrated with the rest of the workflow yet.
… tasks going forward. Clearly the boundary conditions tasks will need to eventually change also.
- Link obs needed for task_process_bufr - Put obs in $DATA/obs rather than $DATA/gsi - Put in logic for getting "rap_e" files at 00 and 12z cycles; I hope we can extract this logic into the config file later with Jinja - Add "task_process_bufr" entry to warmstart.yaml with a dependency on task get_da_obs - Add "CYCLE_TYPE" variable to config_defaults - Start a new config.rrfs.yaml file
- Set machine-specific paths in machine files, this should have been obvious... - Use RRFS_NA_13km domain for RRFS - Re-organize Jet and Hera machine files for section consistency - Fix grammar error in retrieve_data.py
- Move date variables to J-job level - Keep da_obs in same directory for community and nco mode (since process_bufr does the same) - In process_bufr, look for da_obs in staging directory
- Add default directory to Jet machine file...only available there for now - Add task_process_lightning to warmstart workflow
…ve to Hera. Also fix config.rrfs.yaml to better settings for test case.
…to avoid duplicate error messages, search for RRFS data on disk as well as hpss
Removing the CYCLE_TYPE from config_defaults because it is a run-time, cycle-dependent setting, not an experiment level one.
This is similar to what is done for external model files so that we can use the machine file information about which data stores a platform has access to.
Matching new standard outlined in Issue ufs-community#633
…obs_task Had to manually revert some renaming changes in ush/test_retrieve_data.py
MichaelLueken
left a comment
There was a problem hiding this comment.
@christinaholtNOAA Overall, these changes look good. I was able to run the process_obs test on both Hera and Jet and the test successfully ran on both machines.
I have noted that the paths on ush/machine/jet.yaml should use /mnt/lfs4, rather than /scratch2.
I also encountered failures running the fundamental tests on both Hera and Jet. The nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_GFS_v16 test fails on both machines in the make_ics/lbcs step on both machines. Removing the expand_source_paths and going back to source_paths in ush/retrieve_data.py allows the test to run smoothly. It's not clear to me why the modifications you made to expand_source_paths are causing the necessary data not to be pulled.
Co-authored-by: Michael Lueken <63728921+MichaelLueken@users.noreply.github.com>
Co-authored-by: Michael Lueken <63728921+MichaelLueken@users.noreply.github.com>
MichaelLueken
left a comment
There was a problem hiding this comment.
@christinaholtNOAA Thank you very much for finding the issue with the failure of the test so quickly! The fundamental tests were rerun and all successfully passed. As noted previously, the process_obs test was successfully run on Hera and Jet and now that the fundamental tests are passing, I will give my approval to this PR.
|
@MichaelLueken Thanks for bringing my attention to it and rerunning the tests! I opened a separate PR to more directly address the fix, but wanted to also test it with this PR to make sure it wasn't a problem here. |
|
Is anyone else available to provide a 2nd review for this PR? It would be helpful to get it merged so that the cycled DA work could follow it in. |
EdwardSnyder-NOAA
left a comment
There was a problem hiding this comment.
I left a few comments in the get_mrms_file.sh script.
When retrieving the da data from either NSSLMOSAIC or RAP_OBS_BUFR, these locations on Hera and Jet are the current live stream and not a historical location? Is that correct?
| do | ||
| min=$( printf %2.2i $((timelevel+min)) ) | ||
| echo "Looking for data valid:"${YYYY}"-"${MM}"-"${DD}" "${cyc}":"${min} | ||
| nsslfiles=${NSSL}/*${mrms}_00.50_${YYYY}${MM}${DD}-${cyc}${min}??.${OBS_SUFFIX} |
There was a problem hiding this comment.
Would the '00.50' need to be a wildcard? In line 50, it is a wildcard.
There was a problem hiding this comment.
Given the original logic, I believe this is the level that needs to exist before all the others are gathered.
There was a problem hiding this comment.
Ah okay. That makes sense but wasn't sure. Thanks for clarifying and making those changes. Approving.
| for min in ${RADARREFL_MINS[@]} | ||
| do | ||
| min=$( printf %2.2i $((timelevel+min)) ) | ||
| echo "Looking for data valid:"${YYYY}"-"${MM}"-"${DD}" "${cyc}":"${min} |
There was a problem hiding this comment.
Maybe there is a reason behind all these double quotes, but you could just have one set of double quotes around this string.
| # 10 represents a significant number of vertical levels of data | ||
| if [ ${numgrib2} -ge 10 ] && [ ! -e filelist_mrms ]; then | ||
| cp_vrfy ${NSSL}/${nsslfile1} ${output_path} | ||
| ls ${nsslfile1} > ${output_path}/filelist_mrms |
There was a problem hiding this comment.
I believe you need to add ${NSSL}/ in front of ${nsslfile1} in order for this command to work.
| if [ ${numgrib2} -ge 10 ] && [ ! -e filelist_mrms ]; then | ||
| cp_vrfy ${NSSL}/${nsslfile1} ${output_path} | ||
| ls ${nsslfile1} > ${output_path}/filelist_mrms | ||
| echo 'Copying mrms files for ${YYYY}${MM}${DD}-${cyc}${min}' |
There was a problem hiding this comment.
This prints out: Copying mrms files for ${YYYY}${MM}${DD}-${cyc}${min}. Replacing the single quotes with double quotes will allow the variables to be substituted.
| if [ -s $nsslfile ]; then | ||
| echo 'Found '${nsslfile} | ||
| nsslfile1=*${mrms}_*_${YYYY}${MM}${DD}-${cyc}${min}*.${OBS_SUFFIX} | ||
| numgrib2=${#nsslfiles1} |
There was a problem hiding this comment.
I believe this should be nsslfile1
|
@christinaholtNOAA Since potential changes to the |
|
@christinaholtNOAA The Jenkins functional tests failed on Hera and due to issues on Orion, the coverage WE2E tests failed there as well. The coverage WE2E tests were run on Hera and Orion. The tests successfully passed. Once you and @EdwardSnyder-NOAA are happy with the changes in |
|
Thanks @MichaelLueken. I just finished the fundamental tests after the conflict resolution and addressing @EdwardSnyder-NOAA's comments. |
|
@EdwardSnyder-NOAA Please look over @christinaholtNOAA's latest modifications to |
|
@EdwardSnyder-NOAA Thank you very much for looking back over the changes! @christinaholtNOAA I will now proceed with merging this work. |
DESCRIPTION OF CHANGES:
Add MRMS data to the get_da_obs task, and turn on the radar reflectivity processing task in the workflow for data assimilation pre-processing.
This is an addition of a new feature that should not impact any other features (outside recent RRFS-feature additions)
Type of change
TESTS CONDUCTED:
DEPENDENCIES:
none
DOCUMENTATION:
None.
ISSUE:
Issue #782
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR: