Bug fixes in machine files #692
1) Add a space between "print_info_msg" or "print_err_msg_exit" and the character that follows it. 2) Initialize the "location" variable to a null string. 3) Change variable FV3GFS_FILE_FMT_ICS to FV3GFS_FILE_FMT_LBCS at appropriate locations. 4) In stampede.sh, change variable name "SYSBASEDIR_ICS" to "EXTRN_MDL_SYSBASEDIR_ICS".
1) Remove printing out of informational messages from the "file_location" function in the machine files because this causes the EXTRN_MDL_SYSBASEDIR_ICS|LBCS variables to wrongly get set to these messages, and that causes scripts downstream to crash. 2) Remove calls to print_err_msg_exit in two of the machine files because we do not want the calling scripts to exit; they can continue by trying HPSS, for example.
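The failure mode in the second commit message can be reproduced in a few lines. This is a minimal sketch, assuming the machine files capture the function's output with command substitution; the `print_info_msg` stub, the two `file_location_*` function names, and the message text are hypothetical stand-ins, not the repository's code.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the real print_info_msg; it writes to stdout.
print_info_msg() { printf '%s\n' "$1"; }

# Buggy pattern: the informational message goes to stdout, so the
# caller's $(...) captures it as if it were the directory path.
file_location_noisy() {
  local location=""
  print_info_msg "External model file location is not set for this machine."
  echo "${location}"
}

# Fixed pattern: only the (possibly empty) location reaches stdout.
file_location_quiet() {
  local location=""
  echo "${location}"
}

noisy_result=$(file_location_noisy)
quiet_result=$(file_location_quiet)

printf 'noisy: [%s]\n' "${noisy_result}"
printf 'quiet: [%s]\n' "${quiet_result}"
```

With the noisy variant the captured value is the message itself, spaces included; with the quiet variant it is a null string that downstream scripts can test for.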
| @@ -71,6 +68,6 @@ NDAS_OBS_DIR="/glade/p/ral/jntp/UFS_SRW_app/develop/obs_data/ndas/proc" | |||
| MET_BIN_EXEC="bin" | |||
Does it matter that RUN_CMD_SERIAL and MET_BIN_EXEC use double quotes, and the others use single quotes? Are the single quotes used for commands that are "expandable"?
If you want to use templated variables (as we do for RUN_CMD_UTILS since they contain references to variables like nprocs that are not yet defined), then use single quotes. For variables whose definitions don't include references to other variables, either single or double quotes work.
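The quoting rule in this reply can be illustrated as follows. This is a sketch only: the value assigned to `RUN_CMD_UTILS` is a hypothetical example, and the use of `eval echo` to expand the template later is an assumption about how a calling script might do it, not a statement of what the workflow does.

```shell
#!/usr/bin/env bash

# Single quotes: ${nprocs} is stored literally and can be expanded later.
# (Hypothetical value, for illustration only.)
RUN_CMD_UTILS='mpirun -np ${nprocs}'

# No references to other variables, so either quoting style works.
MET_BIN_EXEC="bin"

# Double quotes expand immediately; nprocs is not defined yet,
# so the reference silently becomes an empty string.
unset nprocs
RUN_CMD_EARLY="mpirun -np ${nprocs}"

# Later, once nprocs is known, expand the single-quoted template.
# Assumption: eval is used here only to demonstrate deferred expansion.
nprocs=4
RUN_CMD_EXPANDED=$(eval echo "${RUN_CMD_UTILS}")

printf 'template: [%s]\n' "${RUN_CMD_UTILS}"
printf 'early:    [%s]\n' "${RUN_CMD_EARLY}"
printf 'expanded: [%s]\n' "${RUN_CMD_EXPANDED}"
```

The double-quoted assignment loses the reference to `nprocs` at definition time, which is why templated variables need single quotes.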
| # Test Data Locations | ||
| TEST_PREGEN_BASEDIR=/glade/p/ral/jntp/UFS_CAM/FV3LAM_pregen | ||
| TEST_COMINgfs=/glade/scratch/ketefian/NCO_dirs/COMGFS | ||
| TEST_EXTRN_MDL_SOURCE_BASEDIR=/glade/p/ral/jntp/UFS_SRW_app/staged_extrn_mdl_files |
These paths are in double quotes, but the "location" path above is in single quotes. Is that OK?
…to bugfix/cleanup_machine_files
| location="" | ||
| case ${external_model} in | ||
| "*") |
@EdwardSnyder-NOAA I'm wondering how you were able to run a successful WE2E test with this print_info_msg call since this causes EXTRN_MDL_SYSBASEDIR_ICS|LBCS to be set to the message itself, and that leads to failures downstream in the exregional_... scripts -- at least on the on-prem machines. I wanted to remove this case statement since it doesn't have any stanzas for which location is set to something (just like orion.sh above). But wanted to ask you first.
It is fine to remove that case statement. We haven't run the WE2E tests on the Singularity container yet.
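For reference, a `file_location` function along the lines discussed in this thread might look like the sketch below. Everything specific in it is an assumption made for illustration: the function signature, the single `FV3GFS` stanza, the `/hypothetical/...` path, the format values, and the `rap_basedir` variable are not taken from the machine files.

```shell
#!/usr/bin/env bash

# Hypothetical sketch of a machine-file helper; paths are placeholders.
file_location() {
  local external_model="$1"
  local external_file_fmt="$2"
  local location=""   # initialized so an unmatched model yields a null string

  case "${external_model}" in
    "FV3GFS")
      location="/hypothetical/basedir/fv3gfs/${external_file_fmt}"
      ;;
    # No "*" stanza that prints a message: unmatched models leave location empty.
  esac

  echo "${location}"
}

# Illustrative format values only.
FV3GFS_FILE_FMT_ICS="nemsio"
FV3GFS_FILE_FMT_LBCS="grib2"

EXTRN_MDL_SYSBASEDIR_ICS=$(file_location "FV3GFS" "${FV3GFS_FILE_FMT_ICS}")
# The LBCS directory is built from the LBCS format variable, not the ICS one.
EXTRN_MDL_SYSBASEDIR_LBCS=$(file_location "FV3GFS" "${FV3GFS_FILE_FMT_LBCS}")

# A model with no stanza returns a null string instead of exiting.
rap_basedir=$(file_location "RAP" "")
if [ -z "${rap_basedir}" ]; then
  echo "No system directory known for RAP; the caller can try another source, for example HPSS."
fi
```

Because the function never exits and never prints anything except the location, the calling script decides what an empty value means.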
…e print statement, and we don't want the print statement.
@JeffBeck-NOAA @christinaholtNOAA Thanks for the reviews. Merging now.
DESCRIPTION OF CHANGES:
Cleaning up bugs in the machine files. The first bug prompted this PR, and the rest were found subsequently. The bugs (and their fixes) are as follows:
1) A space is missing after the `print_info_msg` and `print_err_msg_exit` function calls in the `file_location` functions. Inserting a space gets past this bug, but subsequent issues were found as described below.
   a) For machine files that call the `print_info_msg` function in `file_location` (`cheyenne.sh`, `hera.sh`, `jet.sh`, and `orion.sh`): Fixing this bug leads to other failures because when the "*" stanza is encountered in the `file_location` function, the `EXTRN_MDL_SYSBASEDIR_ICS|LBCS` variable gets set to the message that `file_location` returns. Since that message contains spaces, it leads to other failures in downstream scripts (the ex-scripts). Simply removing the printing of the message (thus causing `EXTRN_MDL_SYSBASEDIR_ICS|LBCS` to be set to a null string) fixes the failures, so this was the fix implemented. If desired, a message for an empty value of `EXTRN_MDL_SYSBASEDIR_ICS|LBCS` can be placed in another script (where those variables are used).
   b) For machine files that use `print_err_msg_exit` in `file_location` (`stampede.sh` and `wcoss_dell_p3.sh`): These should not exit if the file location is not available since the experiment can still complete successfully. So just removing the `print_err_msg_exit` call should work (and make the behavior of these machine files consistent with the set above).
2) In all the machine files, the variable `FV3GFS_FILE_FMT_ICS` should be changed to `FV3GFS_FILE_FMT_LBCS` in the definition of `EXTRN_MDL_SYSBASEDIR_LBCS`. This was fixed in all the files.
3) In `stampede.sh`, a variable named `SYSBASEDIR_ICS` is defined. This is a typo. Modify to `EXTRN_MDL_SYSBASEDIR_ICS`.
TESTS CONDUCTED:
Ran the WE2E test `grid_RRFS_CONUS_25km_ics_HRRR_lbcs_RAP_suite_GSD_SAR` on:
The UPP task failures are new and are being experienced by other PRs as well (e.g. #689). The original issue with the machine files seems resolved.
CONTRIBUTORS (optional):
@JeffBeck-NOAA encountered and reported the original error.