[develop] Integrate UW CLI tool for templater and remove external dependency. #994

Merged
MichaelLueken merged 7 commits into ufs-community:develop from christinaholtNOAA:update_templater on Jan 11, 2024

@christinaholtNOAA

Collaborator

DESCRIPTION OF CHANGES:

The workflow-tools package was initially integrated with SRW as an external repository under ush/python_utils. Since then, we have packaged the code as a conda package and it is now installed automatically on most platforms (WCOSS excluded, but with workarounds in place).

In this PR, I am removing the prior integration and relying on the UW command line tools available from the conda package. For now, this involves calling the command line tools in a subprocess from Python code. We have an API under development that will replace this in the near future, so the Python-based scripts you see here are unlikely to be in their final form.
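
As an illustration of the subprocess approach described above (a minimal sketch, not the code in this PR): the helper name `run_cli` is hypothetical, and no `uw template render` options are spelled out because they are not listed in this description. A caller would append whichever options the installed version documents.

```python
import subprocess
import sys


def run_cli(args):
    """Run a command-line tool and return its stdout, raising on failure.

    `run_cli` is a hypothetical helper name used only for this sketch.
    """
    result = subprocess.run(
        args,
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout and stderr instead of inheriting
        text=True,            # decode output as str rather than bytes
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs without the conda package installed.
    # In SRW, the list would instead begin with "uw", "template", "render".
    print(run_cli([sys.executable, "-c", "print('rendered')"]).strip())
```

Passing the command as a list rather than a single shell string avoids quoting problems with file paths, and `check=True` makes a failed render stop the calling script instead of passing silently.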

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • hercules.intel
  • cheyenne.intel
  • cheyenne.gnu
  • derecho.intel
  • gaea.intel
  • gaeac5.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

None

DOCUMENTATION:

None. UW Documentation is currently being updated to reflect changes in CLI tools.

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

set -e -u

export PYTHONPATH=${workspace}/ush/python_utils/workflow-tools:${workspace}/ush/python_utils/workflow-tools/src

Collaborator Author

PYTHONPATH shenanigans are no longer needed. In fact, they cause problems when trying to call the command line tools from the conda environment.
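
For Python code that launches a command line tool and cannot rely on the caller's shell being clean, one defensive option is to drop PYTHONPATH from the child's environment. This is a sketch under that assumption, not something this PR does (the PR simply removes the export); `env_without_pythonpath` and the example path are hypothetical.

```python
import os
import subprocess
import sys


def env_without_pythonpath(base=None):
    """Return a copy of the environment with PYTHONPATH removed.

    `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env.pop("PYTHONPATH", None)  # no error if the variable was never set
    return env


if __name__ == "__main__":
    # Simulate a shell that still exports an old PYTHONPATH (hypothetical path).
    polluted = dict(os.environ, PYTHONPATH="/hypothetical/old/path")
    child = [sys.executable, "-c", "import os; print(os.environ.get('PYTHONPATH'))"]
    result = subprocess.run(
        child,
        env=env_without_pythonpath(polluted),
        check=True,
        capture_output=True,
        text=True,
    )
    print(result.stdout.strip())
```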

python3 $USHdir/python_utils/workflow-tools/scripts/templater.py \
-c "${tmpfile}" \

uw template render \

Collaborator Author

This change was made in several of the ex-scripts. This is how the command line tool can be called to render a template instead of relying on a script on disk.

Comment thread ush/create_aqm_rc_file.py

@mkavulich mkavulich left a comment

Collaborator

Some comments and questions

Comment thread .pylintrc
Comment thread scripts/exregional_run_met_gridstat_or_pointstat_vx.sh Outdated
Comment thread scripts/exregional_run_met_genensprod_or_ensemblestat.sh Outdated
Comment thread ush/create_aqm_rc_file.py

@christinaholtNOAA christinaholtNOAA left a comment

Collaborator Author

@mkavulich Thanks for your review. Great questions! Let me know if I can provide more information.

Comment thread .pylintrc
Comment thread ush/create_aqm_rc_file.py

@mkavulich mkavulich left a comment

Collaborator

Thanks for the comments and for making the requested changes.

@MichaelLueken MichaelLueken left a comment

Collaborator

@christinaholtNOAA -

These changes look good to me! I was also able to run the coverage tests on Hercules and they all passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_GFDLgrid__GFDLgrid_USE_NUM_CELLS_IN_FILENAMES_eq_FALSE      COMPLETE               7.55
grid_CONUS_25km_GFDLgrid_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16      COMPLETE              10.82
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta      COMPLETE              27.64
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              17.34
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR             COMPLETE              24.36
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP              COMPLETE              49.93
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   COMPLETE              12.69
grid_RRFS_NA_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP                 COMPLETE              64.96
grid_SUBCONUS_Ind_3km_ics_NAM_lbcs_NAM_suite_GFS_v16               COMPLETE              27.81
MET_verification_only_vx                                           COMPLETE               0.38
specify_EXTRN_MDL_SYSBASEDIR_ICS_LBCS                              COMPLETE               7.82
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             251.30

Approving now.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests label (Run the coverage set of SRW end-to-end tests) on Jan 8, 2024
@MichaelLueken

Collaborator

The coverage tests were manually run on Derecho and all successfully passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_IndianOcean_6km                                     COMPLETE              23.35
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot     COMPLETE              37.49
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16                COMPLETE              44.89
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR           COMPLETE              29.37
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta    COMPLETE              17.90
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR                COMPLETE              40.68
nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_  COMPLETE              24.75
pregen_grid_orog_sfc_climo                                         COMPLETE              14.85
specify_template_filenames                                         COMPLETE              15.07
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             248.35

@christinaholtNOAA

Collaborator Author

Just out of curiosity, what's the holdup on the Jenkins tests? I know Hera was down yesterday, but it seems bigger than that, given they were kicked off Monday.

@MichaelLueken

Collaborator

@christinaholtNOAA - The get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_netcdf_2022060112_48h test is failing on Jet in the make_ics and make_lbcs tasks, with "terminate called after throwing an instance of 'std::bad_alloc'" error messages.

While the tests were able to complete after several repeated rewinds/boots, the run_post tasks all failed due to missing dynf000.nc and phyf000.nc files. I have relaunched the tests on Jet. Jet only has two core hours for the month, so we are regrettably running in windfall on the machine.

@MichaelLueken

Collaborator

@christinaholtNOAA - Just wanted to give you a heads-up: the current WE2E test runs on Jet look like they are passing this time. There are no signs of DEAD in the WE2E_tests_20240110191405.yaml file, and all of the tests have made it to the run_fcst task. I should be able to merge this PR later today or in the morning. The run directory, if you would like to check, is /lfs1/NAGAPE/epic/role.epic/jenkins/workspace/fs-srweather-app_pipeline_PR-994/jet/

@MichaelLueken

Collaborator

The rerun of the WE2E tests on Jet has passed successfully:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
community                                                          COMPLETE              17.69
custom_ESGgrid                                                     COMPLETE              18.51
custom_ESGgrid_Great_Lakes_snow_8km                                COMPLETE              12.22
custom_GFDLgrid                                                    COMPLETE               9.86
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_nemsio_2021032018         COMPLETE               9.36
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_netcdf_2022060112_48h     COMPLETE              54.47
get_from_HPSS_ics_RAP_lbcs_RAP                                     COMPLETE              15.46
grid_RRFS_AK_3km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR                 COMPLETE             223.49
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot     COMPLETE              41.33
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2        COMPLETE               8.90
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta       COMPLETE             511.77
nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR       COMPLETE              10.72
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             933.78

@MichaelLueken MichaelLueken merged commit cd972f2 into ufs-community:develop Jan 11, 2024
MichaelLueken added a commit to MichaelLueken/ufs-srweather-app that referenced this pull request Jan 24, 2024
…ing PR ufs-community#994:

* run_vx.local.lua files were updated to load the srw_app conda environment for verification tasks.
* wflow_jet.lua file was updated to remove unload("python") and load("set_pythonpath").
* verify_pre.yaml was updated to add native: '{% if platform.get("SCHED_NATIVE_CMD_HPSS") %}{{ platform.SCHED_NATIVE_CMD_HPSS }}{% else %}{{ platform.SCHED_NATIVE_CMD}}{% endif %}' to the get_verification_obs tasks.
* comprehensive.derecho was updated to add the MET_ensemble_verification_winter_wx verification test.
* jet.yaml was updated to add SCHED_NATIVE_CMD_HPSS: -n 1 --export=NONE to allow the service partition to work.
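
The `native:` expression added to verify_pre.yaml prefers `SCHED_NATIVE_CMD_HPSS` when a platform defines it and otherwise falls back to `SCHED_NATIVE_CMD`. The same selection can be written as plain Python for illustration only. The function name and the `--placeholder` value are hypothetical, and returning an empty string when neither key is set is an assumption of this sketch; the Jet value is the one given in the commit message above.

```python
def native_cmd_for_hpss(platform):
    """Mirror the template's if/else: the HPSS-specific value wins when set and non-empty."""
    if platform.get("SCHED_NATIVE_CMD_HPSS"):
        return platform["SCHED_NATIVE_CMD_HPSS"]
    # Assumption of this sketch: a missing SCHED_NATIVE_CMD yields an empty string.
    return platform.get("SCHED_NATIVE_CMD", "")


if __name__ == "__main__":
    # "--placeholder" is a made-up stand-in for a platform's general native flags.
    jet = {"SCHED_NATIVE_CMD": "--placeholder", "SCHED_NATIVE_CMD_HPSS": "-n 1 --export=NONE"}
    other = {"SCHED_NATIVE_CMD": "--placeholder"}
    print(native_cmd_for_hpss(jet))
    print(native_cmd_for_hpss(other))
```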
@christinaholtNOAA christinaholtNOAA deleted the update_templater branch July 2, 2024 16:49