Skip to content

[develop] Fix issue 555 (restore default behavior of EXPT_BASEDIR option)#562

Merged
MichaelLueken merged 3 commits into
ufs-community:developfrom
mkavulich:feature/issue_555
Jan 31, 2023
Merged

[develop] Fix issue 555 (restore default behavior of EXPT_BASEDIR option)#562
MichaelLueken merged 3 commits into
ufs-community:developfrom
mkavulich:feature/issue_555

Conversation

@mkavulich
Copy link
Copy Markdown
Collaborator

DESCRIPTION OF CHANGES:

This PR restores the previous behavior of the variable EXPT_BASEDIR (Item 2 is the behavior that was previously broken), which has the following effect on the experiment directory EXPTDIR:

  1. If EXPT_BASEDIR is not set or set to a null value, the default value (${HOMEdir}/../expt_dirs) will be used
  2. If EXPT_BASEDIR is set to a relative path (i.e. the first character is not /), the user-specified path will be appended to the default value ${HOMEdir}/../expt_dirs (for example if the user specifies EXPT_BASEDIR=some/relative/path in their config.yaml, it will be updated to EXPT_BASEDIR=${HOMEdir}/../expt_dirs/some/relative/path in the workflow
  3. If EXPT_BASEDIR is set to an absolute path, that path will be used as entered

After the above logic is applied, EXPTDIR will be created by joining the paths EXPT_BASEDIR and EXPT_SUBDIR as usual.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

TESTS CONDUCTED:

Ran fundamental test suite on Hera, confirmed that the old, expected behavior for the expt_basedir argument has been restored. Ran python unit tests, only UFS_plot_domains.py failed which is a known pre-existing error.

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

None

DOCUMENTATION:

While this behavior had been documented in comments and usage doc string for run_WE2E_tests.sh, it had not been included in configure_defaults.yaml or the Users Guide; I updated the documentation in those locations.

ISSUE:

Fixes issue mentioned in #555

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing tests pass with my changes

@venitahagerty
Copy link
Copy Markdown
Collaborator

Machine: jet
Compiler: intel
Job: WE
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1217442740/20230126052012/ufs-srweather-app
Build was Successful
Rocoto jobs started
Long term tracking will be done on 10 experiments
If test failed, please make changes and add the following label back:
ci-jet-intel-WE

@MichaelLueken MichaelLueken changed the title Feature/issue 555 [develop] Feature/issue 555 Jan 26, 2023
@mkavulich mkavulich changed the title [develop] Feature/issue 555 [develop] Fix issue 555 (restore default behavior of EXPT_BASEDIR option) Jan 26, 2023
Copy link
Copy Markdown
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkavulich These changes look good to me! Approving now.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label Jan 26, 2023
Copy link
Copy Markdown
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkavulich All of the WE2E tests are failing in Jenkins. Attempting to manually run the WE2E tests on Jet was encountering issues as well. I'm seeing:

GLOBAL_VAR_DEFNS_FP = 'var_defns.sh'

rather than:

GLOBAL_VAR_DEFNS_FP = '/mnt/lfs4/HFIP/hfv3gfs/Michael.Lueken/exp_dirs/grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2/var_defns.sh'

while generating the experiment.

The following files are present in ush/, but not in the expt_dirs - ush/data_table, ush/fd_nems.yaml, ush/field_table, ush/fix_am, ush/input.nml, ush/nems.configure, ush/suite_FV3_GFS_v15p2.xml, and ush/var_defns.sh. Thus, I'm only seeing FV3LAM_wflow.xml, config.yaml, crontab.bak.2023-01-27_18:46:39, launch_FV3LAM_wflow.sh, log.generate_FV3LAM_wflow, and log.launch_FV3LAM_wflow in the experiment directory.

Can you think of what might be happening to keep var_defns.sh from being brought in?

@mkavulich mkavulich added the DO_NOT_MERGE Ensure that a PR isn't merged label Jan 27, 2023
@mkavulich
Copy link
Copy Markdown
Collaborator Author

@MichaelLueken I'm not sure what happened here, I must have run my tests on the incorrect code, because I am also seeing these same failures now. Clearly the code doesn't work as-is, I'll need to investigate what's going on.

@mkavulich mkavulich added ci-hera-intel-WE Kicks off automated workflow test on hera with intel ci-jet-intel-WE Kicks off automated workflow test on jet with intel and removed DO_NOT_MERGE Ensure that a PR isn't merged labels Jan 30, 2023
@venitahagerty venitahagerty removed ci-jet-intel-WE Kicks off automated workflow test on jet with intel ci-hera-intel-WE Kicks off automated workflow test on hera with intel labels Jan 30, 2023
@venitahagerty
Copy link
Copy Markdown
Collaborator

Machine: hera
Compiler: intel
Job: WE
Repo location: /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app

  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/chgres_cube does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/emcsfc_ice_blend does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/emcsfc_snow2mdl does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/filter_topo does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/fregrid does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/fvcom_to_FV3 does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/global_cycle does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/global_equiv_resol does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/inland does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/lakefrac does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/make_hgrid does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/make_solo_mosaic does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/upp.x does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/orog does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/orog_gsl does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/regional_esg_grid does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/sfc_climo_gen does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/shave does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/ufs_model does NOT exist'
  • echo 'FAIL: intel executable file /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1217442740/20230130175017/ufs-srweather-app/tests/../install_intel/exec/vcoord_gen does NOT exist'
  • echo 'BUILD(S) FAILED'
  • msg=FAIL
  • echo FAIL
    Build failed
    If test failed, please make changes and add the following label back:
    ci-hera-intel-WE

@venitahagerty
Copy link
Copy Markdown
Collaborator

venitahagerty commented Jan 30, 2023

Machine: jet
Compiler: intel
Job: WE
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1217442740/20230130175016/ufs-srweather-app
Build was Successful
Rocoto jobs started
Long term tracking will be done on 10 experiments
If test failed, please make changes and add the following label back:
ci-jet-intel-WE
Experiment Succeeded on jet: custom_ESGgrid
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v15p2
Experiment Succeeded on jet: custom_GFDLgrid
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2
Experiment Succeeded on jet: nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR
Experiment Succeeded on jet: specify_DT_ATMOS_LAYOUT_XY_BLOCKSIZE
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_RAP_suite_HRRR
Experiment Succeeded on jet: specify_DOT_OR_USCORE
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Experiment Succeeded on jet: specify_RESTART_INTERVAL
All experiments completed

@mkavulich
Copy link
Copy Markdown
Collaborator Author

@MichaelLueken I have applied the correct fix for this issue. I discussed with @venitahagerty offline and it looks like the Hera test is failing due to running out of disk space in the test location; I ran the Hera fundamental tests successfully on my own.

@MichaelLueken
Copy link
Copy Markdown
Collaborator

@mkavulich resubmission of the Jenkins tests showed that they all passed, with the exception of Orion. The community_ensemble_008mems test failed due to one of the post jobs exceeding the walltime. I've resubmitted the Orion tests one more time. Once it passes, I will approve and merge these changes.

Copy link
Copy Markdown
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkavulich Following the modification you pushed on Friday (and a resubmission of the Orion tests this morning), all Jenkins tests are now passing successfully. Reapproving these changes.

@MichaelLueken MichaelLueken merged commit 25418e8 into ufs-community:develop Jan 31, 2023
gspetro-NOAA pushed a commit to gspetro-NOAA/ufs-srweather-app that referenced this pull request Feb 9, 2023
@mkavulich mkavulich deleted the feature/issue_555 branch July 19, 2023 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

generate_FV3LAM_workflow.py no longer deals with relative paths correctly for "EXPT_BASEDIR"

4 participants