
[develop] Add informative error messages to all err_exit calls #4793

Merged
DavidHuber-NOAA merged 15 commits into NOAA-EMC:develop from DavidHuber-NOAA:feature/err_exit_messages on Apr 22, 2026

Conversation

@DavidHuber-NOAA
Contributor

@DavidHuber-NOAA DavidHuber-NOAA commented Apr 16, 2026

Description

This adds informative error messages to all err_exit calls. This was flagged as an issue for GCAFS and is being addressed across the board. It also removes err_exit calls from ush/ scripts in favor of exit calls.

Refs #4722
Resolves #4811
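
As a rough illustration of the pattern described above, the sketch below shows a job-level call that passes a reason to err_exit, and a ush-style helper that reports its own error and uses a plain exit. The err_exit function here is only a hypothetical stand-in so the example is self-contained; the production utility may behave differently. The names ush_check_input and run_job are also hypothetical and do not come from this PR.

```shell
# Hypothetical stand-in for the production err_exit utility, defined here
# only so the sketch runs on its own. It prints the reason and aborts.
err_exit() {
  echo "FATAL ERROR: ${1:-no reason given}" >&2
  exit 1
}

# ush-style helper (hypothetical): reports the problem itself and uses a
# plain exit with a nonzero status instead of calling err_exit.
ush_check_input() {
  if [ ! -s "$1" ]; then
    echo "ERROR: input file '$1' is missing or empty" >&2
    exit 2
  fi
}

# Job-level caller (hypothetical): runs the helper in a child shell and
# passes an informative message, including the status, to err_exit.
run_job() {
  ( ush_check_input "$1" ) || err_exit "Input check failed for '$1' (status $?)"
  echo "Input '$1' is ready"
}
```

Calling run_job with a missing file in this sketch prints the helper's own error first, then a FATAL ERROR line that states why the job stopped, rather than a bare failure.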

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs? NO
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Running CI on Ursa

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added

@DavidHuber-NOAA
Contributor Author

Launching CI on Ursa to test the new HPC account. CC: @TerrenceMcGuinness-NOAA.

@emcbot emcbot added the CI-Ursa-Ready (**CM use only** PR is ready for CI testing on Ursa), CI-Ursa-Building (**Bot use only** CI testing is cloning/building on Ursa), and CI-Ursa-Running (**Bot use only** CI testing on Ursa for this PR is in-progress) labels and removed the CI-Ursa-Ready and CI-Ursa-Building labels Apr 17, 2026
Comment thread dev/jobs/JGLOBAL_ENKF_SELECT_OBS Outdated
Comment thread dev/scripts/exgdas_atmos_chgres_forenkf.sh Outdated
Contributor

@TravisElless-NOAA TravisElless-NOAA left a comment


A couple of suggestions for the JEDI LETKF ones.

Comment thread dev/jobs/JGLOBAL_ATMENS_ANALYSIS_OBS Outdated
Comment thread dev/jobs/JGLOBAL_ATMENS_ANALYSIS_SOL Outdated
Co-authored-by: Travis Elless <113720457+TravisElless-NOAA@users.noreply.github.com>
@TerrenceMcGuinness-NOAA
Collaborator

@DavidHuber-NOAA Testing whether HPC_ACCOUNT propagates to the CI pipeline experiments failed. The environment has HPC_ACCOUNT=hfv3gfs, but the generated XML still uses fv3-cpu:

[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ source global-workflow/dev/ci/platforms/config.ursa 
[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ env | grep HPC
HPC_ACCOUNT=hfv3gfs
[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ grep account /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/URSA/BUILDS/GITLAB/pr_cases_4793_39c2a28e_12973/RUNTESTS/EXPDIR/C48_ATM_39c2a28e-12973/C48_ATM_39c2a28e-12973.xml 
        <account>fv3-cpu</account>
                <account>fv3-cpu</account>
                <account>fv3-cpu</account>
        <account>fv3-cpu</account>
        <account>fv3-cpu</account>
                <account>fv3-cpu</account>
        <account>fv3-cpu</account>
        <account>fv3-cpu</account>
[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ cat ~/.gwrc 
user:
   ACCOUNT: {{ 'HPC_ACCOUNT' | getenv }}

For now, falling back on hardcoding the account in ~/.gwrc for the role account (stopping the test and restarting it):

[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ vim ~/.gwrc
[role.glopara@ufe03 pr_cases_4793_39c2a28e_12973]$ cat ~/.gwrc 
user:
   ACCOUNT: hfv3gfs
#  ACCOUNT: {{ 'HPC_ACCOUNT' | getenv }}
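
The manual grep check shown above could be wrapped in a small helper. This is a minimal sketch, assuming each account element sits on its own line as in the output above; check_accounts is a hypothetical name and is not part of global-workflow.

```shell
# Hypothetical helper: succeed only if every <account> element in the
# given XML file carries the expected account name.
# Usage: check_accounts <xml_file> <expected_account>
check_accounts() {
  xml_file="$1"
  expected="$2"
  # Pull out the text of each <account> element (one per line) and
  # collapse duplicates, so a consistent file yields exactly one value.
  found=$(sed -n 's:.*<account>\(.*\)</account>.*:\1:p' "${xml_file}" | sort -u)
  if [ "${found}" != "${expected}" ]; then
    echo "Account mismatch in ${xml_file}: expected '${expected}', found:" >&2
    echo "${found}" >&2
    return 1
  fi
}
```

For example, check_accounts C48_ATM_39c2a28e-12973.xml "${HPC_ACCOUNT}" would return nonzero for the first XML listing above (fv3-cpu) when HPC_ACCOUNT is hfv3gfs. A file with no account elements also counts as a mismatch in this sketch.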

@TerrenceMcGuinness-NOAA
Collaborator

TerrenceMcGuinness-NOAA commented Apr 17, 2026

Erased the RUNDIR on disk and re-ran the C48_ATM case from GitLab. HPC_ACCOUNT now propagates from the hardcoded ~/.gwrc:

[role.glopara@ufe03 C48_ATM_39c2a28e-12973]$ cat ~/.gwrc
user:
   ACCOUNT: hfv3gfs
#  ACCOUNT: {{ 'HPC_ACCOUNT' | getenv }}
[role.glopara@ufe03 C48_ATM_39c2a28e-12973]$ pwd
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/URSA/BUILDS/GITLAB/pr_cases_4793_39c2a28e_12973/RUNTESTS/EXPDIR/C48_ATM_39c2a28e-12973

[role.glopara@ufe03 C48_ATM_39c2a28e-12973]$ grep account C48_ATM_39c2a28e-12973.xml 
        <account>hfv3gfs</account>
                <account>hfv3gfs</account>
                <account>hfv3gfs</account>
        <account>hfv3gfs</account>
        <account>hfv3gfs</account>
                <account>hfv3gfs</account>
        <account>hfv3gfs</account>
        <account>hfv3gfs</account>

The jobs do not get stuck in PRIORITY and are running:

[role.glopara@ufe03 C48_ATM_39c2a28e-12973]$ q | grep 12973
  11931177 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f021-f023_12         role.glopara  R     0:01    1 u02c07
  11931176 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f018-f020_12         role.glopara  R     0:02    1 u01c24
  11931175 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f015-f017_12         role.glopara  R     0:17    1 u12c21
  11931174 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f012-f014_12         role.glopara  R     0:29    1 u12c09
  11931173 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f009-f011_12         role.glopara  R     0:44    1 u13c09
  11931171 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f003-f005_12         role.glopara  R     0:49    1 u03c28
  11931172 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f006-f008_12         role.glopara  R     0:49    1 u13c03
  11931170 u1-comp C48_ATM_39c2a28e-12973_gfs_atmos_prod_f000-f002_12         role.glopara  R     1:23    1 u08c16
  11929187 u1-comp            C48_ATM_39c2a28e-12973_gfs_fcst_seg0_12         role.glopara  R     7:27    1 u11c26

Contributor

@ChristopherHill-NOAA ChristopherHill-NOAA left a comment


Suggesting modifications to several of the proposed err_exit messages for downstream products. A spelling correction within JGDAS_ATMOS_GEMPAK is included.

Comment thread dev/jobs/JGDAS_ATMOS_GEMPAK Outdated
Comment thread dev/jobs/JGDAS_ATMOS_GEMPAK Outdated
Comment thread dev/jobs/JGDAS_ATMOS_GEMPAK Outdated
Comment thread dev/scripts/exgfs_atmos_nawips.sh Outdated
Comment thread dev/jobs/JGDAS_ATMOS_GEMPAK_META_NCDC Outdated
Comment thread dev/jobs/JGFS_ATMOS_GEMPAK_NCDC_UPAPGIF Outdated
Comment thread dev/jobs/JGFS_ATMOS_GEMPAK_PGRB2_SPEC Outdated
Comment thread dev/jobs/JGFS_ATMOS_PGRB2_SPEC_NPOESS Outdated
Comment thread dev/jobs/JGFS_ATMOS_AWIPS_20KM_1P0DEG Outdated
@TerrenceMcGuinness-NOAA TerrenceMcGuinness-NOAA removed the CI-Ursa-Running **Bot use only** CI testing on Ursa for this PR is in-progress label Apr 17, 2026
@BoCui-NOAA
Contributor

The updates to the bufr sounding job JGFS_ATMOS_POSTSND and the ush script look good to me.

Contributor

@EdwardSafford-NOAA EdwardSafford-NOAA left a comment


DA monitor changes look good.

Co-authored-by: Christopher Hill <102273578+ChristopherHill-NOAA@users.noreply.github.com>
@DavidHuber-NOAA
Contributor Author

Thank you for the suggestions @ChristopherHill-NOAA. I've incorporated them.

@DavidHuber-NOAA
Contributor Author

Launching tests on WCOSS2.

@DavidHuber-NOAA DavidHuber-NOAA added the CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress label Apr 21, 2026
@DavidHuber-NOAA
Contributor Author

All tests passed on WCOSS2 except C96_atm3DVar_extended. In that test, the gfs_gempakmeta job failed on the last cycle (2021122118) due to a timeout. The walltime request was only 5 minutes; with the request increased to 10 minutes, the job succeeded.

I have committed that change and am marking WCOSS2 CI as passed.

@DavidHuber-NOAA DavidHuber-NOAA added CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress labels Apr 22, 2026
@DavidHuber-NOAA DavidHuber-NOAA merged commit c176171 into NOAA-EMC:develop Apr 22, 2026
5 checks passed

Labels

CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully


Development

Successfully merging this pull request may close these issues.

(develop) Add reason for failure into 'FATAL ERROR' message

8 participants