Support global-workflow using Rocky 8 on CSPs #2998

Merged
WalterKolczynski-NOAA merged 37 commits into NOAA-EMC:develop from NOAA-EPIC:csps-rocky8
Dec 24, 2024

Contributor

@weihuang-jedi weihuang-jedi commented Oct 10, 2024

Description

ParallelWorks now defaults to Rocky 8 on CSPs and will move to Rocky 8 only after 1/1/2025,
so we need to modify the global-workflow module files to use a spack-stack that supports Rocky 8,
and test that the workflow compiles and runs correctly under Rocky 8.

i) New features in the Rocky 8 update:

a. Waves worked in the C48_S2SWA_gefs case, so SUPPORT_WAVES is set to "YES" in awspw.yaml.
In fact, if SUPPORT_WAVES is not set to "YES", setup_expt.py raises an exception.

b. Two types of nodes (chips/queues) are used on AWS, compute and process: forecasts run in the "compute" queue,
which has big nodes (more cores), and all other jobs run in the "process" queue, which has small nodes (fewer cores).
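
The two points above can be illustrated with a minimal Python sketch. It is not the global-workflow implementation: the function names, the `FORECAST_JOBS` set, and the job names are hypothetical, and only the `SUPPORT_WAVES` key and the `compute`/`process` queue names come from this description.

```python
# Hypothetical sketch; not code from global-workflow.

# Assumed job names that count as forecasts (illustrative only).
FORECAST_JOBS = {"fcst", "efcs"}


def queue_for_job(job_name: str) -> str:
    """Return the AWS queue for a job: forecasts use 'compute', everything else 'process'."""
    return "compute" if job_name in FORECAST_JOBS else "process"


def check_wave_support(host_config: dict, app_uses_waves: bool) -> None:
    """Raise if the app needs waves but the host settings do not enable them."""
    if app_uses_waves and host_config.get("SUPPORT_WAVES", "NO") != "YES":
        raise ValueError("Waves requested but SUPPORT_WAVES is not 'YES' for this host")


if __name__ == "__main__":
    awspw = {"SUPPORT_WAVES": "YES"}  # mirrors the awspw.yaml setting described above
    check_wave_support(awspw, app_uses_waves=True)  # passes silently
    for job in ("fcst", "some_post_job"):  # 'some_post_job' is a made-up name
        print(job, "->", queue_for_job(job))
```

The check mirrors the behavior described for setup_expt.py only in spirit: a wave-enabled case fails early unless the host settings say waves are supported.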

ii) The Rocky 8 update needs the following submodules at or newer than the commits below.

  1. gfs_utils:

commit 4848ecbb5e713b16127433e11f7d3edc6ac784c4 (HEAD, origin/develop, origin/HEAD, develop)
Author: Wei Huang <wei.huang@noaa.gov>
Date: Fri Oct 18 10:41:25 2024 -0600

Make gfs-utils compile on CSPs with Rocky 8 (#81)

Support Rocky 8 on CSPs.
  2. ufs_utils:

commit 23237610845c3a4438b21b25e9b3dc25c4c15b73 (HEAD)
Author: Wei Huang <wei.huang@noaa.gov>
Date: Wed Oct 9 11:55:13 2024 -0600

Support UFS_UTILS on CSPs under Rocky 8 (#989)

Fixes #982.
  3. upp:

commit 66a422db80ea129dd87285fe6e811d4b6e1fe29b (HEAD)
Author: Wei Huang <wei.huang@noaa.gov>
Date: Wed Oct 2 14:38:22 2024 -0600

Make UPP works with Rocky 8 on CSPs (#1034)

* Make UPP works with Rocky 8 on CSPs

* Remove unneeded path

* simplify modulefile
  4. ufs_model:

commit 29c2703 (HEAD)
Author: Cameron Book <43379611+ulmononian@users.noreply.github.com>
Date: Tue Nov 12 13:08:12 2024 -0800

Add developmental test cases: idealized baroclinic wave and 2020 July CAPE cases + https://github.com/ufs-community/ufs-weather-model/pull/2459 (#2461)

* UFSWM - Add tests-dev ATM-only idealized dry baroclinic wave test and a 2020 July CAPE case
* UFSWM - Update modulefile to support Rocky 8 on CSPs, with ParallelWorks

---------

Co-authored-by: Wei Huang <wei.huang@noaa.gov>
Co-authored-by: Jong Kim <jong.kim@noaa.gov>

Resolves #2997
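
One way to check the "at or newer than" requirement is to ask git whether each listed commit is already contained in a submodule's history. The sketch below assumes "newer" means "descends from the listed commit"; the helper names are hypothetical, the hashes are copied from the description above, and `git merge-base --is-ancestor` is the only git command used.

```python
import subprocess
from typing import List

# Commit hashes quoted in the description above, keyed by the names used there.
REQUIRED_COMMITS = {
    "gfs_utils": "4848ecbb5e713b16127433e11f7d3edc6ac784c4",
    "ufs_utils": "23237610845c3a4438b21b25e9b3dc25c4c15b73",
    "upp": "66a422db80ea129dd87285fe6e811d4b6e1fe29b",
    "ufs_model": "29c2703",
}


def ancestor_check_command(required_commit: str) -> List[str]:
    """Build the git command that exits 0 when required_commit is HEAD or an ancestor of HEAD."""
    return ["git", "merge-base", "--is-ancestor", required_commit, "HEAD"]


def is_at_or_newer(repo_dir: str, required_commit: str) -> bool:
    """Return True if the checkout in repo_dir contains required_commit in its history."""
    result = subprocess.run(
        ancestor_check_command(required_commit),
        cwd=repo_dir,
        check=False,  # a non-zero exit simply means "not contained" (or unknown commit)
    )
    return result.returncode == 0
```

For example, `is_at_or_newer("sorc/ufs_model.fd", REQUIRED_COMMITS["ufs_model"])` would report whether the ufs_model checkout includes commit 29c2703; the paths of the other submodules are left to the caller.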

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

How has this been tested?

  • Clone and build on CSPs
  • Forecast-only on AWS
  • GEFS test on AWS

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • I have made corresponding changes to the system documentation if necessary

Comment thread env/AZUREPW.env Fixed
Comment thread env/AZUREPW.env Fixed
Comment thread env/AZUREPW.env Fixed
Comment thread env/GOOGLEPW.env Fixed
Comment thread env/GOOGLEPW.env Fixed
Comment thread env/GOOGLEPW.env Fixed
Comment thread parm/config/gefs/config.resources.AWSPW Fixed
@RussTreadon-NOAA
Contributor

Thank you @DavidNew-NOAA for looking at the output. As David notes, the reference check failed not due to DA but rather due to changes in the background used by H(x) in enkfgdas_atmensanlobs.

Do we expect this PR to alter forecast fields (deterministic and/or ensemble)? The modeling team should confirm that observed differences in model output are acceptable.

@RussTreadon-NOAA
Contributor

This PR does not change the sorc/ufs_model.fd hash. g-w PR #3145 updated the sorc/ufs_model.fd hash. This updated hash is included in this PR.

It was noted in g-w PR #3163 that the updated sorc/ufs_model.fd hash altered forecast output. GDASApp reference files have been updated and are included in the updated sorc/gdas.cd hash in g-w PR #3163. As documented in g-w PR #3163, g-w CI passes on WCOSS2 (Dogwood), Hera, and Orion.

One path forward is to

  1. merge g-w PR #3163 into g-w develop
  2. update NOAA-EPIC:csps-rocky8 with the updated g-w develop
  3. rerun Hera CI for this PR

@aerorahul aerorahul added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Dec 23, 2024
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera labels Dec 23, 2024
@emcbot emcbot added CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully and removed CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Dec 23, 2024
@emcbot

emcbot commented Dec 24, 2024

CI Passed on Hera in Build# 1
Built and ran in directory /scratch1/NCEPDEV/global/CI/2998


Experiment C48_ATM_c157e5e2 Completed 2 Cycles: *SUCCESS* at Mon Dec 23 21:59:23 UTC 2024
Experiment C48mx500_3DVarAOWCDA_c157e5e2 Completed 2 Cycles: *SUCCESS* at Mon Dec 23 21:59:26 UTC 2024
Experiment C48mx500_hybAOWCDA_c157e5e2 Completed 2 Cycles: *SUCCESS* at Mon Dec 23 22:05:32 UTC 2024
Experiment C96_S2SWA_gefs_replay_ics_c157e5e2 Completed 1 Cycles: *SUCCESS* at Mon Dec 23 22:12:18 UTC 2024
Experiment C96C48_hybatmaerosnowDA_c157e5e2 Completed 3 Cycles: *SUCCESS* at Mon Dec 23 23:12:46 UTC 2024
Experiment C96C48_hybatmDA_c157e5e2 Completed 3 Cycles: *SUCCESS* at Mon Dec 23 23:12:54 UTC 2024
Experiment C96_atm3DVar_c157e5e2 Completed 3 Cycles: *SUCCESS* at Mon Dec 23 23:18:50 UTC 2024
Experiment C96C48_ufs_hybatmDA_c157e5e2 Completed 3 Cycles: *SUCCESS* at Mon Dec 23 23:56:03 UTC 2024
Experiment C48_S2SW_c157e5e2 Completed 2 Cycles: *SUCCESS* at Tue Dec 24 00:15:46 UTC 2024
Experiment C48_S2SWA_gefs_c157e5e2 Completed 1 Cycles: *SUCCESS* at Tue Dec 24 00:39:50 UTC 2024
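
A summary in this shape can be checked mechanically. The following is a minimal sketch, assuming every result line follows the `Experiment <name> Completed <n> Cycles: *<STATUS>* at <timestamp>` pattern seen above; the function name and the returned dictionary keys are hypothetical and not part of the CI tooling.

```python
import re

# Pattern inferred from the summary lines above; an assumption, not a documented format.
LINE_RE = re.compile(
    r"^Experiment (?P<name>\S+) Completed (?P<cycles>\d+) Cycles: "
    r"\*(?P<status>\w+)\* at (?P<finished>.+)$"
)


def parse_ci_summary(text: str) -> list:
    """Return one dict per experiment line; lines that do not match are skipped."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append(
                {
                    "name": match.group("name"),
                    "cycles": int(match.group("cycles")),
                    "status": match.group("status"),
                    "finished": match.group("finished"),
                }
            )
    return results


if __name__ == "__main__":
    sample = (
        "Experiment C48_ATM_c157e5e2 Completed 2 Cycles: *SUCCESS* "
        "at Mon Dec 23 21:59:23 UTC 2024\n"
        "Experiment C48_S2SWA_gefs_c157e5e2 Completed 1 Cycles: *SUCCESS* "
        "at Tue Dec 24 00:39:50 UTC 2024\n"
    )
    runs = parse_ci_summary(sample)
    print(all(run["status"] == "SUCCESS" for run in runs))
```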

Contributor

@aerorahul aerorahul left a comment

looks good. tests have passed on Hera and Hercules.

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Wcoss2-Building CI testing is cloning/building on WCOSS2 CI-Wcoss2-Failed CI testing on WCOSS for this PR has failed CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress and removed CI-Wcoss2-Building CI testing is cloning/building on WCOSS2 CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress labels Dec 24, 2024
@WalterKolczynski-NOAA
Contributor

gdas build fails on WCOSS, but it fails in develop too

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Wcoss2-Building CI testing is cloning/building on WCOSS2 CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress and removed CI-Wcoss2-Failed CI testing on WCOSS for this PR has failed CI-Wcoss2-Building CI testing is cloning/building on WCOSS2 labels Dec 24, 2024
@WalterKolczynski-NOAA
Contributor

CI Tests set up to run in /lfs/h2/emc/ptmp/walter.kolczynski/PR/PR_2998/RUNTESTS on WCOSS

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress labels Dec 24, 2024
@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit 290f1d2 into NOAA-EMC:develop Dec 24, 2024
tsga added a commit to tsga/global-workflow that referenced this pull request Jan 4, 2025
* develop:
  Ensure OCNRES and ICERES have 3 digits in the archive script (NOAA-EMC#3199)
  Set runtime shell requirements within Jenkins Pipeline (NOAA-EMC#3171)
  Add efcs and epos to ufs_hybatm xml (NOAA-EMC#3192) (NOAA-EMC#3193)
  Fix GEFS and SFS compile flags in build_all.sh (NOAA-EMC#3197)
  Remove early-cycle EnKF forecast (NOAA-EMC#3185)
  Fix mod_icec bug in atmos_prod (NOAA-EMC#3167)
  Create compute build option (NOAA-EMC#3186)
  Support global-workflow using Rocky 8 on CSPs (NOAA-EMC#2998)
danholdaway added a commit to danholdaway/global-workflow that referenced this pull request Jan 27, 2025
* develop:
  Remove WAFS files and references from `develop` (NOAA-EMC#3263)
  fix intel stack version number on c5 (NOAA-EMC#3258)
  Update gsi_monitor and ufs_utils hashes to recent hashes for C5/C6 build and run (NOAA-EMC#3252)
  Enable DA cycling on gaea C5/C6 (NOAA-EMC#3255)
  Copy post-processed sea ice increment for diagnostics (NOAA-EMC#3235)
  Only run METplus in the 3Dvar tests (NOAA-EMC#3245)
  Clone, build, and run C48_ATM and C48_S2SW on Gaea C5 and C6 (NOAA-EMC#3106)
  Add echgres as a dependency only for RUN=enkfgdas, not enkfgfs (NOAA-EMC#3246)
  Add domain level to wave gridded COM path (NOAA-EMC#3137)
  CI JJOB Tests using CMake (NOAA-EMC#3214)
  Make assorted updates to waves (NOAA-EMC#3190)
  Move WCOSS2 LD_LIBRARY_PATH patches to load_ufsda_modules.sh (NOAA-EMC#3236)
  Adding a gefs_arch task to GEFS workflow (NOAA-EMC#3211)
  Add additional GEFS variables needed for AI/ML applications  (NOAA-EMC#3221)
  Add bmat task dependency to marine LETKF task (NOAA-EMC#3224)
  Resolve bug with LMOD_TMOD_FIND_FIRST setting affecting build on WCOSS2 (NOAA-EMC#3229)
  Reinstate product groups (NOAA-EMC#3208)
  Additional fixes for downstream jobs (NOAA-EMC#3187)
  Turn IAU off during staging job for cold start experiments (NOAA-EMC#3215)
  Update the gdas.cd hash and enable GDASApp to run on WCOSS2 (NOAA-EMC#3220)
  Update upload-artifact to v4 (NOAA-EMC#3216)
  Prevent duplicate case generation in generate_workflows.sh (NOAA-EMC#3217)
  Update g-w to cycle with C1152 ATM (NOAA-EMC#3206)
  Separate use of initial increment/perturbation file from REPLAY/+03 ICs  (NOAA-EMC#3119)
  Update gsi_enkf hash and gsi_ver (NOAA-EMC#3207)
  Remove cpus-per-task from APRUN_OCNANALECEN on WCOSS2 (NOAA-EMC#3212)
  Remove 5WAVH from AWIPS GRIB2 parm files (NOAA-EMC#3146)
  Remove multi-grid wave support (NOAA-EMC#3188)
  Add echgres as a dependency for earc (NOAA-EMC#3202)
  Ensure OCNRES and ICERES have 3 digits in the archive script (NOAA-EMC#3199)
  Set runtime shell requirements within Jenkins Pipeline (NOAA-EMC#3171)
  Add efcs and epos to ufs_hybatm xml (NOAA-EMC#3192) (NOAA-EMC#3193)
  Fix GEFS and SFS compile flags in build_all.sh (NOAA-EMC#3197)
  Remove early-cycle EnKF forecast (NOAA-EMC#3185)
  Fix mod_icec bug in atmos_prod (NOAA-EMC#3167)
  Create compute build option (NOAA-EMC#3186)
  Support global-workflow using Rocky 8 on CSPs (NOAA-EMC#2998)
  Change orog gravity wave drag scheme for grid sizes less than 10km (NOAA-EMC#3175)
  Switch snow DA to use 2DVar for deterministic and ensemble mean (NOAA-EMC#3163)
  Update compression options for GEFS history files (NOAA-EMC#3184)
  Update compression options for high res history files (NOAA-EMC#3178)
  Turn DO_TEST_MODE off (NOAA-EMC#3177)
  Hotfix for gdas_arch div/0 (NOAA-EMC#3169)
  Allow building of the ufs-weather-model, WW3 pre/post execs for GFS, GEFS, SFS in the same clone of global-workflow (NOAA-EMC#3098)
  Switch Aerosol DA to use JCB and Jedi class (NOAA-EMC#3125)
  Update ufs-weather-model to 2024-12-06 commit  (NOAA-EMC#3145)
  Enable traditional threading as an option (NOAA-EMC#3149)
  Update HPC_ACCOUNT on Hercules to fv3-cpu (NOAA-EMC#3164)
  Turn C96C48_ufs_hybatmDA and C48mx500_3DVarAOWCDA into a regression test (NOAA-EMC#3120)
  Update GSI analysis jobs to use COMIN/COMOUT (NOAA-EMC#3092)
  Update HPC Tier Definitions (NOAA-EMC#3138)
  Add marine hybrid envar (NOAA-EMC#3041)
  Archive the experiment directory along with git status/diff output (NOAA-EMC#3105)
  Use stochastic restart patterns on rerun (NOAA-EMC#3077)
  Point Jenkinsfile back to CI/ (NOAA-EMC#3139)
  Fix wave restart for cold start and add ic version file (NOAA-EMC#3112)
  Allow users to override the default account at setup time (NOAA-EMC#3127)
  Refactor gridded wave post (NOAA-EMC#3014)
  Update docs related to NOAA CSPs (NOAA-EMC#3043)
  Allow APP to differ between RUNs (NOAA-EMC#2943)
  Run one executable for soca2cice (instead of two) (NOAA-EMC#3118)
  Speed up GSI analysis jobs in CI testing (NOAA-EMC#3115)
  Make aerosol output frequency variable (NOAA-EMC#2982)
  Add new stations to GFS BUFR sounding products (NOAA-EMC#3107)
  JCB-based obs+bias staging, Jedi class updates, and marine B-matrix refactoring (NOAA-EMC#2992)
  Enable tapering of atm ens perts at the model top (NOAA-EMC#3097)
  Update JGDAS ENKF POST  job  (NOAA-EMC#3090)
  SFS Runs at C96mx100  (NOAA-EMC#2960)
  Move machine-based options from config.base to host files (NOAA-EMC#3053)
  Remove RUNDIRS before running CI cases to cover re-run events (NOAA-EMC#3076)
  CI GitHub pipeline (hotfix) update for fetching repo name (NOAA-EMC#3084)
  Update JGDAS ENKF ECEN job  (NOAA-EMC#3050)
  Update snow obs processing job (NOAA-EMC#3055)
  Update to action workflow pipeline in default repo for development  (NOAA-EMC#3062)
  Update to action workflow pipeline in default repo for development (NOAA-EMC#3061)
  Update workflow pipeline (NOAA-EMC#3060)
  PW CI pipeline update5 ready for review so it can be merged and tested (NOAA-EMC#3059)
  Revert "GitHub CI Pipeline update for debugging forked PR support" (NOAA-EMC#3057)
  GitHub CI Pipeline update for debugging forked PR support (NOAA-EMC#3056)
  Add more ocean variables for post-processing in GEFS (NOAA-EMC#2995)
  Auto provisioning of PW clusters from GitHub CI added (NOAA-EMC#3051)
  Fix the name of the TC tracker filenames in archive.py (NOAA-EMC#3030)
  Make wxflow links static instead of from link_workflow (NOAA-EMC#3008)
  Update global jdas enkf diag job with COMIN/COMOUT for COM prefix (NOAA-EMC#2959)
  Add run and finalize methods to marine LETKF task (NOAA-EMC#2944)
  Fix wave restarts and GEFS FHOUT/FHMAX (NOAA-EMC#3009)
  Disabling hyper-threading (NOAA-EMC#2965)
  GitHub Actions Pipeline Updates for Self-Hosted Runners on PW (NOAA-EMC#3018)
  CI jekninsfile update hotfix (NOAA-EMC#3038)
  Update gdas.cd (NOAA-EMC#2978)
  Add ability to add tag to pslots with generate_workflows (NOAA-EMC#3036)
  CI update to shell environment with HOMEgfs to HOME_GFS for systems that need the path (NOAA-EMC#3013)
  Quick updated to Jenkins (health check) launch script (NOAA-EMC#3033)
  Document the generate_workflows.sh script (NOAA-EMC#3028)
  Replace gfs_cyc with an interval (NOAA-EMC#2928)
  Hotfix: Fix generate_workflows.sh optional build flags (NOAA-EMC#3024)
  Add a tool to run multiple YAML cases locally (NOAA-EMC#3004)
  Hotfix: Correctly set overwrite option when specified (NOAA-EMC#3021)
Labels

  • CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully
  • CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully

Development

Successfully merging this pull request may close these issues.

support global-workflow on CSPs with Rocky 8

8 participants