Update the gdas.cd hash and enable GDASApp to run on WCOSS2 #3220

Merged
WalterKolczynski-NOAA merged 4 commits into
NOAA-EMC:develop from
RussTreadon-NOAA:feature/update_gdas
Jan 14, 2025

@RussTreadon-NOAA commented Jan 10, 2025

Description

This PR does the following:

  1. update the sorc/gdas.cd hash to bring new GDASApp functionality into g-w
  2. update env/WCOSS2.env
  3. update the WCOSS2 section of ush/module-setup.sh

The change to WCOSS2.env is due to changes introduced during the fall 2024 WCOSS2 upgrade. The change to module-setup.sh is required when using spack-stack on WCOSS2.

Resolves #3219
Resolves #3100

Type of change

  • Maintenance (update gdas.cd hash)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? YES
    • GDAS - this PR points at the updated sorc/gdas.cd hash. No PRs are pending.

How has this been tested?

  • Clone and build on WCOSS2, Hera, Hercules, and Orion
  • Run g-w CI on WCOSS2, Hera, Hercules, and Orion

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • New and existing tests pass with my changes

@RussTreadon-NOAA
Contributor Author

This PR is opened in draft mode until g-w CI has been run on WCOSS2, Hera, Hercules, and Orion.

The g-w team is invited to review and comment on changes to env/WCOSS2.env and ush/module-setup.sh. The changes in env/WCOSS2.env originate from the discussion in WCOSS Ticket#2024111410000051.

@aerorahul
Contributor

No issues here with the hash update.
I am not sure we are cleared to use spack-stack on WCOSS2 for regular development. The installed stack was for demonstration and testing purposes for NCO staff.

Comment thread env/WCOSS2.env Outdated
Comment on lines +16 to +19
# Add path to GDASApp libraries
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${HOMEgfs}/sorc/gdas.cd/build/lib"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/opt/cray/pe/mpich/8.1.19/ofi/intel/19.0/lib"

Contributor

This is fine for now, but will not be acceptable for implementation. I hope there is a more robust solution than this by that time.

More importantly, this has an impact on every executable in every job -- not just GDASApp executables.
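
One way to limit the reach of these settings, offered only as a sketch and not as the approach taken in this PR, is to append the extra directory for a single command instead of exporting it for every job. The helper name `run_with_extra_libs` is hypothetical and does not exist in global-workflow.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of global-workflow): run one command with an
# extra directory appended to LD_LIBRARY_PATH, leaving the caller's
# environment untouched.
run_with_extra_libs() {
  local extra_dir="$1"
  shift
  # Refuse to continue without a command; a bare assignment would otherwise
  # change the variable in the calling shell.
  [[ $# -ge 1 ]] || return 2
  # ${VAR:+...} avoids a leading ':' (an empty entry is treated as the
  # current directory) when LD_LIBRARY_PATH is unset or empty.
  LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${extra_dir}" "$@"
}
```

A call might look like `run_with_extra_libs "${HOMEgfs}/sorc/gdas.cd/build/lib" ./some_gdas_executable.x`, where the executable name is a placeholder. Other executables started from the same shell would still see the unmodified LD_LIBRARY_PATH.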

Contributor Author

I completely agree. I strongly dislike these two lines. They are temporary patches to allow GFS v17 testing and development to continue on WCOSS2.

The line

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${HOMEgfs}/sorc/gdas.cd/build/lib"

was added because craype/2.7.17 adds

-static-libgcc -static-libstdc++ -Bstatic -lstdc++ -Bdynamic -lm -lpthread

to the ftn command. GDASApp executables failed because they could not find JEDI libraries. Might the addition of a GDASApp install option (something we must have) resolve this problem?

Another concern with the added ftn options is the following warning found in build_gdas.log

icpc: warning #10315: specifying -lm before files may supersede the Intel(R) math library and affect performance
ifort: warning #10315: specifying -lm before files may supersede the Intel(R) math library and affect performance

It would be unfortunate if default compiler options resulted in degraded code performance.

The line

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/opt/cray/pe/mpich/8.1.19/ofi/intel/19.0/lib"

was recommended by GDIT. GDASApp testing identified inconsistencies across system modules. Some GDASApp executables failed with undefined symbol messages for MPI routines. GDIT is working on a solution.
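
For the first kind of failure described above (executables that cannot find JEDI libraries), a quick diagnostic is to filter `ldd` output for entries marked `not found`. The sketch below assumes a glibc-style `ldd` output format; the function name `missing_libs` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read ldd-style output on stdin and print the names of
# shared libraries that the dynamic loader could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}
```

Typical use would be `ldd path/to/executable | missing_libs`, run once with and once without the extra LD_LIBRARY_PATH entries. This only reports libraries that cannot be located. Undefined symbol errors come from libraries that are found but do not provide the expected routines; glibc's `ldd -r` reports those as well.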

@RussTreadon-NOAA mentioned this pull request Jan 11, 2025
2 tasks
@RussTreadon-NOAA
Contributor Author

g-w CI

RussTreadon-NOAA:feature/update_gdas at 4e73a31 was installed on WCOSS2 (Cactus), Hera, Hercules, and Orion. The following g-w CI streams were run on each machine:

  • C48_ATM
  • C48_S2SWA_gefs
  • C48_S2SW
  • C48mx500_3DVarAOWCDA
  • C48mx500_hybAOWCDA
  • C96C48_hybatmDA
  • C96C48_hybatmaerosnowDA
  • C96C48_ufs_hybatmDA
  • C96_S2SWA_gefs_replay_ics
  • C96_atm3DVar

All jobs in all streams ran to completion on all machines, with one exception: the gdas_marineanlletkf job in C48mx500_hybAOWCDA failed on Hercules. All other streams ran to completion on Hercules.

Investigation of the failure indicates a missing job dependency in the experiment XML. A rewind and rerun resulted in successful completion of the job and, as a result, the entire C48mx500_hybAOWCDA stream. @guillaumevernieres and @AndrewEichmann-NOAA have been contacted.

See issue #3219 for additional details.

@RussTreadon-NOAA marked this pull request as ready for review January 13, 2025 11:20
@RussTreadon-NOAA self-assigned this Jan 13, 2025
@RussTreadon-NOAA
Contributor Author

This PR is ready for review.

aerorahul
aerorahul previously approved these changes Jan 13, 2025
Contributor

@aerorahul left a comment

Just the comments on LD_LIBRARY_PATH affecting non-GDASApp executables.
Acknowledging that GDASApp is now using spack-stack on WCOSS2 but other components are not.

Comment thread env/WCOSS2.env Outdated
@RussTreadon-NOAA
Contributor Author

@aerorahul, incorporating @guillaumevernieres's suggestion dismissed your approval.

@RussTreadon-NOAA
Contributor Author

g-w issue #3222 has been opened to report the missing job dependency for C48mx500_hybAOWCDA gdas_marineanlletkf. @AndrewEichmann-NOAA will follow up on this item.

@RussTreadon-NOAA
Contributor Author

NCO confirmed that it is OK for GDASApp to use /apps/ops/test/spack-stack-1.6.0-nco/envs/nco-intel-19.1.3.304/install/modulefiles/Core, with the understanding that this stack is being actively worked on for new libraries and new versions and might change from time to time.

/apps/ops/test/spack-stack-nco/modulefiles/Core was shared as a more stable version. Tests of this version show that it does not work in GDASApp. The GDASApp build fails with

-- [bufr_query] (2.8.0)
-- Feature TESTS enabled
CMake Error at bufr-query/CMakeLists.txt:23 (find_package):
  By not providing "Findeckit.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "eckit", but
  CMake did not find one.

  Could not find a package configuration file provided by "eckit" (requested
  version 1.23.0) with any of the following names:

We cannot build GDASApp using /apps/dev/lmodules/core as we did in the past. Attempts to do so fail with the oops configure error

-- Adding bundle project oops
CMake Error at oops/CMakeLists.txt:14 (cmake_minimum_required):
  CMake 3.23 or higher is required.  You are running version 3.20.2


-- Configuring incomplete, errors occurred!

cmake/3.23 is not available with hpc-stack.
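
The version requirement behind this failure can be checked before starting a build. The following is a minimal sketch, assuming `sort -V` (version sort, as in GNU coreutils) is available; the function name `version_ge` is hypothetical.

```bash
#!/usr/bin/env bash
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` placing the lower version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge "$(cmake --version | awk 'NR==1 {print $3}')" 3.23` would fail with the CMake 3.20.2 reported in the error above and succeed with CMake 3.23 or newer.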

We will stick with /apps/ops/test/spack-stack-1.6.0-nco/envs/nco-intel-19.1.3.304/install/modulefiles/Core for the time being. This allows GFS v17 aerosol, snow, and marine DA development to continue on WCOSS2.

@RussTreadon-NOAA
Contributor Author

g-w CI summary

CI-Hera-Passed, CI-Orion-Passed, CI-Wcoss2-Passed labels can be applied to this PR. Tests were manually run.

CI-Hercules-Passed applies to all cases except C48mx500_hybAOWCDA. Testing discovered a missing job dependency for gdas_marineanlletkf. g-w issue #3222 has been opened to track resolution of this problem. All other g-w CI cases passed on Hercules.

@RussTreadon-NOAA added labels Jan 13, 2025:

  • CI-Hera-Passed: **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Orion-Passed: **Bot use only** CI testing on Orion for this PR has completed successfully
  • CI-Wcoss2-Passed: CI testing on WCOSS for this PR has completed successfully

@WalterKolczynski-NOAA merged commit 26fb850 into NOAA-EMC:develop Jan 14, 2025
@RussTreadon-NOAA deleted the feature/update_gdas branch January 14, 2025 11:08
@RussTreadon-NOAA
Contributor Author

Thank you @WalterKolczynski-NOAA

KateFriedman-NOAA added a commit to KateFriedman-NOAA/global-workflow that referenced this pull request Jan 15, 2025
…kf_sfc_update_com_in_out

* upstream/develop:
  Resolve bug with LMOD_TMOD_FIND_FIRST setting affecting build on WCOSS2 (NOAA-EMC#3229)
  Reinstate product groups (NOAA-EMC#3208)
  Additional fixes for downstream jobs (NOAA-EMC#3187)
  Turn IAU off during staging job for cold start experiments (NOAA-EMC#3215)
  Update the gdas.cd hash and enable GDASApp to run on WCOSS2 (NOAA-EMC#3220)
  Update upload-artifact to v4 (NOAA-EMC#3216)
  Prevent duplicate case generation in generate_workflows.sh (NOAA-EMC#3217)
  Update g-w to cycle with C1152 ATM (NOAA-EMC#3206)
tsga added a commit to tsga/global-workflow that referenced this pull request Jan 22, 2025
* develop:
  Add echgres as a dependency only for RUN=enkfgdas, not enkfgfs (NOAA-EMC#3246)
  Add domain level to wave gridded COM path (NOAA-EMC#3137)
  CI JJOB Tests using CMake (NOAA-EMC#3214)
  Make assorted updates to waves (NOAA-EMC#3190)
  Move WCOSS2 LD_LIBRARY_PATH patches to load_ufsda_modules.sh (NOAA-EMC#3236)
  Adding a gefs_arch task to GEFS workflow (NOAA-EMC#3211)
  Add additional GEFS variables needed for AI/ML applications  (NOAA-EMC#3221)
  Add bmat task dependency to marine LETKF task (NOAA-EMC#3224)
  Resolve bug with LMOD_TMOD_FIND_FIRST setting affecting build on WCOSS2 (NOAA-EMC#3229)
  Reinstate product groups (NOAA-EMC#3208)
  Additional fixes for downstream jobs (NOAA-EMC#3187)
  Turn IAU off during staging job for cold start experiments (NOAA-EMC#3215)
  Update the gdas.cd hash and enable GDASApp to run on WCOSS2 (NOAA-EMC#3220)
  Update upload-artifact to v4 (NOAA-EMC#3216)
  Prevent duplicate case generation in generate_workflows.sh (NOAA-EMC#3217)
  Update g-w to cycle with C1152 ATM (NOAA-EMC#3206)
  Separate use of initial increment/perturbation file from REPLAY/+03 ICs  (NOAA-EMC#3119)
  Update gsi_enkf hash and gsi_ver (NOAA-EMC#3207)
  Remove cpus-per-task from APRUN_OCNANALECEN on WCOSS2 (NOAA-EMC#3212)
  Remove 5WAVH from AWIPS GRIB2 parm files (NOAA-EMC#3146)
  Remove multi-grid wave support (NOAA-EMC#3188)
  Add echgres as a dependency for earc (NOAA-EMC#3202)
danholdaway added a commit to danholdaway/global-workflow that referenced this pull request Jan 27, 2025
* develop:
  Remove WAFS files and references from `develop` (NOAA-EMC#3263)
  fix intel stack version number on c5 (NOAA-EMC#3258)
  Update gsi_monitor and ufs_utils hashes to recent hashes for C5/C6 build and run (NOAA-EMC#3252)
  Enable DA cycling on gaea C5/C6 (NOAA-EMC#3255)
  Copy post-processed sea ice increment for diagnostics (NOAA-EMC#3235)
  Only run METplus in the 3Dvar tests (NOAA-EMC#3245)
  Clone, build, and run C48_ATM and C48_S2SW on Gaea C5 and C6 (NOAA-EMC#3106)
  Add echgres as a dependency only for RUN=enkfgdas, not enkfgfs (NOAA-EMC#3246)
  Add domain level to wave gridded COM path (NOAA-EMC#3137)
  CI JJOB Tests using CMake (NOAA-EMC#3214)
  Make assorted updates to waves (NOAA-EMC#3190)
  Move WCOSS2 LD_LIBRARY_PATH patches to load_ufsda_modules.sh (NOAA-EMC#3236)
  Adding a gefs_arch task to GEFS workflow (NOAA-EMC#3211)
  Add additional GEFS variables needed for AI/ML applications  (NOAA-EMC#3221)
  Add bmat task dependency to marine LETKF task (NOAA-EMC#3224)
  Resolve bug with LMOD_TMOD_FIND_FIRST setting affecting build on WCOSS2 (NOAA-EMC#3229)
  Reinstate product groups (NOAA-EMC#3208)
  Additional fixes for downstream jobs (NOAA-EMC#3187)
  Turn IAU off during staging job for cold start experiments (NOAA-EMC#3215)
  Update the gdas.cd hash and enable GDASApp to run on WCOSS2 (NOAA-EMC#3220)
  Update upload-artifact to v4 (NOAA-EMC#3216)
  Prevent duplicate case generation in generate_workflows.sh (NOAA-EMC#3217)
  Update g-w to cycle with C1152 ATM (NOAA-EMC#3206)
  Separate use of initial increment/perturbation file from REPLAY/+03 ICs  (NOAA-EMC#3119)
  Update gsi_enkf hash and gsi_ver (NOAA-EMC#3207)
  Remove cpus-per-task from APRUN_OCNANALECEN on WCOSS2 (NOAA-EMC#3212)
  Remove 5WAVH from AWIPS GRIB2 parm files (NOAA-EMC#3146)
  Remove multi-grid wave support (NOAA-EMC#3188)
  Add echgres as a dependency for earc (NOAA-EMC#3202)
  Ensure OCNRES and ICERES have 3 digits in the archive script (NOAA-EMC#3199)
  Set runtime shell requirements within Jenkins Pipeline (NOAA-EMC#3171)
  Add efcs and epos to ufs_hybatm xml (NOAA-EMC#3192) (NOAA-EMC#3193)
  Fix GEFS and SFS compile flags in build_all.sh (NOAA-EMC#3197)
  Remove early-cycle EnKF forecast (NOAA-EMC#3185)
  Fix mod_icec bug in atmos_prod (NOAA-EMC#3167)
  Create compute build option (NOAA-EMC#3186)
  Support global-workflow using Rocky 8 on CSPs (NOAA-EMC#2998)
  Change orog gravity wave drag scheme for grid sizes less than 10km (NOAA-EMC#3175)
  Switch snow DA to use 2DVar for deterministic and ensemble mean (NOAA-EMC#3163)
  Update compression options for GEFS history files (NOAA-EMC#3184)
  Update compression options for high res history files (NOAA-EMC#3178)
  Turn DO_TEST_MODE off (NOAA-EMC#3177)
  Hotfix for gdas_arch div/0 (NOAA-EMC#3169)
  Allow building of the ufs-weather-model, WW3 pre/post execs for GFS, GEFS, SFS in the same clone of global-workflow (NOAA-EMC#3098)
  Switch Aerosol DA to use JCB and Jedi class (NOAA-EMC#3125)
  Update ufs-weather-model to 2024-12-06 commit  (NOAA-EMC#3145)
  Enable traditional threading as an option (NOAA-EMC#3149)
  Update HPC_ACCOUNT on Hercules to fv3-cpu (NOAA-EMC#3164)
  Turn C96C48_ufs_hybatmDA and C48mx500_3DVarAOWCDA into a regression test (NOAA-EMC#3120)
  Update GSI analysis jobs to use COMIN/COMOUT (NOAA-EMC#3092)
  Update HPC Tier Definitions (NOAA-EMC#3138)
  Add marine hybrid envar (NOAA-EMC#3041)
  Archive the experiment directory along with git status/diff output (NOAA-EMC#3105)
  Use stochastic restart patterns on rerun (NOAA-EMC#3077)
  Point Jenkinsfile back to CI/ (NOAA-EMC#3139)
  Fix wave restart for cold start and add ic version file (NOAA-EMC#3112)
  Allow users to override the default account at setup time (NOAA-EMC#3127)
  Refactor gridded wave post (NOAA-EMC#3014)
  Update docs related to NOAA CSPs (NOAA-EMC#3043)
  Allow APP to differ between RUNs (NOAA-EMC#2943)
  Run one executable for soca2cice (instead of two) (NOAA-EMC#3118)
  Speed up GSI analysis jobs in CI testing (NOAA-EMC#3115)
  Make aerosol output frequency variable (NOAA-EMC#2982)
  Add new stations to GFS BUFR sounding products (NOAA-EMC#3107)
  JCB-based obs+bias staging, Jedi class updates, and marine B-matrix refactoring (NOAA-EMC#2992)
  Enable tapering of atm ens perts at the model top (NOAA-EMC#3097)
  Update JGDAS ENKF POST  job  (NOAA-EMC#3090)
  SFS Runs at C96mx100  (NOAA-EMC#2960)
  Move machine-based options from config.base to host files (NOAA-EMC#3053)
  Remove RUNDIRS before running CI cases to cover re-run events (NOAA-EMC#3076)
  CI GitHub pipeline (hotfix) update for fetching repo name (NOAA-EMC#3084)
  Update JGDAS ENKF ECEN job  (NOAA-EMC#3050)
  Update snow obs processing job (NOAA-EMC#3055)
  Update to action workflow pipeline in default repo for development  (NOAA-EMC#3062)
  Update to action workflow pipeline in default repo for development (NOAA-EMC#3061)
  Update workflow pipeline (NOAA-EMC#3060)
  PW CI pipeline update5 ready for review so it can be merged and tested (NOAA-EMC#3059)
  Revert "GitHub CI Pipeline update for debugging forked PR support" (NOAA-EMC#3057)
  GitHub CI Pipeline update for debugging forked PR support (NOAA-EMC#3056)
  Add more ocean variables for post-processing in GEFS (NOAA-EMC#2995)
  Auto provisioning of PW clusters from GitHub CI added (NOAA-EMC#3051)
  Fix the name of the TC tracker filenames in archive.py (NOAA-EMC#3030)
  Make wxflow links static instead of from link_workflow (NOAA-EMC#3008)
  Update global jdas enkf diag job with COMIN/COMOUT for COM prefix (NOAA-EMC#2959)
  Add run and finalize methods to marine LETKF task (NOAA-EMC#2944)
  Fix wave restarts and GEFS FHOUT/FHMAX (NOAA-EMC#3009)
  Disabling hyper-threading (NOAA-EMC#2965)
  GitHub Actions Pipeline Updates for Self-Hosted Runners on PW (NOAA-EMC#3018)
  CI jekninsfile update hotfix (NOAA-EMC#3038)
  Update gdas.cd (NOAA-EMC#2978)
  Add ability to add tag to pslots with generate_workflows (NOAA-EMC#3036)
  CI update to shell environment with HOMEgfs to HOME_GFS for systems that need the path (NOAA-EMC#3013)
  Quick updated to Jenkins (health check) launch script (NOAA-EMC#3033)
  Document the generate_workflows.sh script (NOAA-EMC#3028)
  Replace gfs_cyc with an interval (NOAA-EMC#2928)
  Hotfix: Fix generate_workflows.sh optional build flags (NOAA-EMC#3024)
  Add a tool to run multiple YAML cases locally (NOAA-EMC#3004)
  Hotfix: Correctly set overwrite option when specified (NOAA-EMC#3021)
Development

Successfully merging this pull request may close these issues.

  • Update sorc/gdas.cd hash
  • Unable to build GDASApp on Cactus following the system upgrade

5 participants