Hera move spack-stack to /contrib #2579

Closed
RatkoVasic-NOAA wants to merge 9 commits into ufs-community:develop from RatkoVasic-NOAA:Hera-scratch4

@RatkoVasic-NOAA
Collaborator

@RatkoVasic-NOAA RatkoVasic-NOAA commented Jan 29, 2025

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

Hera is switching from the old scratch1 and scratch2 disks to scratch3 and scratch4.
The spack-stack currently used by UFS-WM (1.6.0/fms-2024.01) is now installed at a new location:
/contrib/spack-stack/spack-stack-1.6.0/envs/

This PR addresses only the first part: moving spack-stack to the /contrib disk.
Part II should move all data (ICs, BCs, LBCs, ...) to the new disks (/scratch3 and /scratch4).

Also added @BruceKropp-Raytheon's idea of making the dprefix variable in rt.sh easier to customize.

Commit Message:

* UFSWM/modulefiles

Branch passed all tests on Hera (logs attached).

Priority:

  • Normal

Git Tracking

UFSWM:

Sub component Pull Requests:

  • None

UFSWM Blocking Dependencies:

  • None

Changes

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

Libraries already installed on new disks.


Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • GaeaC5
    • GaeaC6
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@DavidHuber-NOAA
Collaborator

Could the UPP hash also be updated with this PR? I know that will update the spack-stack version to 1.8.0 for the UPP, but it will point to a /contrib installation.

@RatkoVasic-NOAA
Collaborator Author

> Could the UPP hash also be updated with this PR? I know that will update the spack-stack version to 1.8.0 for the UPP, but it will point to a /contrib installation.

@DavidHuber-NOAA I'm OK with that, but it isn't up to me. If the UFS-WM code managers (@jkbk2004) are OK with it, then we (they) can add it while merging.

@ulmononian
Collaborator

@BinLiu-NOAA @BijuThomas-NOAA not sure when the hash of ufs wm will update in hafs, but just fyi for hera modulefile changes that would be needed when the hash does get updated

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

@FernandoAndrade-NOAA can you try running all GNU cases of this PR on Hera? We need to confirm whether you see the same issue; it could be a permission issue on my side.

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

@RatkoVasic-NOAA I am not sure the issue is on my side but rt.sh has an issue like

Also make sure that all modulefiles written in TCL start with the string
#%Module

Executing this command requires loading "gnu/13.3.0" which failed while
processing the following module(s):

    Module fullname   Module Filename
    ---------------   ---------------
    stack-gcc/13.3.0  /contrib/spack-stack/spack-stack-1.6.0/envs/gnu-fms-2024.01/install/modulefiles/Core/stack-gcc/13.3.0.lua
    modules.fv3       /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2158566/control_c48_gnu/modulefiles/modules.fv3.lua
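
The `#%Module` hint in this message concerns modulefiles written in TCL, while both files listed above end in `.lua`, so the hint may not point at the actual cause. As a quick way to rule it out, here is a minimal sketch of that header check. `check_modulefile` is a hypothetical helper for illustration and is not part of rt.sh:

```shell
# Hypothetical helper (not in rt.sh): classify a modulefile by the rule quoted
# in the error message, i.e. TCL modulefiles must start with "#%Module".
check_modulefile() {
  file="$1"
  first_line=""
  case "$file" in
    *.lua) echo "lua"; return 0 ;;   # Lua modulefiles need no magic string
  esac
  # Read only the first line; tolerate a missing trailing newline.
  IFS= read -r first_line < "$file" || true
  case "$first_line" in
    '#%Module'*) echo "tcl-ok"; return 0 ;;
    *)           echo "tcl-missing-header"; return 1 ;;
  esac
}
```

Calling it on the two paths in the table should report `lua` for each, which would suggest looking elsewhere (for example, at the environment rt.sh runs in) rather than at the file headers.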

@RatkoVasic-NOAA
Collaborator Author

> @RatkoVasic-NOAA I am not sure the issue is on my side but rt.sh has an issue like

@jkbk2004 please give me the path and log file, so I can try to replicate your error.

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

> > @RatkoVasic-NOAA I am not sure the issue is on my side but rt.sh has an issue like
>
> @jkbk2004 please give me the path and log file, so I can try to replicate your error.

Path is /scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera/run_control_c48_gnu.log

@RatkoVasic-NOAA
Collaborator Author

> Path is /scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera/run_control_c48_gnu.log

That run directory does not exist anymore:
ll /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2158566/control_c48_gnu

But, I looked in your /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2479708/control_c48_gnu/modulefiles directory:

Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module use /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2479708/control_c48_gnu/modulefiles
Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module load modules.fv3
Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module list

Currently Loaded Modules:
  1) stack-gcc/13.3.0      8) libjpeg/2.1.0      15) zstd/1.5.2              22) fms/2024.01.02        29) ip/4.3.0           36) libxcrypt/4.4.35        43) nccmp/1.9.0.1
  2) gnu/13.3.0            9) jasper/2.0.32      16) c-blosc/1.21.5          23) bacio/2.4.1           30) sp/2.5.0           37) sqlite/3.43.2           44) modules.fv3
  3) openmpi/4.1.6        10) zlib/1.2.13        17) netcdf-c/4.9.2          24) crtm-fix/2.4.0.1_emc  31) w3emc/2.10.0       38) util-linux-uuid/2.38.1
  4) stack-openmpi/4.1.6  11) libpng/1.6.37      18) netcdf-fortran/4.6.1    25) git-lfs/2.10.0        32) gftl/1.10.0        39) python/3.10.13
  5) nghttp2/1.57.0       12) pkg-config/0.27.1  19) parallel-netcdf/1.12.2  26) crtm/2.4.0.1          33) gftl-shared/1.6.1  40) mapl/2.40.3-esmf-8.6.0
  6) curl/8.4.0           13) hdf5/1.14.0        20) parallelio/2.5.10       27) g2/3.5.1              34) fargparse/1.5.0    41) scotch/7.0.4
  7) cmake/3.23.1         14) snappy/1.1.10      21) esmf/8.6.0              28) g2tmpl/1.13.0         35) gettext/0.19.8.1   42) ufs_common

Please run module purge and then repeat those three commands (module use ..., module load ..., module list). Let's see whether it has something to do with your environment.
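
The requested sequence can be sketched as a small script. This is only an illustration: `run` and `repro_module_load` are hypothetical names, the sketch assumes the `module` command is already initialized in the shell, and setting `DRY_RUN=1` prints the commands instead of executing them:

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Replay the module commands from a clean environment.
# $1 is the directory that contains modules.fv3.lua.
repro_module_load() {
  moddir="$1"
  run module purge &&
  run module use "$moddir" &&
  run module load modules.fv3 &&
  run module list
}
```

Running `repro_module_load` with the `modulefiles` directory of a run directory should reproduce the listing above if the environment is not the cause.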

@ulmononian
Collaborator

maybe for another PR, but UFS-WM baselines & input data need to be moved to scratch3/4. also, dprefix, DISKNM, STMP, & PTMP for hera in rt.sh (https://github.com/RatkoVasic-NOAA/ufs-weather-model/blob/e9c789f1c7566fa527038d77190683c480a91cec/tests/rt.sh#L789-L792) should be updated.

@ulmononian
Collaborator

i ran the full RT suite on hera (intel/gnu). everything seemed good (did not hit the issue @jkbk2004 ran into).

@jkbk2004
Collaborator

@RatkoVasic-NOAA @ulmononian can we combine this pr with #2471? It might be more efficient to follow if timelines are ok for scratch3/4 migration and Ursa onboarding.

@RatkoVasic-NOAA
Collaborator Author

> @RatkoVasic-NOAA @ulmononian can we combine this pr with #2471? It might be more efficient to follow if timelines are ok for scratch3/4 migration and Ursa onboarding.

This PR is ready, but I don't know when #2471 will be finished. It may take a long time.

@jkbk2004
Collaborator

> > @RatkoVasic-NOAA @ulmononian can we combine this pr with #2471? It might be more efficient to follow if timelines are ok for scratch3/4 migration and Ursa onboarding.
>
> This PR is ready, but I don't know when #2471 will be finished. It may be long time?

@RatkoVasic-NOAA we need to validate that this PR works with rocoto/ecflow/lmod for everyone. I don't know why I hit the issue when loading the GNU modulefiles: manual loading works, but the issue appears with rt.sh.

Comment thread tests/rt.sh Outdated
DISKNM="/scratch2/NAGAPE/epic/UFS-WM_RT"
STMP="${dprefix}/stmp4"
PTMP="${dprefix}/stmp2"
dprefix="/scratch4/NCEPDEV"
Collaborator

@BruceKropp-Raytheon BruceKropp-Raytheon Mar 12, 2025

would you be willing to use a similar form here, and for all machine cases, the same way it is done for gaeac5:
dprefix=${dprefix:-"/scratch4/NCEPDEV"}

Collaborator Author

@BruceKropp-Raytheon good idea, will do.
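
The suggested form relies on the shell's default-value expansion: `${dprefix:-...}` keeps a value the caller already set and falls back to the default only when `dprefix` is unset or empty. A minimal sketch of the pattern, using the STMP and PTMP lines from the diff above; `set_hera_paths` is a hypothetical wrapper used only to make the example self-contained and is not a function in rt.sh:

```shell
# Hypothetical wrapper (not in rt.sh) around the assignments under review.
set_hera_paths() {
  # Keep a caller-supplied dprefix; otherwise use the Hera default.
  dprefix=${dprefix:-"/scratch4/NCEPDEV"}
  # Derived paths follow whichever prefix won.
  STMP="${dprefix}/stmp4"
  PTMP="${dprefix}/stmp2"
}
```

With this form, a user who sets `dprefix` in the environment before the assignments run gets STMP and PTMP under their own prefix, while everyone else gets the default without changing anything.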

RatkoVasic-NOAA changed the title from "Hera is moving from scratch1,2 to scratch3,4 - modulefiles updated" to "Hera is moving from scratch1,2 to scratch3,4 - Part I (move spack-stack to /contrib)" on Mar 18, 2025
RatkoVasic-NOAA changed the title from "Hera is moving from scratch1,2 to scratch3,4 - Part I (move spack-stack to /contrib)" to "Hera move spack-stack to /contrib" on Mar 31, 2025
RatkoVasic-NOAA deleted the Hera-scratch4 branch on March 31, 2025 15:55
RatkoVasic-NOAA restored the Hera-scratch4 branch on March 31, 2025 15:58
jkbk2004 pushed a commit that referenced this pull request Apr 2, 2025
…art Deux + CCPP updates in fv3atm/ccpp-physics: split physics in two groups, reset GFS_interstitial DDT in CCPP_driver.F90 #2651 + Hera move spack-stack to /contrib #2579 (#2610)

* UFSWM - Hera spack-stack migration to /contrib
  * FV3 - Convert from using blocked data structures to contiguous data structures for the GFS external data types (GFS_diagnostics and GFS_restart).
  * FV3 - CCPP updates in fv3atm/ccpp-physics: split physics in two groups, reset GFS_interstitial DDT in CCPP_driver.F90
    * ccpp-physics - Remove GFS_suite_interstitial_{phys,rad}_reset.* (reset in fv3atm CCPP_driver.F90)
 
---------

Co-authored-by: Ratko Vasic <ratko.vasic@noaa.gov>
Co-authored-by: Dom Heinzeller <dom.heinzeller@noaa.gov>
@jkbk2004
Collaborator

jkbk2004 commented Apr 2, 2025

merged with #2610
