CSPs sync to GW develop #3483

Merged: DavidHuber-NOAA merged 56 commits into NOAA-EMC:develop from NOAA-EPIC:csp-sync on Apr 3, 2025

Conversation

@weihuang-jedi
Contributor

On AWS, GW uses two partitions: the "compute" partition for forecast models and the few wave components that need more than one node, and the "process" partition for products and other jobs that need only a single node. Recent code changes broke this split, so only the "compute" partition was being used.
The first version of Jenkinsfile4AWS compiled "gfs" and "gefs" in two separate commands; we want to build "gfs", "gefs", and "sfs" together.
Another issue is that crontab files on AWS need to include shell options, but they were written in the wrong order.
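
The intended partition split can be sketched as follows. This is a minimal illustration, not the workflow's actual code: the partition names come from the description above, while the function name `select_partition`, the node-count rule, and the example task names and node counts are assumptions made for this sketch.

```python
# Hypothetical sketch: the partition names are taken from the PR description;
# everything else (function name, node-count rule) is an illustrative assumption.
COMPUTE_PARTITION = "compute"
PROCESS_PARTITION = "process"


def select_partition(nodes: int) -> str:
    """Return the partition for a job that requests `nodes` nodes."""
    if nodes < 1:
        raise ValueError("a job must request at least one node")
    # Multi-node jobs (forecast models, some wave components) go to "compute";
    # single-node jobs (products and others) go to "process".
    return COMPUTE_PARTITION if nodes > 1 else PROCESS_PARTITION


if __name__ == "__main__":
    # Example task names and node counts are made up for illustration.
    for task, nodes in [("fcst", 4), ("products", 1)]:
        print(f"{task}: {select_partition(nodes)}")
```

With a rule like this, the bug described above corresponds to every job resolving to "compute" regardless of its node count.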

Resolves #3482

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO (If YES, please add a link to any PRs that are pending.)
    • EMC verif-global
    • GDAS
    • GFS-utils
    • GSI
    • GSI-monitor
    • GSI-utils
    • UFS-utils
    • UFS-weather-model
    • wxflow

How has this been tested?

  • Clone and build on AWS
  • Clone and test on Hera, Hercules

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@emcbot added the CI-Gaeac5-Building and CI-Hercules-Passed labels and removed the CI-Gaeac5-Ready and CI-Hercules-Running labels on Mar 28, 2025
@emcbot

emcbot commented Mar 28, 2025

CI Passed on Hercules in Build# 3
Built and ran in directory /work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3483


Experiment C48_ATM_c08311e1 Completed 1 Cycles: *SUCCESS* at Thu Mar 27 22:53:01 CDT 2025
Experiment C48mx500_hybAOWCDA_c08311e1 Completed 2 Cycles: *SUCCESS* at Thu Mar 27 23:53:04 CDT 2025
Experiment C96mx100_S2S_c08311e1 Completed 1 Cycles: *SUCCESS* at Fri Mar 28 00:11:30 CDT 2025
Experiment C48_S2SW_c08311e1 Completed 1 Cycles: *SUCCESS* at Fri Mar 28 00:17:55 CDT 2025
Experiment C96_atm3DVar_c08311e1 Completed 3 Cycles: *SUCCESS* at Fri Mar 28 01:23:53 CDT 2025
Experiment C96C48_hybatmDA_c08311e1 Completed 3 Cycles: *SUCCESS* at Fri Mar 28 01:54:28 CDT 2025
Experiment C48mx500_3DVarAOWCDA_c08311e1 Completed 2 Cycles: *SUCCESS* at Fri Mar 28 02:30:55 CDT 2025
Experiment C48_S2SWA_gefs_c08311e1 Completed 1 Cycles: *SUCCESS* at Fri Mar 28 02:32:52 CDT 2025

Contributor

@DavidHuber-NOAA left a comment

Approve.

@DavidHuber-NOAA
Contributor

@weihuang-jedi I see now that there are conflicts in the branch and that the AWS CI did not launch overnight. Can you resolve the conflicts, please? And do you need the AWS CI to run again before merging?

@emcbot

emcbot commented Mar 28, 2025

Experiment C48_S2SWA_gefs FAILED on Gaeac5 in Build# 5 in
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/EXPDIR/C48_S2SWA_gefs_c08311e1

@emcbot

emcbot commented Mar 28, 2025

Experiment C48_ATM FAILED on Gaeac5 in Build# 5 with error logs:

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Mar 28, 2025

Experiment C48_S2SW FAILED on Gaeac5 in Build# 5 with error logs:

/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/COMROOT/C48_S2SW_c08311e1/logs/2021032312/gfs_stage_ic.log
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/COMROOT/C48_S2SW_c08311e1/logs/2021032312/gfs_waveinit.log

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Mar 28, 2025

Experiment C48_ATM FAILED on Gaeac5 in Build# 5 in
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/EXPDIR/C48_ATM_c08311e1

@emcbot

emcbot commented Mar 28, 2025

Experiment C48_S2SW FAILED on Gaeac5 in Build# 5 in
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/EXPDIR/C48_S2SW_c08311e1

@emcbot

emcbot commented Mar 28, 2025

CI Failed on Gaeac5 in Build# 5
Built and ran in directory /gpfs/f5/epic/proj-shared/global/CI/3483


Experiment C48_S2SW_c08311e1 Terminated with 0
FAIL
FAIL tasks failed and 2 dead at Fri 28 Mar 2025 11:00:08 AM EDT
Experiment C48_S2SW_c08311e1 Terminated: *FAIL*
Experiment C48_ATM_c08311e1 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Fri 28 Mar 2025 11:00:14 AM EDT
Experiment C48_ATM_c08311e1 Terminated: *FAIL*
Error logs:
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/COMROOT/C48_S2SW_c08311e1/logs/2021032312/gfs_stage_ic.log
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/COMROOT/C48_S2SW_c08311e1/logs/2021032312/gfs_waveinit.log
Experiment C48_S2SWA_gefs_c08311e1 Terminated with 0 tasks failed and 0 dead at Fri 28 Mar 2025 11:00:24 AM EDT
Experiment C48_S2SWA_gefs_c08311e1 Terminated: *UNKNOWN*
Error logs:
/gpfs/f5/epic/proj-shared/global/CI/3483/RUNTESTS/COMROOT/C48_ATM_c08311e1/logs/2021032312/gfs_stage_ic.log

@weihuang-jedi
Contributor Author

@DavidHuber-NOAA I have resolved the conflict.
Let us wait until the AWS CI testing finishes.
Thanks!

Comment thread ci/Jenkinsfile4AWS Outdated
@DavidHuber-NOAA
Contributor

The build failed on AWS because build_compute.sh now requires an account argument -A <account>.

@weihuang-jedi
Contributor Author

@DavidHuber-NOAA Because build_compute.sh needs the -A account argument, I have to change Jenkinsfile4AWS. However, these new changes cannot be used to start CI testing, so there is no way to start AWS CI testing unless we push this change to the GW EMC repo.
For this reason, can we skip the AWS CI test for this PR?
Thanks,
Wei

@DavidHuber-NOAA
Contributor

Yes, I am fine with skipping AWS CI testing. I'll review this PR.

Labels

  • CI-Awsepicglobalworkflow-Failed: **Bot use only** CI testing on AWS for this PR has failed
  • CI-Hera-Passed: **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Hercules-Passed: **Bot use only** CI testing on Hercules for this PR has completed successfully

Development

Successfully merging this pull request may close these issues.

Fix some AWS out of sync issues.

6 participants