Update for JCB policies and stage DA job files with Jinja2-templates #2700

Merged
aerorahul merged 91 commits into NOAA-EMC:develop from RussTreadon-NOAA:feature/rename_atm on Jul 1, 2024
Conversation

@RussTreadon-NOAA
Contributor

@RussTreadon-NOAA RussTreadon-NOAA commented Jun 18, 2024

Description

This PR updates the gdas.cd hash to bring in new JCB conventions.
Resolves #2699

From #2654
This PR moves much of the staging code that currently runs in the Python initialization subroutines of the variational and ensemble DA jobs into Jinja2-templated YAML files, which are then passed to the wxflow file handler. Much of the staging is already done this way; this PR simply expands that strategy.

The old Python routines that did this staging have been removed. This is part of a broader refactoring of the pygfs tasking.
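
As a rough illustration of the staging strategy described above, the sketch below renders a templated staging description and then hands it to a small handler that creates directories and copies files. It is not the wxflow API: `render`, `stage`, `demo`, the `mkdir` and `copy` keys, and the `DATA` and `FIXDIR` variables are all hypothetical names chosen for the example, and `string.Template` with a plain dict stands in for Jinja2-templated YAML so that the sketch needs only the standard library.

```python
import os
import shutil
import tempfile
from string import Template

# Hypothetical staging description. The real files are Jinja2-templated YAML;
# string.Template and a plain dict are stand-ins to keep this sketch stdlib-only.
STAGE_TEMPLATE = {
    "mkdir": ["$DATA/obs", "$DATA/fix"],
    "copy": [["$FIXDIR/example.yaml", "$DATA/fix/example.yaml"]],
}


def render(node, context):
    """Recursively substitute $NAME placeholders in strings, lists and dicts."""
    if isinstance(node, str):
        return Template(node).substitute(context)
    if isinstance(node, list):
        return [render(item, context) for item in node]
    if isinstance(node, dict):
        return {key: render(value, context) for key, value in node.items()}
    return node


def stage(config):
    """Create every directory under 'mkdir', then copy every [src, dest] pair under 'copy'."""
    for path in config.get("mkdir", []):
        os.makedirs(path, exist_ok=True)
    for src, dest in config.get("copy", []):
        shutil.copy(src, dest)


def demo():
    """Build a throwaway source file, render the template, and stage it."""
    root = tempfile.mkdtemp()
    fixdir = os.path.join(root, "fixsrc")
    os.makedirs(fixdir)
    with open(os.path.join(fixdir, "example.yaml"), "w") as handle:
        handle.write("key: value\n")
    context = {"DATA": os.path.join(root, "rundir"), "FIXDIR": fixdir}
    config = render(STAGE_TEMPLATE, context)
    stage(config)
    return config
```

The point of the design is that the list of directories and files lives in data rather than in job-specific Python, so adding a file to stage means editing a template instead of a subroutine.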

wxflow PR #30 is a companion to this PR.

Type of change

  • Maintenance (update gdas.cd hash)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO

How has this been tested?

  • Cycled test on Dogwood, Hera, & Hercules

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code

danholdaway and others added 30 commits May 29, 2024 14:35
* upsteam/develop:
  Archiving cleanup (NOAA-EMC#2621)
  Switch to Rocky 9 built external packages on Hercules (NOAA-EMC#2608)
  Add the capability to use slurm reservation nodes (NOAA-EMC#2627)
  Update forecast job to use COMIN/COMOUT (NOAA-EMC#2622)
  Update to add 1-deg global wave grid (NOAA-EMC#2619)
  Add C384mx025_3DVarAOWCDA yamls (NOAA-EMC#2625)
  Script to keep Jenkins Agent persistent from cron (NOAA-EMC#2634)
* upsteam/develop:
  Update ufs-weather-model  (NOAA-EMC#2646)
  Update wmo parm files to fix WMO header (NOAA-EMC#2652)
  Add IAU to snow DA (and its test) (NOAA-EMC#2610)
@aerorahul
Contributor

I believe we have multiple, compounding errors/issues here. The forecast failure has a different cause from the other failures (which are still likely due to the cleanup tasks).

The cleanup has been cleaned up in this test. So that cannot be it.

@CoryMartin-NOAA
Contributor

@aerorahul agreed, I meant that I bet the non-S2S tests will pass now (or fail in a different place)

@JessicaMeixner-NOAA
Contributor

I think the WCDA test failure is likely related to #2681

@guillaumevernieres @AndrewEichmann-NOAA and myself have all been looking into this. I don't know if @AndrewEichmann-NOAA has updates yet on testing with PR #2681 completely reverted or not yet. I am building an update to that PR for testing but was hoping to get results from Andy first before hitting go on those.

@guillaumevernieres
Contributor

> I think the WCDA test failure is likely related to #2681
>
> @guillaumevernieres @AndrewEichmann-NOAA and myself have all been looking into this. I don't know if @AndrewEichmann-NOAA has updates yet on testing with PR #2681 completely reverted or not yet. I am building an update to that PR for testing but was hoping to get results from Andy first before hitting go on those.

Reverting the missing value back to 0 fixes the issue. I thought I tested this properly, but apparently not.

@RussTreadon-NOAA
Contributor Author

RussTreadon-NOAA commented Jun 28, 2024

Thank you @guillaumevernieres . I resubmitted the failed gdasfcst on Hera with the following change to parm/config/gfs/config.ufs

@@ -401,7 +401,7 @@ if [[ "${skip_mom6}" == "false" ]]; then
   export cplflx=".true."
   model_list="${model_list}.ocean"
   nthreads_mom6=1
-  MOM6_DIAG_MISVAL="-1e34"
+  MOM6_DIAG_MISVAL="0.0"
   case "${mom6_res}" in
     "500")
       ntasks_mom6=8

This change is not sufficient or, more likely, I did not make the correct change. The WCDA gdasfcst still aborted. I now see that the PR #2681 change to config.ufs is more involved. Let me wait for the experts to chime in.

Comment thread scripts/exglobal_cleanup.sh Outdated
@emcbot

emcbot commented Jun 28, 2024

Experiment C48_S2SWA_gefs FAILED on Hera in
/scratch1/NCEPDEV/global/CI/2700/RUNTESTS/C48_S2SWA_gefs_c1ef4b30

Contributor Author

@RussTreadon-NOAA RussTreadon-NOAA left a comment

While the change is fine, do we want to comment out the block (current approach) or simply remove all the commented out scripting?

@aerorahul
Contributor

aerorahul commented Jun 28, 2024

> While the change is fine, do we want to comment out the block (current approach) or simply remove all the commented out scripting?

We can remove it and replace it at a later time. I leave it to you.
GitHub is experiencing issues in processing Pull Requests, so we are just waiting for it to come back, so this branch can be updated with develop and the CI can be kicked off.

@RussTreadon-NOAA
Contributor Author

Given github issues I will leave exglobal_cleanup.sh alone.

@aerorahul aerorahul added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jun 28, 2024
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera labels Jun 28, 2024
@RussTreadon-NOAA
Contributor Author

Run C48mx500_3DVarAOWCDA from RussTreadon-NOAA:feature/rename_atm at dcec081. 20210324 18Z gdasfcst successfully completed.

@emcbot emcbot added CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Jun 28, 2024
@TerrenceMcGuinness-NOAA
Collaborator

TerrenceMcGuinness-NOAA commented Jun 30, 2024

I'm sorry to say there was a stack overflow error on the Jenkins agent on Hera in the script that was monitoring the CI tests. The good news is that all tests passed; we were just not able to report the success back through the automated system:

Terry.McGuinness (hfe06) RUNTESTS $ pwd
/scratch1/NCEPDEV/global/CI/2700/RUNTESTS
Terry.McGuinness (hfe06) RUNTESTS $ cat ci-run_check.log     
Experiment C48_ATM_dcec0813 Completed 1 Cycles: *SUCCESS* at Fri Jun 28 22:38:26 UTC 2024
Experiment C48mx500_3DVarAOWCDA_dcec0813 Completed 2 Cycles: *SUCCESS* at Fri Jun 28 23:02:49 UTC 2024
Experiment C48_S2SW_dcec0813 Completed 1 Cycles: *SUCCESS* at Sat Jun 29 00:26:49 UTC 2024
Experiment C96_atm3DVar_dcec0813 Completed 3 Cycles: *SUCCESS* at Sat Jun 29 00:34:14 UTC 2024
Experiment C96C48_hybatmDA_dcec0813 Completed 3 Cycles: *SUCCESS* at Sat Jun 29 00:40:20 UTC 2024
Experiment C96_atmaerosnowDA_dcec0813 Completed 3 Cycles: *SUCCESS* at Sat Jun 29 00:52:30 UTC 2024
Experiment C48_S2SWA_gefs_dcec0813 Completed 1 Cycles: *SUCCESS* at Sat Jun 29 13:04:08 UTC 2024

Setting the label to PASSED by hand.
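
A check like the manual one above could be scripted. The sketch below parses lines in the format shown in ci-run_check.log and reports whether every experiment succeeded. The function names `parse_check_log` and `all_passed` are hypothetical, and only the `*SUCCESS*` line format appears in this thread, so the assumption that other outcomes use the same layout with a different status word is a guess.

```python
import re

# Matches lines such as:
# Experiment C48_ATM_dcec0813 Completed 1 Cycles: *SUCCESS* at Fri Jun 28 22:38:26 UTC 2024
LINE_RE = re.compile(
    r"^Experiment (?P<name>\S+) Completed (?P<cycles>\d+) Cycles: "
    r"\*(?P<status>[A-Z]+)\* at (?P<when>.+)$"
)


def parse_check_log(text):
    """Return a list of (experiment, cycles, status) tuples; unmatched lines are skipped."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append((match["name"], int(match["cycles"]), match["status"]))
    return results


def all_passed(results):
    """True only if at least one experiment was found and every status is SUCCESS."""
    return bool(results) and all(status == "SUCCESS" for _, _, status in results)


# Two lines copied from the log above, used as sample input.
SAMPLE = """\
Experiment C48_ATM_dcec0813 Completed 1 Cycles: *SUCCESS* at Fri Jun 28 22:38:26 UTC 2024
Experiment C48mx500_3DVarAOWCDA_dcec0813 Completed 2 Cycles: *SUCCESS* at Fri Jun 28 23:02:49 UTC 2024
"""

if __name__ == "__main__":
    parsed = parse_check_log(SAMPLE)
    for name, cycles, status in parsed:
        print(f"{name}: {cycles} cycle(s), {status}")
    print("all passed:", all_passed(parsed))
```

Treating an empty log as a failure is deliberate: a monitoring script that crashed before writing anything should not be read as a pass.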

@TerrenceMcGuinness-NOAA TerrenceMcGuinness-NOAA added CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jun 30, 2024
@RussTreadon-NOAA
Contributor Author

Thank you @TerrenceMcGuinness-NOAA for the update. I was wondering what had happened. Great to hear that all tests passed on Hera.

@aerorahul aerorahul merged commit de87067 into NOAA-EMC:develop Jul 1, 2024
@RussTreadon-NOAA
Contributor Author

Thank you @aerorahul , @WalterKolczynski-NOAA , and @TerrenceMcGuinness-NOAA for persistently working to get this PR into develop.

@RussTreadon-NOAA RussTreadon-NOAA deleted the feature/rename_atm branch July 1, 2024 13:37
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this pull request Jul 1, 2024
…ving

* origin/develop:
  Update for JCB policies and stage DA job files with Jinja2-templates (NOAA-EMC#2700)
  Revert PR 2681 (NOAA-EMC#2739)
Labels

  • CI-Hera-Passed: **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Hercules-Passed: **Bot use only** CI testing on Hercules for this PR has completed successfully
  • CI-Wcoss2-Passed: CI testing on WCOSS for this PR has completed successfully

Development

Successfully merging this pull request may close these issues.

Update GDASApp hash to bring in new JCB policies