Use jinja templates instead of @VARNAME@ in config files #3411

Merged
DavidHuber-NOAA merged 37 commits into NOAA-EMC:develop from aerorahul:feature/jinja-configs on Mar 27, 2025

Conversation

@aerorahul aerorahul commented Feb 28, 2025

Description

This PR:

  • replaces the @VARNAME@ template pattern used in configs (e.g. @MACHINE@ in config.base) with the jinja template pattern {{ VARNAME }} (e.g. {{ MACHINE }}).
  • updates setup_expt.py to remove the custom code that searched for @VARNAME@ and replaced it with values from the experiment yamls. Jinja provides a much cleaner interface between the two.
  • gives configs that are now jinja templates a .j2 extension in the parm/config/<system> directory, e.g. config.base is now named config.base.j2 in parm/config/gfs. It is rendered as config.base in the experiment directory once setup_expt.py completes.
  • uses the --account argument of setup_expt.py instead of replacing ACCOUNT from hosts/host.yaml.
  • organizes hosts/host.yaml into sections for Paths, BQS properties, HPSS properties and Features for that host.
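
As an illustration of the pattern change in the first bullet, here is a minimal sketch using only the Python standard library. It is not the code from this PR: the helper names `to_jinja` and `render_legacy`, the placeholder regex, and the sample config line are all assumptions made for the example. It shows how an @VARNAME@ placeholder maps onto {{ VARNAME }}, and roughly what a hand-written search-and-replace has to do when Jinja is not used.

```python
import re

# Assumed shape of a legacy placeholder: @NAME@ where NAME is upper-case
# letters, digits and underscores. The real configs may differ.
LEGACY_PLACEHOLDER = re.compile(r"@([A-Z][A-Z0-9_]*)@")


def to_jinja(text: str) -> str:
    """Rewrite every @VARNAME@ placeholder as the Jinja expression {{ VARNAME }}."""
    return LEGACY_PLACEHOLDER.sub(r"{{ \1 }}", text)


def render_legacy(text: str, values: dict) -> str:
    """Hypothetical stand-in for a custom search-and-replace renderer."""
    def lookup(match):
        # Raises KeyError when a placeholder has no value, so gaps are not silent.
        return str(values[match.group(1)])
    return LEGACY_PLACEHOLDER.sub(lookup, text)


# Hypothetical line in the style of config.base.
line = 'export machine="@MACHINE@"'
print(to_jinja(line))
print(render_legacy(line, {"MACHINE": "HERCULES"}))
```

With the .j2 templates, the second helper is no longer needed, since a Jinja environment can render {{ MACHINE }} directly from the experiment values.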

In addition, this PR also:

  • improves logging of create_experiment.py, setup_expt.py and setup_xml.py.
  • allows host.py to honor MACHINE_ID set by sourcing detect_machine.sh.
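
The MACHINE_ID behaviour in the last bullet could look something like the sketch below. This is an assumption about the general idea, not the actual host.py code: the function name `resolve_machine`, its arguments, and the upper-casing of the result are hypothetical.

```python
import os


def resolve_machine(detected: str, env=None) -> str:
    """Prefer MACHINE_ID from the environment; otherwise use the detected name.

    `detected` stands for whatever name the host detection logic would have
    returned on its own (hypothetical in this sketch).
    """
    env = os.environ if env is None else env
    machine_id = env.get("MACHINE_ID", "").strip()
    # Assumption: machine names are compared in upper case.
    return (machine_id or detected).upper()
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.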

Resolves #3439

Type of change

  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Experiment directories were created before and after this change and compared. Changes were in the expected places where template patterns were updated.
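
A before/after comparison like the one described above can be scripted. The following is a minimal sketch using the standard-library `filecmp` module; the function name `changed_files` and the directory arguments are assumptions, and this is not necessarily how the comparison was done for this PR.

```python
import filecmp
from pathlib import Path


def changed_files(before: str, after: str) -> list:
    """Return relative paths of files that differ or exist on only one side."""
    changed = []

    def walk(cmp, prefix):
        # diff_files: present in both, contents differ
        # left_only / right_only: present on one side only
        # funny_files: could not be compared
        for name in cmp.diff_files + cmp.left_only + cmp.right_only + cmp.funny_files:
            changed.append(str(prefix / name))
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    # Note: dircmp treats files with identical size and mtime as equal
    # without reading them; otherwise it compares contents.
    walk(filecmp.dircmp(before, after), Path("."))
    return sorted(changed)
```

Calling `changed_files("expdir_before", "expdir_after")` (both paths hypothetical) would list the rendered configs that changed, which can then be checked against the places where template patterns were updated.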

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

…dd workflow/requirements.txt for easy creation of workflow virtual env. ignore the venv in HOMEgfs space. use detect_machine.sh in global-workflow instead of relying on the gfs-utils submodule

@github-advanced-security github-advanced-security AI left a comment

ShellCheck found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Comment thread sorc/link_workflow.sh Outdated
Comment thread workflow/setup_expt.py
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
…ate the method in task.py that validates the system keys for jinja. it should not be necessary, but is here to check
Comment thread workflow/setup_expt.py Outdated
Comment thread workflow/setup_expt.py Outdated
@aerorahul
Contributor Author

I am going to close this PR for now as I work through the unresolved issues.

@aerorahul aerorahul closed this Mar 4, 2025
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules labels Mar 26, 2025
@emcbot

emcbot commented Mar 26, 2025

Checkout Failed on Hercules in Build# 8: Remote call on Hercules-EMC failed

@emcbot emcbot added CI-Hercules-Failed **Bot use only** CI testing on Hercules for this PR has failed and removed CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules labels Mar 26, 2025
@TerrenceMcGuinness-NOAA TerrenceMcGuinness-NOAA added CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules and removed CI-Hercules-Failed **Bot use only** CI testing on Hercules for this PR has failed labels Mar 26, 2025
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules labels Mar 26, 2025
@emcbot

emcbot commented Mar 26, 2025

Build FAILED on Hercules in Build# 9 with error logs:

/work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3411/global-workflow/sorc/logs/gsi_enkf.log

Follow link here to view the contents of the above file(s): (link)

@emcbot emcbot added CI-Hercules-Failed **Bot use only** CI testing on Hercules for this PR has failed and removed CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules labels Mar 26, 2025
@aerorahul
Contributor Author

@TerrenceMcGuinness-NOAA
Looks like the build job timed out on this one on Hercules. Here is the last snippet.

-- Installing: /work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3411/global-workflow/sorc/gsi_enkf.fd/install/include/gsi/stpvismod.mod
-- Installing: /work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3411/global-workflow/sorc/gsi_enkf.fd/install/include/gsi/into3lmod.mod
-- Installing: /work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3411/global-workflow/sorc/gsi_enkf.fd/install/include/gsi/gsi_gustoper.mod
slurmstepd: error: *** JOB 4881370 ON hercules-03-31 CANCELLED AT 2025-03-26T18:19:32 DUE TO TIME LIMIT ***

I am going to reset the label and hope that kicks the CI run off again.

@aerorahul aerorahul added CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules and removed CI-Hercules-Failed **Bot use only** CI testing on Hercules for this PR has failed labels Mar 27, 2025
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules labels Mar 27, 2025
@aerorahul
Contributor Author

Looking at the CI run on WCOSS2, all tests appear to have passed under /lfs/h2/emc/ptmp/emc.global/PR/PR_3411.
I ran rocotostat | grep -v "SUCCEEDED" in each experiment directory and found no failed jobs.
Setting the label CI-Wcoss2-Passed.
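
The manual check described above can be approximated in Python. This is a hedged sketch: the function name `unfinished_tasks` is hypothetical, and it assumes rocotostat prints a header row starting with CYCLE, separator rows made of "=" characters, and one row per task that includes its state. Unlike a plain grep -v, it also drops the header and separator rows so that an empty result means every task succeeded.

```python
def unfinished_tasks(rocotostat_output: str) -> list:
    """Return task rows whose text does not contain SUCCEEDED."""
    pending = []
    for line in rocotostat_output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header, and ===== separator rows
        # (assumed layout of the rocotostat table).
        if not stripped or stripped.startswith("CYCLE") or set(stripped) == {"="}:
            continue
        if "SUCCEEDED" not in stripped:
            pending.append(stripped)
    return pending
```

The captured output of rocotostat for each experiment would be passed in as a string; any returned rows are the jobs that need a closer look.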

@aerorahul aerorahul added CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress labels Mar 27, 2025
@emcbot emcbot added CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully and removed CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress labels Mar 27, 2025
@emcbot

emcbot commented Mar 27, 2025

CI Passed on Hercules in Build# 10
Built and ran in directory /work2/noaa/global/role-global/GFS_CI_CD/HERCULES/CI/3411


Experiment C48_ATM_fdf80467 Completed 1 Cycles: *SUCCESS* at Thu Mar 27 10:00:24 CDT 2025
Experiment C48mx500_hybAOWCDA_fdf80467 Completed 2 Cycles: *SUCCESS* at Thu Mar 27 10:06:19 CDT 2025
Experiment C96mx100_S2S_fdf80467 Completed 1 Cycles: *SUCCESS* at Thu Mar 27 10:12:29 CDT 2025
Experiment C96C48_hybatmDA_fdf80467 Completed 3 Cycles: *SUCCESS* at Thu Mar 27 11:00:40 CDT 2025
Experiment C96_atm3DVar_fdf80467 Completed 3 Cycles: *SUCCESS* at Thu Mar 27 11:18:11 CDT 2025
Experiment C48_S2SW_fdf80467 Completed 1 Cycles: *SUCCESS* at Thu Mar 27 11:56:06 CDT 2025
Experiment C48_S2SWA_gefs_fdf80467 Completed 1 Cycles: *SUCCESS* at Thu Mar 27 12:15:21 CDT 2025
Experiment C48mx500_3DVarAOWCDA_fdf80467 Completed 2 Cycles: *SUCCESS* at Thu Mar 27 14:21:34 CDT 2025

Labels

  • CI-Hera-Passed: **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Hercules-Passed: **Bot use only** CI testing on Hercules for this PR has completed successfully
  • CI-Wcoss2-Passed: CI testing on WCOSS for this PR has completed successfully

Development

Successfully merging this pull request may close these issues.

Add additional configurable variables

7 participants