Rework gsi diagnostic file handling through DATAROOT storage #4487

Merged
DavidHuber-NOAA merged 25 commits into NOAA-EMC:develop from DavidHuber-NOAA:feature/rework_diags on Feb 9, 2026

@DavidHuber-NOAA
Contributor

@DavidHuber-NOAA DavidHuber-NOAA commented Jan 28, 2026

Description

This makes use of a COM-like directory structure in DATAROOT to stage the gsi diagnostic files for downstream processing by ${RUN}_(e)diag jobs. Rather than copying the files to COM, they are moved to pCOM. This should greatly improve runtimes of the *_anal, enkfgdas_eobs, and ${RUN}_(e)diag jobs. The diagnostic output is not required for downstream applications or future cycles, so it will no longer be copied to COM.
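
The difference between copying to `COM` and moving into a staging area can be sketched as below. This is a minimal illustration, not the code in this PR: the function name `stage_diags`, the `diag_*` file pattern, and the directory arguments are assumptions. Only the idea of moving files into a `pCOM`-style directory under `DATAROOT` comes from the description above.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: move diagnostic files from a job's run directory
# into a staging directory instead of copying them.
# When source and destination are on the same filesystem, mv is a rename
# and does not rewrite file contents, unlike cp.
stage_diags() {
    local run_dir="$1"     # where the analysis wrote its diag files (assumed)
    local stage_dir="$2"   # assumed pCOM-style directory under DATAROOT
    local f
    mkdir -p "${stage_dir}"
    for f in "${run_dir}"/diag_*; do
        [[ -e "${f}" ]] || continue   # skip the literal pattern when nothing matches
        mv "${f}" "${stage_dir}/"
    done
}
```

A call such as `stage_diags "${DATA}" "${pCOMOUT_ATMOS_ANALYSIS}"` would be the intended use, assuming both variables are set by the calling job.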

Refs #4479

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs: NO
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

C96C48_hybatmDA case on Gaea C6.
A full suite of tests should also be run.

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added

This comment was marked as resolved.

@CatherineThomas-NOAA
Contributor

@DavidHuber-NOAA - I ran your branch at full res to test the timing and the gdas_anal job ran in 44 minutes. This is great; one of the retro streams analysis jobs that ran around the same time ran in 1 hr 10 min. 44 minutes is also in line with the run times we were getting before the gsi diag change was introduced.

One issue I ran into was that the eobs job failed:

/lfs/h2/emc/da/noscrub/catherine.thomas/git/global-workflow/dave/scripts/exglobal_atmos_analysis.sh: line 901: pCOMOUT_ATMOS_ANALYSIS: unbound variable

The eobs job executes the exglobal_atmos_analysis.sh script as well, but doesn't execute JGLOBAL_ATMOS_ANALYSIS, so pCOMOUT_ATMOS_ANALYSIS is undefined in this job.
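
The error is the usual one under `set -u`: a script shared by two jobs references a variable that only one job's J-job defines. A minimal sketch of a guard follows. It is not the fix that was applied in this PR, and `resolve_diag_dir` and its fallback argument are hypothetical.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical guard: use pCOMOUT_ATMOS_ANALYSIS when the calling job
# defined it, otherwise fall back to a directory supplied by the caller.
resolve_diag_dir() {
    local fallback="$1"
    # ${var:-default} expands safely under `set -u` even when var is unset.
    printf '%s\n' "${pCOMOUT_ATMOS_ANALYSIS:-${fallback}}"
}
```

The alternative is to define the variable in every J-job that runs the shared script, which keeps the script itself free of fallbacks.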

@DavidHuber-NOAA
Contributor Author

Converting back to draft while I work through additional issues.

@DavidHuber-NOAA DavidHuber-NOAA marked this pull request as draft January 30, 2026 20:35
@DavidHuber-NOAA
Contributor Author

All tests passed on Ursa. Opening for review.

@DavidHuber-NOAA DavidHuber-NOAA added the CI-Ursa-Passed (cm) Manual CI passed on Ursa label Feb 4, 2026
@CatherineThomas-NOAA
Contributor

@DavidHuber-NOAA - Are there logs/COM from the Ursa tests that I could poke around in?

@DavidHuber-NOAA
Contributor Author

> @DavidHuber-NOAA - Are there logs/COM from the Ursa tests that I could poke around in?

Yes, please take a look here: /scratch3/NCEPDEV/stmp/David.Huber/rt_gsidiags/COMROOT.

I will note that Ursa was having issues with slow I/O on /scratch3, so there were a few rocotoboot calls with extended wallclocks.

@CatherineThomas-NOAA
Contributor

I ran with this branch at full resolution on WCOSS2 for 2 full cycles:

  • Results were identical with the control
  • Radstat, cnvstat, and oznstat tarballs were untarred and all contained diags were also identical with the control's
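
A check of this kind can be sketched as follows, assuming the two tarballs are ordinary tar files whose members can be compared byte for byte. The helper name `compare_tarballs` is hypothetical and this is not necessarily how the comparison above was done.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: extract two tarballs into scratch directories and
# compare every member byte for byte. Returns 0 when the contents match.
compare_tarballs() {
    local control="$1" experiment="$2"
    local work rc=0
    work="$(mktemp -d)"
    mkdir -p "${work}/ctl" "${work}/exp"
    tar -xf "${control}" -C "${work}/ctl"
    tar -xf "${experiment}" -C "${work}/exp"
    # diff -r recurses into directories; -q only reports which files differ.
    diff -rq "${work}/ctl" "${work}/exp" || rc=$?
    rm -rf "${work}"
    return "${rc}"
}
```

It would be called once per tarball pair, for example on the control and experiment radstat files.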

Timings for impacted jobs were compared:

  • gdas_anal was reduced from 1 hr 04 min to 44 min - a reduction of 20 minutes (!)
  • gdas_analdiag was reduced from 4 to 3 minutes
  • enkfgdas_eobs was reduced from 12 to 9 minutes
  • enkfgdas_ediag was reduced from 3 to 2.5 minutes
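
For a relative view of the timings listed above, the reductions can be worked out as percentages. The sketch below uses only the numbers reported here, converted to seconds; `percent_reduction` is a hypothetical helper.

```shell
#!/usr/bin/env bash
set -eu

# Integer percent reduction between two durations given in seconds.
# Bash arithmetic truncates, so 31.25% is reported as 31.
percent_reduction() {
    local before="$1" after="$2"
    echo $(( (before - after) * 100 / before ))
}

percent_reduction 3840 2640   # gdas_anal: 64 min -> 44 min
percent_reduction 240 180     # gdas_analdiag: 4 min -> 3 min
percent_reduction 720 540     # enkfgdas_eobs: 12 min -> 9 min
percent_reduction 180 150     # enkfgdas_ediag: 3 min -> 2.5 min
```

With truncation to whole percent, the four jobs come out at 31, 25, 25, and 16 percent respectively.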

This PR is working as intended - no change to results and handling the gsidiags in a more efficient way. Thanks @DavidHuber-NOAA!

@DavidHuber-NOAA
Contributor Author

Merging. I will rework this slightly in another PR to create dir.* links instead of moving at the end of the job.
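
One possible reading of the link-based follow-up is sketched below. It is speculative: the follow-up PR is not part of this conversation, and the helper name `link_task_dirs`, the per-task `dir.NNNN` naming with four-digit padding, and the directory arguments are all assumptions.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: pre-create dir.NNNN symlinks in the run directory
# that point into a staging directory, so files written through the links
# land in the staging area and no move is needed at the end of the job.
link_task_dirs() {
    local run_dir="$1" stage_dir="$2" ntasks="$3"
    local i name
    for (( i = 0; i < ntasks; i++ )); do
        name="$(printf 'dir.%04d' "${i}")"   # four-digit padding is assumed
        mkdir -p "${stage_dir}/${name}"
        ln -sfn "${stage_dir}/${name}" "${run_dir}/${name}"
    done
}
```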

@DavidHuber-NOAA DavidHuber-NOAA merged commit 4a26a69 into NOAA-EMC:develop Feb 9, 2026
4 of 5 checks passed
JessicaMeixner-NOAA pushed a commit to JessicaMeixner-NOAA/global-workflow that referenced this pull request Feb 9, 2026
…C#4487)

This makes use of a `COM`-like directory structure in `DATAROOT` to
stage the gsi diagnostic files for downstream processing by
`${RUN}_(e)diag` jobs. Rather than copying the files to `COM`, they are
moved to `pCOM`. This should greatly improve runtimes of the `*_anal`,
`enkfgdas_eobs`, and `${RUN}_(e)diag` jobs. The diagnostic output is not
required for downstream applications or future cycles, so it will no
longer be copied to `COM`.
@DavidHuber-NOAA DavidHuber-NOAA deleted the feature/rework_diags branch February 9, 2026 14:17
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this pull request Feb 10, 2026
CatherineThomas-NOAA pushed a commit that referenced this pull request Feb 11, 2026
This will bring in PRs #4476 and #4487 to the dev/gfs.v17 branch to
enable the use of DATAROOT space for gsi diagnostic files, which allows
easy use of links rather than sending a tarball to COM.

Labels

CI-Ursa-Passed (cm) Manual CI passed on Ursa

3 participants