
Trigger downstream jobs on GSI completion logs instead of full analysis job #4469

Closed
Copilot wants to merge 2 commits into develop from copilot/trigger-jobs-on-logs-data


Copilot AI commented Jan 22, 2026

Description

Jobs downstream of GSI analyses (gdas_anal, gfs_anal, enkfgdas_eobs) currently block on full job completion. Since they only require GSI outputs and bias files, this delays cycle progression unnecessarily.

Changes

Write status log after GSI completes (exglobal_atmos_analysis.sh):

echo "${rCDUMP} ${PDY}${cyc} GSI done at $(date)" > "${COMOUT_ATMOS_ANALYSIS}/${APREFIX}gsi_analysis_status.log"

The write is placed after GSI execution and the data copies to COM, and before diagnostic file processing.
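For readers more comfortable with Python than shell, the marker write can be sketched as follows. This is an illustrative equivalent only, not code from this PR: the function name `write_gsi_status` and its parameters are hypothetical, and the timestamp format is not identical to what the shell `date` command prints.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def write_gsi_status(com_dir, aprefix, rcdump, pdy, cyc, now=None):
    """Write the one-line GSI status marker and return its path.

    Hypothetical helper mirroring the shell `echo ... > file` line above.
    """
    now = now or datetime.now()
    path = Path(com_dir) / f"{aprefix}gsi_analysis_status.log"
    # write_text truncates any existing file, like the shell ">" redirect.
    # The timestamp layout here differs from the default output of `date`.
    path.write_text(f"{rcdump} {pdy}{cyc} GSI done at {now:%c}\n")
    return path


# Example with made-up values; a temporary directory stands in for COM.
with tempfile.TemporaryDirectory() as com_dir:
    marker = write_gsi_status(com_dir, "gdas.t00z.", "gdas", "20260122", "00")
    print(marker.name)
    print(marker.read_text().strip())
```

Because the marker is only created once the preceding steps have run, its existence is what a downstream data dependency can key on.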

Switch three tasks to data dependencies (gfs_tasks.py):

  • sfcanl: Triggers on gsi_analysis_status.log instead of {run}_anal task (non-JEDI only)
  • analcalc: Triggers on gsi_analysis_status.log instead of {run}_anal task (non-JEDI only)
  • vminmon: Triggers on gsi_analysis_status.log instead of {run}_anal task

Example dependency change:

# Before
dep_dict = {'type': 'task', 'name': f'{self.run}_anal'}

# After
analysis_path = self._template_to_rocoto_cycstring(self._base['COM_ATMOS_ANALYSIS_TMPL'])
dep_dict = {'type': 'data', 'data': f'{analysis_path}/{self.run}.t@Hz.gsi_analysis_status.log'}

analdiag remains task-dependent because it requires full diagnostic processing. JEDI workflows are unchanged.
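The selection logic for sfcanl, and the way the `@H` flag in the data path resolves for a given cycle, can be sketched in isolation. This is a minimal stand-in under stated assumptions: it uses plain dicts rather than the workflow's `rocoto.add_dependency`, the functions `sfcanl_dependency` and `expand_cycle_flags` are hypothetical, the COM path is a made-up placeholder, and only the `@Y`, `@m`, `@d`, and `@H` cycle flags are expanded.

```python
from datetime import datetime

STATUS_SUFFIX = "gsi_analysis_status.log"


def sfcanl_dependency(run, analysis_path, use_jedi):
    """Return a plain dependency dict modeled on the sfcanl change (hypothetical helper)."""
    if use_jedi:
        # JEDI path is unchanged: still waits on the task itself.
        return {"type": "task", "name": f"{run}_atmanlfinal"}
    # Non-JEDI path: wait for the status file rather than the whole anal job.
    return {"type": "data", "data": f"{analysis_path}/{run}.t@Hz.{STATUS_SUFFIX}"}


def expand_cycle_flags(pattern, cycle):
    """Expand @Y, @m, @d and @H for one cycle (illustrative subset of the flags)."""
    for flag, fmt in (("@Y", "%Y"), ("@m", "%m"), ("@d", "%d"), ("@H", "%H")):
        pattern = pattern.replace(flag, cycle.strftime(fmt))
    return pattern


cycle = datetime(2026, 1, 22, 6)
# Placeholder path, not the real COM_ATMOS_ANALYSIS_TMPL expansion.
template_path = "/hypothetical/com/gdas.@Y@m@d/@H/analysis/atmos"

dep = sfcanl_dependency("gdas", template_path, use_jedi=False)
resolved = expand_cycle_flags(dep["data"], cycle)
print(resolved)
```

For the 06Z cycle the resolved file name is `gdas.t06z.gsi_analysis_status.log`, which matches the `{run}.t{cyc}z.gsi_analysis_status.log` name listed under the change characteristics, so the file written by the script and the file awaited by the dependency line up.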

Resolves #2912

Type of change

  • New feature (adds functionality)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? YES
    • GFS (new log file: {run}.t{cyc}z.gsi_analysis_status.log)
    • GEFS (new log file: {run}.t{cyc}z.gsi_analysis_status.log)
    • SFS
    • GCAFS
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary
Original prompt

This section details the original issue you should resolve

<issue_title>Trigger jobs downstream of GSI analyses on logs/data, not job dependencies</issue_title>
<issue_description>### What new functionality do you need?

To improve cycling throughput, jobs immediately downstream of gdas_anal, gfs_anal, and enkfgdas_eobs should be triggered based on logs/data rather than waiting for the job to complete.

Acceptance Criteria

  • Jobs are released when log files are written to COM
  • Appropriate logs are written after the completion of the GSI and data copies to COM

Suggest a solution (optional)

For example, currently, on the GDAS cycle, the gdas_analcalc, gdas_sfcanl, gdas_analdiag, and gdas_vminmon jobs depend on gdas_anal. Instead, for the gdas_analcalc, gdas_sfcanl, and gdas_vminmon jobs (but not gdas_analdiag), amend exglobal_atmos_analysis.sh to the following

diff --git a/dev/scripts/exglobal_atmos_analysis.sh b/dev/scripts/exglobal_atmos_analysis.sh
index 34ab78ae0..b3bda0e8a 100755
--- a/dev/scripts/exglobal_atmos_analysis.sh
+++ b/dev/scripts/exglobal_atmos_analysis.sh
@@ -891,6 +891,8 @@ if [[ "${SENDECF}" == "YES" && "${RUN}" != "enkf" ]]; then
     ecflow_client --event release_fcst
 fi
 
+echo "${rCDUMP} ${PDY}${cyc} GSI done at $(date)" > "${COMOUT_ATMOS_ANALYSIS}/${APREFIX}gsi_analysis_status.log"
+
 # Diagnostic files
 # if requested, GSI diagnostic file directories for use later
 if [[ "${GENDIAG}" == "YES" ]]; then

And then gfs_tasks.py (for the sfcanl job) to

diff --git a/dev/workflow/rocoto/gfs_tasks.py b/dev/workflow/rocoto/gfs_tasks.py
index 2cbbf0096..b522e28e3 100644
--- a/dev/workflow/rocoto/gfs_tasks.py
+++ b/dev/workflow/rocoto/gfs_tasks.py
@@ -298,7 +298,8 @@ class GFSTasks(Tasks):
         if self.options['do_jediatmvar']:
             dep_dict = {'type': 'task', 'name': f'{self.run}_atmanlfinal'}
         else:
-            dep_dict = {'type': 'task', 'name': f'{self.run}_anal'}
+            analysis_path = self._template_to_rocoto_cycstring(self._base['COM_ATMOS_ANALYSIS_TMPL'])
+            dep_dict = {'type': 'data', 'data': f'{analysis_path}/{self.run}.t@Hz.gsi_analysis_status.log'}
         deps.append(rocoto.add_dependency(dep_dict))
         if self.options['do_jedisnowda']:
             dep_dict = {'type': 'task', 'name': f'{self.run}_snowanl'}
```</issue_description>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>


Co-authored-by: DavidHuber-NOAA <69919478+DavidHuber-NOAA@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Trigger downstream jobs on log/data availability" to "Trigger downstream jobs on GSI completion logs instead of full analysis job" on Jan 22, 2026
Copilot AI requested a review from DavidHuber-NOAA on January 22, 2026 21:41
@DavidHuber-NOAA

Combined with #4487

DavidHuber-NOAA deleted the copilot/trigger-jobs-on-logs-data branch on February 2, 2026 19:32