
Fix workflow being skipped and marked as failed when agent starts before server#6361

Merged
6543 merged 1 commit into
woodpecker-ci:mainfrom
6543-forks:agent-failing-after-retry
Mar 31, 2026


Member

@6543 6543 commented Mar 31, 2026

Reported in #6355 (comment).

When the agent started before the server was available, it retried the connection as expected. However, once the server came up and a workflow was picked up, the pipeline failed immediately without running any steps: the agent logs showed "workflow context done" firing right after "received execution".

The root cause was a package-level shutdownCtx shared across retry iterations. On each failed attempt, stopAgentFunc stamped it with a 5-second timeout, which started the clock immediately. By the time the agent successfully connected and received a workflow, that timeout had already expired. Because workflowCtx was derived from this expired context, execution failed before Docker even started a container.

The fix removes the global mutable shutdown context and the stopAgentFunc indirection. runner.Run() no longer accepts a shutdownCtx parameter; it creates a fresh one locally, only when needed for the Done() fallback call. The healthcheck server shutdown does the same. This makes the lifetime of each shutdown window explicit and local.


codecov Bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.19%. Comparing base (2de5962) to head (41e267a).
⚠️ Report is 1 commits behind head on main.

Files with missing lines   Patch %   Lines
agent/runner.go            0.00%     7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6361      +/-   ##
==========================================
- Coverage   34.20%   34.19%   -0.01%     
==========================================
  Files         426      426              
  Lines       28537    28541       +4     
==========================================
  Hits         9760     9760              
- Misses      17889    17893       +4     
  Partials      888      888              


@6543 6543 requested a review from a team March 31, 2026 17:14
@6543 6543 merged commit a7739a2 into woodpecker-ci:main Mar 31, 2026
7 of 9 checks passed
@6543 6543 deleted the agent-failing-after-retry branch March 31, 2026 17:56
This was referenced Mar 31, 2026
Member Author

6543 commented Apr 4, 2026

By the way, one of the bugs was actually a regression from #6021, but with this refactoring we fixed more than that and prevented similar issues :)

@woodpecker-bot woodpecker-bot mentioned this pull request Apr 15, 2026
1 task
@6543 6543 changed the title Fix workflow beeing skipped and marked as failed when agent starts before server Fix workflow being skipped and marked as failed when agent starts before server Apr 19, 2026
@woodpecker-bot woodpecker-bot mentioned this pull request Apr 27, 2026
1 task

Labels

agent bug Something isn't working
