local backend: on linux / mac start commands in own process group and kill the group on cancel #6609

Merged
6543 merged 4 commits into woodpecker-ci:main from 6543-forks:local-backend_dont-let-commands-kill-agent
May 26, 2026

@6543 6543 commented May 18, 2026

Issue

My workflows were getting canceled:
[image]

Looking at what canceled them:
[image]
It was the `make` that the local agent executed!

Reason

By default, a command started with exec.CommandContext runs in the same process group as the agent, so when make signals its process group, the agent is killed too.
I also discovered that sub-processes of the command were not killed on cancel, so now we start the command in a new process group and kill that group as a whole.

@6543 6543 added bug Something isn't working backend/local labels May 18, 2026
@6543 6543 changed the title local backend: start commands in own process group and kill the group on cancel local backend: on linux / mac start commands in own process group and kill the group on cancel May 18, 2026

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 81.25000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.62%. Comparing base (b9b3538) to head (9399b31).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pipeline/backend/local/clone.go | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6609      +/-   ##
==========================================
+ Coverage   41.57%   41.62%   +0.05%     
==========================================
  Files         432      433       +1     
  Lines       28802    28813      +11     
==========================================
+ Hits        11974    11994      +20     
+ Misses      15750    15744       -6     
+ Partials     1078     1075       -3     

Cover the two behaviors the cmd_unix.go cancel/Setpgid changes enable:

- TestStepInOwnProcessGroup asserts the step shell runs in its own
  process group, so signals like 'make -j' sending SIGTERM to its
  pgrp via 'kill 0' cannot reach the agent.
- TestStepCancelKillsGrandchildren asserts that cancelling the step
  context also kills processes spawned by the step shell, instead of
  leaving them as orphans on the host.

Both fail against main and pass with the cmd_unix.go fix applied.

6543 commented May 18, 2026

Added tests via clause and confirmed: without my patch, the run fails with:

{"level":"trace","taskUUID":"test-task-uuid-123","time":"2026-05-18T12:30:07+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-destroy-task","time":"2026-05-18T12:30:07+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-destroy-task","time":"2026-05-18T12:30:07+02:00","message":"delete workflow environment"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:07+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:07+02:00","message":"start step test-step"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"wait for step test-step"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"start step altshell"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"wait for step altshell"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"start step fail-step"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"wait for step fail-step"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"wait for step missing"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"start step test-plugin"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"start step test-unsupported"}
{"level":"trace","taskUUID":"test-run-tasks","time":"2026-05-18T12:30:08+02:00","message":"delete workflow environment"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:08+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:08+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:08+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-pgrp-isolation","time":"2026-05-18T12:30:08+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-pgrp-isolation","time":"2026-05-18T12:30:08+02:00","message":"start step pgrp"}
{"level":"trace","taskUUID":"test-pgrp-isolation","time":"2026-05-18T12:30:08+02:00","message":"delete workflow environment"}
--- FAIL: TestStepInOwnProcessGroup (0.00s)
    process_group_test.go:100: 
        	Error Trace:	/home/maddl/git/own/woodpecker/pipeline/backend/local/process_group_test.go:100
        	Error:      	Should not be: 113727
        	Test:       	TestStepInOwnProcessGroup
        	Messages:   	step shell shares process group with agent (pgid=113727); signals from the step would reach the agent
    process_group_test.go:105: 
        	Error Trace:	/home/maddl/git/own/woodpecker/pipeline/backend/local/process_group_test.go:105
        	Error:      	Not equal: 
        	            	expected: 113867
        	            	actual  : 113727
        	Test:       	TestStepInOwnProcessGroup
        	Messages:   	step shell is not the leader of its own process group (pid=113867, pgid=113727)
{"level":"trace","taskUUID":"test-cancel-grandchild","time":"2026-05-18T12:30:08+02:00","message":"create workflow environment"}
{"level":"trace","taskUUID":"test-cancel-grandchild","time":"2026-05-18T12:30:08+02:00","message":"start step grandchild"}
{"level":"trace","taskUUID":"test-cancel-grandchild","time":"2026-05-18T12:30:08+02:00","message":"wait for step grandchild"}
{"level":"trace","taskUUID":"test-cancel-grandchild","time":"2026-05-18T12:30:11+02:00","message":"delete workflow environment"}
--- FAIL: TestStepCancelKillsGrandchildren (3.02s)
    process_group_test.go:172: 
        	Error Trace:	/home/maddl/git/own/woodpecker/pipeline/backend/local/process_group_test.go:172
        	Error:      	Condition never satisfied
        	Test:       	TestStepCancelKillsGrandchildren
        	Messages:   	grandchild pid 113869 is still alive after step cancel; cancel did not propagate to the process group
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-1-0"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-1-0"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-1-1"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-1-1"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-1-2"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-1-2"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-2-0"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-2-0"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-2-1"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-2-1"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-2-2"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-2-2"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-3-0"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-3-0"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-3-1"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-3-1"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"start step step-name-task-3-2"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"wait for step step-name-task-3-2"}
{"level":"trace","taskUUID":"task-1","time":"2026-05-18T12:30:11+02:00","message":"delete workflow environment"}
{"level":"trace","taskUUID":"task-2","time":"2026-05-18T12:30:11+02:00","message":"delete workflow environment"}
{"level":"trace","taskUUID":"task-3","time":"2026-05-18T12:30:11+02:00","message":"delete workflow environment"}
FAIL
coverage: 57.2% of statements
FAIL	go.woodpecker-ci.org/woodpecker/v3/pipeline/backend/local	3.270s
FAIL

@6543 6543 requested a review from a team May 18, 2026 19:15

@qwerty287 qwerty287 left a comment

Untested. I also don't like having a test file process_group_test.go without a source file process_group.go, but I see it's not easy to find a good name here.

@6543 6543 merged commit d37ab38 into woodpecker-ci:main May 26, 2026
9 checks passed
@6543 6543 deleted the local-backend_dont-let-commands-kill-agent branch May 26, 2026 12:44
@woodpecker-bot woodpecker-bot mentioned this pull request May 26, 2026
1 task