Fix pipeline cancellation status handling and step state synchronization #6011
Also, when Wait() does not return an error, the step does not get canceled in the background either until it has finished?!
Related work: #3850
That's nothing new… There's a summary issue for the whole pipeline canceling: #2875
@qwerty287 I'll work on that now ... will take some time I guess to get through all of it :)
we have not documented pipeline even if it is our core engine that does the heavy lifting ... also the agent-server-rpc stuff is in there but should be a separate thing
…hed and refactored by us a lot and now is mostly ours
@qwerty287 ☝️ updated the code comment and code to make it clearer and more error-resistant
Deployed on our CI: https://ci.woodpecker-ci.org/repos/6106/pipeline/1778/4
@6543 there seems to be another issue: https://ci.woodpecker-ci.org/repos/3780/pipeline/31596 I canceled that manually but it's showing as succeeded
Hmm, that's another issue, unrelated to the agent RPC but to the queue implementation ... would you mind creating an issue?
When a workflow includes service steps, the workflow and pipeline status is permanently stuck as "running" in the database after all steps had finished. completeChildrenIfParentCompleted was called after UpdateWorkflowStatusToDone, so the status calculation still saw service steps as running. The in-memory step state also wasn't updated after the database update. The fix moves child completion before the status calculation and syncs the in-memory state so WorkflowStatus sees the finalized step states. I think this bug was introduced in the refactoring in woodpecker-ci#6011
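The ordering problem described in this commit message can be illustrated with a small Go sketch. It is not the Woodpecker code: the `Step` type, the status constants, and the helpers `completeChildren`, `workflowStatus`, `finishBuggy`, and `finishFixed` are hypothetical stand-ins for completeChildrenIfParentCompleted, WorkflowStatus, and the surrounding logic, and the database is left out entirely.

```go
package main

import "fmt"

// Status is a simplified, hypothetical status value used only in this sketch.
type Status string

const (
	StatusRunning Status = "running"
	StatusSuccess Status = "success"
)

// Step is a hypothetical in-memory step record; the real model has more fields.
type Step struct {
	Name    string
	State   Status
	Service bool
}

// completeChildren marks every still-running step as finished. It stands in
// for the database update plus the in-memory sync mentioned in the text.
func completeChildren(steps []*Step) {
	for _, s := range steps {
		if s.State == StatusRunning {
			s.State = StatusSuccess
		}
	}
}

// workflowStatus reports "running" as long as any step is still running.
func workflowStatus(steps []*Step) Status {
	for _, s := range steps {
		if s.State == StatusRunning {
			return StatusRunning
		}
	}
	return StatusSuccess
}

// newSteps returns a workflow whose normal step is done while the service
// step is still marked as running.
func newSteps() []*Step {
	return []*Step{
		{Name: "build", State: StatusSuccess},
		{Name: "database", State: StatusRunning, Service: true},
	}
}

// finishBuggy calculates the status before the children are completed.
func finishBuggy(steps []*Step) Status {
	st := workflowStatus(steps)
	completeChildren(steps)
	return st
}

// finishFixed completes the children first, then calculates the status.
func finishFixed(steps []*Step) Status {
	completeChildren(steps)
	return workflowStatus(steps)
}

func main() {
	fmt.Println("buggy order:", finishBuggy(newSteps())) // buggy order: running
	fmt.Println("fixed order:", finishFixed(newSteps())) // fixed order: success
}
```

With the status calculated first, the still-running service step keeps the workflow at "running"; completing the children first lets the calculation see only finalized steps.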
Summary
Fixes critical issues with pipeline status updates when pipelines are cancelled or workflows are killed by the agent.
The agent's gRPC client error handling during cancellation was causing incorrect status propagation, leading to cancelled pipelines being marked as either "failed" or "success" instead of "killed".
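As a rough illustration of how a cancellation reason can be kept apart from an ordinary failure, here is a minimal Go sketch built on `context.WithCancelCause()` (Go 1.20+), which the Changes list names. The sentinel errors, the `statusFor` and `demo` helpers, and the status strings are assumptions made for this example and do not come from the PR's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical sentinel causes; the names are made up for this sketch.
var (
	errCanceledByUser = errors.New("pipeline canceled by user")
	errAgentShutdown  = errors.New("agent shutting down")
)

// statusFor maps the way a workflow ended to a status string.
// The mapping itself is an assumption chosen for illustration.
func statusFor(ctx context.Context, runErr error) string {
	switch {
	case errors.Is(context.Cause(ctx), errCanceledByUser):
		return "killed"
	case runErr != nil:
		return "failed"
	default:
		return "success"
	}
}

// demo cancels a fresh context with the given cause (if any) and reports the
// status that statusFor derives from it.
func demo(cause error) string {
	ctx, cancel := context.WithCancelCause(context.Background())
	defer cancel(nil) // release resources; a no-op if already canceled

	if cause != nil {
		cancel(cause)
	}
	// A real runner would return its own error; ctx.Err() stands in for it.
	return statusFor(ctx, ctx.Err())
}

func main() {
	fmt.Println(demo(errCanceledByUser)) // killed
	fmt.Println(demo(errAgentShutdown))  // failed
	fmt.Println(demo(nil))               // success
}
```

Checking `context.Cause(ctx)` rather than only `ctx.Err()` is what makes the distinction possible: `ctx.Err()` is `context.Canceled` for every cancellation, while the cause carries the reason that was passed to `cancel`.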
Problems Fixed
WebUI Screenshot showing working canceled steps & workflow
Changes
Canceled field to gRPC protocol (WorkflowState and StepState) - bumps protocol version to 15
context.WithCancelCause() to distinguish cancellation reasons
Testing
Tested on both local- and docker-backend
read more at #2875
close #833
close #3848
close #2062
close #2911
close #4349
block #6056
block #6039