Don't unblock run-level-concurrency-blocked runs in the resolver#37461

Open
silverwind wants to merge 5 commits into go-gitea:main from silverwind:fix-blocked-run-resolver

Conversation

@silverwind
Member

@silverwind silverwind commented Apr 28, 2026

Fixes #37446.

The job-status resolver in checkJobsOfCurrentRunAttempt only considered needs and job-level concurrency when transitioning jobs out of Blocked. When something drove the resolver against a run blocked solely by workflow-level concurrency — for example, a sibling run in the same group entering the queue and triggering EmitJobsIfReadyByRun — the run's job silently became Waiting while another run still held the concurrency group, and the runner could pick it up, defeating the concurrency guarantee.

The fix bails out of the resolver when the run's latest attempt is still blocked by run-level concurrency. checkRunConcurrency re-evaluates when the holding run finishes.
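
A minimal sketch of the early bail-out described above, not the actual Gitea code. The types `Attempt` and `Job` and the functions `heldByAnotherRun` and `resolveJobs` are hypothetical stand-ins invented for illustration; the real resolver works on database models and also handles job-level concurrency, which this sketch omits.

```go
package main

import "fmt"

// Status is a simplified stand-in for the run/job status enum.
type Status string

const (
	Blocked Status = "Blocked"
	Waiting Status = "Waiting"
	Running Status = "Running"
	Success Status = "Success"
)

// Attempt is a hypothetical, trimmed-down run attempt.
type Attempt struct {
	ID               int
	ConcurrencyGroup string
	Status           Status
}

// Job is a hypothetical, trimmed-down job belonging to an attempt.
type Job struct {
	Name   string
	Status Status
	Needs  []string
}

// heldByAnotherRun reports whether a different attempt in the same
// concurrency group is currently Running.
func heldByAnotherRun(a Attempt, all []Attempt) bool {
	if a.ConcurrencyGroup == "" {
		return false
	}
	for _, o := range all {
		if o.ID != a.ID && o.ConcurrencyGroup == a.ConcurrencyGroup && o.Status == Running {
			return true
		}
	}
	return false
}

// resolveJobs moves Blocked jobs to Waiting once their needs have succeeded,
// but returns the jobs unchanged while the attempt itself is still blocked
// by run-level concurrency.
func resolveJobs(a Attempt, all []Attempt, jobs []Job) []Job {
	if a.Status == Blocked && heldByAnotherRun(a, all) {
		return jobs // early bail-out: the group is still held elsewhere
	}
	succeeded := map[string]bool{}
	for _, j := range jobs {
		if j.Status == Success {
			succeeded[j.Name] = true
		}
	}
	out := make([]Job, len(jobs))
	copy(out, jobs)
	for i, j := range out {
		if j.Status != Blocked {
			continue
		}
		ready := true
		for _, n := range j.Needs {
			if !succeeded[n] {
				ready = false
				break
			}
		}
		if ready {
			out[i].Status = Waiting
		}
	}
	return out
}

func main() {
	holder := Attempt{ID: 1, ConcurrencyGroup: "deploy", Status: Running}
	sibling := Attempt{ID: 2, ConcurrencyGroup: "deploy", Status: Blocked}
	jobs := []Job{{Name: "build", Status: Blocked}}

	// While the holder is Running, the sibling's job is left untouched.
	fmt.Println(resolveJobs(sibling, []Attempt{holder, sibling}, jobs)[0].Status)

	// Once the holder has finished, the job may leave Blocked.
	holder.Status = Success
	fmt.Println(resolveJobs(sibling, []Attempt{holder, sibling}, jobs)[0].Status)
}
```

In this sketch the guard runs before any per-job logic, so nothing downstream needs to know about run-level concurrency.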

Covered by a unit test (Test_checkJobsOfCurrentRunAttempt_RunLevelConcurrencyKeepsJobsBlocked in services/actions/job_emitter_test.go) that sets up a Running holder attempt and a Blocked sibling attempt in the same concurrency group directly in the DB, calls checkJobsOfCurrentRunAttempt, and asserts the blocked job stays Blocked. Fails on master, passes with the fix.


This PR was written with the help of Claude Opus 4.7

checkJobsOfCurrentRunAttempt's resolver only considered needs and
job-level concurrency when transitioning jobs out of Blocked. When
something drove the resolver against a run blocked solely by
workflow-level concurrency (for example, a sibling run in the same
group entering the queue and triggering EmitJobsIfReadyByRun), the
run's job silently became Waiting while another run still held the
group, and the runner could pick it up.

Bail out of the resolver when the run's latest attempt is still
blocked by run-level concurrency. checkRunConcurrency re-evaluates
when the holding run finishes.

Fixes go-gitea#37446

Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Apr 28, 2026
@silverwind silverwind requested a review from Zettat123 April 28, 2026 00:09
@silverwind silverwind added topic/gitea-actions related to the actions of Gitea type/bug backport/v1.26 This PR should be backported to Gitea 1.26 labels Apr 28, 2026
@silverwind silverwind requested a review from Copilot April 28, 2026 00:54
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Prevents the job-status resolver from transitioning jobs out of Blocked when the overall run is still blocked by workflow-level (run-level) concurrency, closing a gap where queued resolver runs could defeat concurrency guarantees.

Changes:

  • Add an early-bail in checkJobsOfCurrentRunAttempt when the latest attempt is still blocked by run-level concurrency.
  • Add an integration test ensuring schedule-triggered runs remain blocked (and don’t emit runnable jobs) while a sibling run holds the concurrency group.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
tests/integration/actions_concurrency_test.go | Adds an integration test covering the run-level concurrency-blocked resolver regression scenario.
services/actions/job_emitter.go | Bails out of the job resolver when the run is blocked due to run-level concurrency, preventing jobs from becoming runnable prematurely.

Comment thread tests/integration/actions_concurrency_test.go Outdated
Comment thread services/actions/job_emitter.go
Comment thread tests/integration/actions_concurrency_test.go Outdated
silverwind and others added 2 commits April 28, 2026 03:29
Per review feedback on go-gitea#37461.

Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
The Len(blockedRuns, 1) assertion already proves the surviving schedule
run is not Waiting, so the runner could not pick anything up.

Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Apr 28, 2026
@wxiaoguang
Contributor

I believe it can be tested in a unit test.

TestScheduleConcurrencyBlockedRunStaysBlocked is extremely slow, and it is not clear about the test details.

@wxiaoguang wxiaoguang marked this pull request as draft April 28, 2026 04:28
@silverwind
Member Author

TestScheduleConcurrencyBlockedRunStaysBlocked is extremely slow, and it is not clear about the test details.

Took 2.7s locally. Slow yes, but not extremely.

@wxiaoguang
Contributor

TestScheduleConcurrencyBlockedRunStaysBlocked is extremely slow, and it is not clear about the test details.

Took 2.7s locally. Slow yes, but not extremely.

These tests will be extremely slow in CI, the more added, the slower.

@wxiaoguang
Contributor

wxiaoguang commented Apr 28, 2026

TestScheduleConcurrencyBlockedRunStaysBlocked is extremely slow, and it is not clear about the test details.

Took 2.7s locally. Slow yes, but not extremely.

These tests will be extremely slow in CI, the more added, the slower.

Sharing some findings about CI time: operating on a git repo via the API or web UI is slow because it has to execute Gitea's git hooks. It may be even slower in CI due to limited resources.

To keep CI fast, avoid unnecessary git repo operations via Gitea's API or web UI (and thus Gitea's git hooks) as much as possible:

  • Some tests can be clearly written as unit tests
  • Some git repo operations can be done via git fast-import

I think it could save many minutes if git fast-import is used correctly.

@silverwind
Member Author

Yes, with ~4 times slower CI, we are looking at 12s+, which is borderline.

@wxiaoguang
Contributor

Another concern is that the integration tests are abused.

Function-level logic should be tested clearly in unit tests, including its various edge cases.

Integration tests should focus on "the whole thing works together overall". It is difficult to cover edge cases with them, and an integration test written for a special case is usually not informative (a lot of unrelated code, maintenance burden).
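
To illustrate the point about covering edge cases at the function level, here is a small table-driven sketch. The decision function `canLeaveBlocked` and its three inputs are hypothetical simplifications, not Gitea's API; in a real Go test file the loop would live in a `TestXxx(t *testing.T)` function using `t.Run`.

```go
package main

import "fmt"

// canLeaveBlocked is a hypothetical pure decision function: a job may leave
// Blocked only if the run is not held back by run-level concurrency, all of
// its needs are satisfied, and its job-level concurrency group is free.
func canLeaveBlocked(runHeld, needsSatisfied, jobGroupFree bool) bool {
	return !runHeld && needsSatisfied && jobGroupFree
}

// edgeCase is one row of the test table.
type edgeCase struct {
	name                           string
	runHeld, needsOK, jobGroupFree bool
	want                           bool
}

// edgeCases lists the combinations worth pinning down explicitly.
func edgeCases() []edgeCase {
	return []edgeCase{
		{"everything free", false, true, true, true},
		{"run-level group held by sibling", true, true, true, false},
		{"needs not finished", false, false, true, false},
		{"job-level group held", false, true, false, false},
		{"run held and needs pending", true, false, true, false},
	}
}

func main() {
	for _, tc := range edgeCases() {
		got := canLeaveBlocked(tc.runHeld, tc.needsOK, tc.jobGroupFree)
		status := "ok"
		if got != tc.want {
			status = "FAIL"
		}
		fmt.Printf("%-35s got=%-5v want=%-5v %s\n", tc.name, got, tc.want, status)
	}
}
```

Each row costs one line and runs in microseconds, which is the kind of coverage that is impractical to reach through integration tests.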

@silverwind
Member Author

silverwind commented Apr 28, 2026

Yeah, what can be asserted in a unit test should be. I didn't tell Claude to write an integration test; it decided that itself. Likely a good pointer to add to AGENTS.md.

My attack points for fast CI are:

Per @wxiaoguang's feedback on go-gitea#37461: the run-level concurrency guard
in checkJobsOfCurrentRunAttempt is function-level logic and is better
covered by a unit test. The unit test sets up a Running holder
attempt and a Blocked sibling attempt in the same concurrency group
directly in the DB, calls checkJobsOfCurrentRunAttempt, and asserts
the blocked job stays Blocked. ~0.3s vs ~3.7s for the integration
version, and no API/git-hook overhead.

Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
@silverwind
Member Author

@wxiaoguang good call — replaced the integration test with a unit test in services/actions/job_emitter_test.go (commit dd83ccf). It sets up the Running holder + Blocked sibling attempts in the same concurrency group directly via db.Insert and calls checkJobsOfCurrentRunAttempt on the blocked run. ~0.3s vs ~3.7s for the integration version, no git-hook traffic. Verified it fails on master and passes with the fix.


This response was written with the help of Claude Opus 4.7

@silverwind silverwind marked this pull request as ready for review April 28, 2026 05:04
@bircni bircni requested a review from wxiaoguang May 1, 2026 08:53
@wxiaoguang
Contributor

The new test looks good to me

Labels

backport/v1.26 This PR should be backported to Gitea 1.26 lgtm/need 1 This PR needs approval from one additional maintainer to be merged. topic/gitea-actions related to the actions of Gitea type/bug

Development

Successfully merging this pull request may close these issues.

Flaky test TestScheduleConcurrency

5 participants