Skip to content

Fix cancel race: disarm safety-net timer before QA finalization#6934

Merged
Jasonnnz merged 1 commit into
feature/qa-video-automationfrom
swarm/qa-video-auto/task-16
Feb 23, 2026
Merged

Fix cancel race: disarm safety-net timer before QA finalization#6934
Jasonnnz merged 1 commit into
feature/qa-video-automationfrom
swarm/qa-video-auto/task-16

Conversation

@Jasonnnz
Copy link
Copy Markdown
Contributor

@Jasonnnz Jasonnnz commented Feb 23, 2026

The 2-second cancel safety-net timer could fire before finalizeQARecording() completes on slow paths, sending cu_session_abort before cu_session_finalized. The daemon's handleCuSessionAbort deletes cuSessionMetadata, so the later cu_session_finalized can't inject the summary. Fix: store the safety-net Task and cancel it when run() reaches the finalization code path, since run() handles finalization + abort itself.


Open with Devin

Co-Authored-By: Claude <noreply@anthropic.com>
@Jasonnnz Jasonnnz self-assigned this Feb 23, 2026
@Jasonnnz Jasonnnz merged commit 6d9d2d4 into feature/qa-video-automation Feb 23, 2026
@Jasonnnz Jasonnnz deleted the swarm/qa-video-auto/task-16 branch February 23, 2026 16:51
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment on lines +1064 to 1067
cancelSafetyNetTask = Task { @MainActor in
try? await Task.sleep(nanoseconds: 2_000_000_000) // 2 seconds
guard self.isCancelled else { return } // in case state changed
try? self.daemonClient.send(CuSessionAbortMessage(sessionId: self.id))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Safety-net task cancellation is ineffective — abort still fires after cancel() on the Task

The safety-net Task created in cancel() uses try? await Task.sleep(...) which swallows CancellationError. After the sleep is interrupted by cancelSafetyNetTask?.cancel(), execution continues to the guard self.isCancelled check — but self.isCancelled is true (set by cancel()), so the guard passes and cu_session_abort is sent anyway.

Root Cause: Task.isCancelled vs self.isCancelled

When run() reaches line 287 and calls cancelSafetyNetTask?.cancel(), the cooperative cancellation sets the task's cancellation flag. The Task.sleep at Session.swift:1065 throws CancellationError, but try? swallows it. The task then continues to line 1066:

guard self.isCancelled else { return }

Since self.isCancelled was set to true by cancel() at Session.swift:1055, this guard passes. The task then sends CuSessionAbortMessage at line 1067.

Critically, run() then awaits finalizeQARecording() at line 291, which is a suspension point. This gives the cancelled safety-net task a chance to resume on @MainActor and send the abort before finalizeQARecording() completes — exactly the race condition the PR is supposed to fix.

The fix should check Task.isCancelled inside the safety-net task body, e.g.:

cancelSafetyNetTask = Task { @MainActor in
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    guard !Task.isCancelled else { return }  // <-- check cooperative cancellation
    guard self.isCancelled else { return }
    try? self.daemonClient.send(CuSessionAbortMessage(sessionId: self.id))
}

Impact: The cu_session_abort message can still arrive at the daemon before cu_session_finalized, causing the daemon's handleCuSessionAbort to delete cuSessionMetadata before the summary can be injected — the exact bug this PR was meant to prevent.

(Refers to lines 1064-1068)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6357b3394

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1064 to 1066
cancelSafetyNetTask = Task { @MainActor in
try? await Task.sleep(nanoseconds: 2_000_000_000) // 2 seconds
guard self.isCancelled else { return } // in case state changed
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return when canceling the safety-net task

Canceling cancelSafetyNetTask in run() does not actually disarm it because the task body ignores Task.sleep cancellation (try? await Task.sleep(...)) and continues immediately; with isCancelled == true, it still sends cu_session_abort before finalizeQARecording() finishes, so the original metadata-loss race can still occur on slow finalization paths. This is triggered specifically when the new post-loop cancelSafetyNetTask?.cancel() path runs during QA cancellation.

Useful? React with 👍 / 👎.

Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Jasonnnz added a commit that referenced this pull request Feb 23, 2026
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant