Skip to content

perf: fix main-thread hangs from Observation cascades, synchronous Combine dispatch, and GeometryReader layout#23464

Merged
ashleeradka merged 8 commits into
mainfrom
devin/1775252289-fix-observation-cascade-hang
Apr 3, 2026
Merged

perf: fix main-thread hangs from Observation cascades, synchronous Combine dispatch, and GeometryReader layout#23464
ashleeradka merged 8 commits into
mainfrom
devin/1775252289-fix-observation-cascade-hang

Conversation

@devin-ai-integration

@devin-ai-integration devin-ai-integration Bot commented Apr 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes a severe main-thread hang (20–47 seconds) caused by two compounding issues in the chat message pipeline:

Issue 1 — withObservationTracking re-entrancy cascades: When messages mutated, 5+ independent withObservationTracking onChange handlers each scheduled a Task { @MainActor } that re-read messages, re-registered tracking, and wrote derived @Observable properties — triggering a cascading SwiftUI transaction flush that measured every ForEach item in the LazyVStack. withObservationTracking is designed for single-shot observation; the re-arming loop pattern (onChange → Task → re-observe) creates N async dispatches per mutation that pile up in the same run-loop tick.

Issue 2 — Synchronous GeometryReader layout dependency: VAppWorkspaceDockLayout and VSplitView used GeometryReader, which creates a synchronous parent-child layout dependency. When any child's preferred size changes, SwiftUI re-measures the entire tree from the GeometryReader root through all children.

Bug fix — secret_blocked message capture: handleError removed the blocked user message inside batchUpdateMessages but then attempted to read the message's text and attachments from vm.messages after the batch — when the message no longer existed. This silently dropped all file attachments the user originally included, breaking "Send Anyway" reconstruction. Fixed by capturing the blocked ChatMessage inside the batch closure before msgs.remove(at:).

Approach

Messages property: Custom access(keyPath:) / withMutation(keyPath:) getter/setter that participates in the Observation framework while also publishing to a Combine CurrentValueSubject. The _modify accessor defers the Combine publish via a coalesced Task { @MainActor }, so multiple rapid subscript mutations (e.g. stopGenerating touching 5+ fields) result in a single downstream notification instead of one per mutation. The set accessor and batchUpdateMessages publish synchronously since they represent intentional complete mutations.

Derived values: A single recomputeDerivedValues(from:) function computes all four cached properties (activePendingRequestId, hasPendingConfirmation, hasNonEmptyMessage, latestPersistedTipDaemonMessageId) in one O(n) pass, replacing four independent Combine pipelines that each did O(n) work. Manual != checks before writing prevent unnecessary @Observable notifications.

Dead code removal: isThinkingPublisher and _isThinkingSubject had zero subscribers. isThinking reverted to a plain @Observable stored property.

Mutation batching: All multi-mutation sites now use batchUpdateMessages to emit a single notification:

  • MessageSendCoordinator: stopGenerating (main path + orphaned tool call early-return), resetCancelState
  • ChatViewModel: handleMessageContentResponse, updateSurfacePreviewImage, trimOldMessagesIfNeeded, trimForBackground, watchdog recovery
  • ChatActionHandler: handleGenerationCancelled, handleError, handleConversationError, handleMessageDequeued

ConversationActivityStore: observeAssistantActivityLoop and observeActiveMessageCountLoop — which read the heavy messages array — replaced with messagesPublisher Combine subscriptions. Scalar-reading loops (observeBusyStateLoop, observeInteractionStateLoop) retained as withObservationTracking since they read lightweight properties that change infrequently.

GeometryReader → .onGeometryChange: VAppWorkspaceDockLayout and VSplitView use .onGeometryChange(for:) to deliver width asynchronously, avoiding full-tree measurement cascades. The width is only used for drag constraints, so async delivery is safe.

Resource cleanup: ChatMessageManager now has a deinit that cancels _deferredPublishTask and derivedValuesSub, following the established pattern in sibling classes (ChatBtwState, SubagentDetailStore, ChatPaginationState) per AGENTS.md convention. ChatPaginationState.resetMessagePagination explicitly cancels the old Combine subscription before reassignment for consistency with its deinit.

Decision Context & Anti-Patterns

This section documents why specific architectural choices were made and what patterns to avoid, for future reference by developers and coding agents.

Why withObservationTracking re-arming loops are an anti-pattern for messages:
Apple's withObservationTracking is single-shot by design — the onChange closure fires once and must be re-registered. The common "re-arm in onChange" pattern (onChange → Task { @MainActor in re-observe }) creates a new async dispatch per mutation. When N independent loops each observe messages, a single array mutation generates N Task dispatches that all land in the same run-loop tick, each writing to @Observable properties that trigger SwiftUI's GraphHost.flushTransactions(). This is the thundering-herd cascade that caused 20-47s hangs. For frequently-mutated collections, use Combine subscriptions (which naturally coalesce via CurrentValueSubject) instead of re-arming observation loops. The withObservationTracking pattern remains appropriate for infrequently-changing scalar properties like isSending and isThinking, where the one-shot overhead is negligible.

Why custom getter/setter/_modify instead of plain @Observable:
Plain @Observable properties emit per-mutation notifications — every messages[i].field = value triggers SwiftUI invalidation. We need a dual notification mechanism: (1) Observation framework participation (so SwiftUI tracks reads), and (2) Combine CurrentValueSubject publishing (so non-SwiftUI subscribers like ChatPaginationState and ConversationActivityStore can subscribe with backpressure). The custom accessor pattern is exactly what the @Observable macro generates internallyaccess(keyPath:) in the getter, withMutation(keyPath:) wrapping the setter — but adds Combine publishing alongside. The _modify accessor uses willSet/didSet directly (not the withMutation closure) because Swift's _modify coroutine requires yield, which is incompatible with closures.

Why GeometryReader was replaced with .onGeometryChange(for:of:action:):
GeometryReader creates a synchronous parent-child layout dependency — when any child's preferred size changes, SwiftUI re-measures the entire subtree from the GeometryReader root. onGeometryChange delivers size asynchronously as a state change on the next transaction, breaking the synchronous dependency. Apple recommends onGeometryChange when the geometry value is consumed as state rather than used for immediate child layout (see WWDC23 — Demystify SwiftUI performance). In VAppWorkspaceDockLayout and VSplitView, the width is only used for drag constraints, making async delivery safe.

Why some withObservationTracking loops were kept:
ConversationActivityStore's observeBusyStateLoop and observeInteractionStateLoop read scalar @Observable properties (isSending, isThinking, hasPendingConfirmation) that change infrequently (a few times per conversation turn, not per-token). The re-arming overhead is negligible for these, and the simpler withObservationTracking pattern avoids introducing Combine subscriptions where they aren't needed.

Why isThinkingPublisher was removed:
_isThinkingSubject and isThinkingPublisher had zero subscribers anywhere in the codebase. The Combine subject was being .send()-ed on every mutation for no consumers. isThinking reverted to a plain @Observable stored property, which is sufficient for SwiftUI observation.

Why recomputeDerivedValues uses manual != checks:
Writing to an @Observable property always triggers SwiftUI invalidation, even if the new value equals the old value. The Observation framework does not perform equality checks — it fires on any withMutation call. Manual != guards before writing prevent unnecessary SwiftUI transaction flushes when derived values haven't actually changed. See ObservationRegistrar.willSet/didSet.

Why Task.isCancelled guard is required in scheduleDeferredPublish:
When batchUpdateMessages or the set accessor fires a synchronous publish, it cancels any pending deferred Task. However, Swift's cooperative cancellation is advisory — a cancelled Task still executes unless it checks Task.isCancelled. Without the guard, the cancelled task would send a redundant (and potentially stale) notification after the batch's synchronous publish already delivered the authoritative snapshot.

What this PR does NOT fix:
A residual ~1s layout pass remains after sending messages, visible in stackshots as 100% GraphHost.flushTransactions() → deep StackLayout/_FlexFrameLayout/_PaddingLayout recursion with zero application code in the stack. This is the baseline cost of a single SwiftUI layout transaction — always present but previously masked by the 20-47s cascade. Reducing this requires view-layer changes (cell nesting depth, .compositingGroup(), image downscaling) and is tracked in LUM-713. A separate ~10s WindowServer compositing hang (GPU layer tree complexity with blur effects and large textures) is also tracked there.

Benefits

  • Speed: Eliminates 20–47s hang, reducing to ~1s baseline layout cost per transaction
  • Efficiency: Per-notification cost reduced from 4×O(n) (four independent pipelines) to 1×O(n) (single recomputation function). Deferred coalescing prevents per-mutation pipeline evaluation during rapid subscript writes.
  • Correctness: secret_blocked "Send Anyway" now correctly preserves attachments from the blocked message
  • Maintainability: Explicit batchUpdateMessages at call sites; single derived-value function instead of four pipeline chains
  • Less dead code: Removed unused isThinkingPublisher / _isThinkingSubject

Why it's safe

  • The custom getter/setter/_modify pattern is semantically equivalent to what the @Observable macro generates — it adds a Combine send() alongside each mutation
  • The deferred Task { @MainActor } runs on the cooperative executor during the run loop's source-processing phase — before SwiftUI's CFRunLoopObserver fires its transaction flush — ensuring derived values are up-to-date when views re-evaluate. A Task.isCancelled guard prevents cancelled deferred tasks from sending redundant notifications or clobbering newer task references.
  • batchUpdateMessages cancels any pending deferred publish before firing its synchronous publish, preventing stale double-notifications
  • recomputeDerivedValues produces the same values as the old four pipelines; manual != checks before writing prevent unnecessary SwiftUI invalidation
  • .onGeometryChange is the Apple-recommended replacement for GeometryReader when the size is consumed as state rather than used for child layout
  • ConversationActivityStore retains withObservationTracking loops for lightweight scalar reads (isSending, isThinking) that are not in the hang path
  • ChatActionHandler batching preserves identical control flow — conditional logic (error insertion, empty message removal, queued-status resets) runs inside the batch closure with the same guards. Non-message state assignments remain outside the batch.
  • secret_blocked fix: the captured ChatMessage is only used for "Send Anyway" context (text and attachments). Bookkeeping cleanup (pendingQueuedCount, pendingMessageIds, etc.) still happens inside the batch as before.

Files changed

  • ChatMessageManager.swift — Core fix: deferred coalesced _modify publish with Task.isCancelled guard, single recomputeDerivedValues function, removed dead isThinkingPublisher/_isThinkingSubject, batchUpdateMessages cancels pending deferred task, added deinit for resource cleanup
  • ChatActionHandler.swift — Batch handleGenerationCancelled, handleError, handleConversationError, handleMessageDequeued multi-mutation sites; fix secret_blocked message capture before batch removal
  • ChatPaginationState.swift — Replace withObservationTracking loop with messagesPublisher subscription; explicit cancel() in resetMessagePagination
  • ChatViewModel.swift — Batch handleMessageContentResponse, updateSurfacePreviewImage, trimOldMessagesIfNeeded, trimForBackground, watchdog recovery mutations; delegate hasPendingConfirmation to cached value
  • MessageSendCoordinator.swift — Batch stopGenerating (main path + orphaned tool call early-return) and resetCancelState mutations
  • ConversationActivityStore.swift — Replace observeAssistantActivityLoop and observeActiveMessageCountLoop with messagesPublisher subscriptions; remove dead activityGenerations / invalidateActivityGeneration
  • VAppWorkspaceDockLayout.swift — Replace GeometryReader with .onGeometryChange(for:)
  • VSplitView.swift — Same GeometryReader.onGeometryChange(for:) migration

References

Review & Testing Checklist for Human

  • Build in Xcode — CI skips macOS/iOS builds. Verify the project compiles, especially the _modify accessor using _$observationRegistrar directly, the removal of isThinkingPublisher / _isThinkingSubject (any remaining references will fail to compile), and the new import Combine in ConversationActivityStore.swift.
  • Secret-blocked "Send Anyway" flow — Send a message that triggers secret_blocked (e.g. paste an API key). Verify the "Send Anyway" button appears AND preserves the original message text and any attached files. This exercises the capturedBlockedMessage fix in handleError.
  • Cancel flow — While the assistant is responding, cancel generation. Verify the assistant message stops streaming, queued messages reset to .sent, and no orphaned state remains. stopGenerating, resetCancelState, handleGenerationCancelled, and handleConversationError now batch their mutations — confirm no messages are missed.
  • Error handling — Trigger a conversation error (e.g. invalid API key, credits exhausted). Verify the inline error message renders correctly and the secret_blocked banner still appears when applicable. handleError and handleConversationError moved their message mutations inside batchUpdateMessages while keeping error state assignments outside.
  • Dock and panel drag resizing — With the dock open, drag the divider to resize. Verify the dock width constrains correctly (min 300pt, leaves min 300pt for workspace). The availableWidth now arrives asynchronously via .onGeometryChange.

Recommended test plan: Load a conversation with 100+ messages (the scenario from the hang reports). Interact with it: scroll, send messages, cancel mid-stream, trigger confirmations, trigger errors. Monitor for any stalls using Instruments. Compare against main to confirm the 20-47s hang is eliminated. Expect a residual ~1s baseline layout cost (tracked in LUM-713).

Notes

  • The _modify accessor uses _$observationRegistrar.willSet/didSet directly (rather than the withMutation closure API) because Swift's _modify coroutine requires yield, which is incompatible with a closure. This is the same pattern the @Observable macro generates internally.
  • The deferred publish from _modify coalesces via a Task { @MainActor } guard: if a task is already pending, subsequent _modify calls are no-ops. The task checks Task.isCancelled before executing to prevent redundant notifications when batchUpdateMessages or the set accessor cancels the pending task and publishes synchronously.
  • The batchUpdateMessages closure receives inout [ChatMessage] — callers must not read self.messages inside the closure (they'd get stale data). Current call sites only mutate the msgs parameter, which is correct.
  • In handleConversationError, typedError is force-unwrapped (typedError!) in two places — both are guarded by !wasCancelling, which guarantees typedError is non-nil (it's set to ConversationError(from: msg) when !wasCancelling). This is safe but worth noting for future refactors.
  • A residual ~1s layout pass remains after sending a message with an image attachment. This is the baseline cost of a single SwiftUI transaction flush (vs. the previous 20–47s cascade). Further optimization of the layout pass depth and WindowServer compositing cost is tracked in LUM-713.

Link to Devin session: https://app.devin.ai/sessions/a0f7f0e1f1f6439db9534237d16c377a
Requested by: @ashleeradka


Open with Devin

…ing main-thread hang

Replace the withObservationTracking re-entrancy loops in ChatMessageManager
and ChatPaginationState with direct Observation framework participation
(access/withMutation) and Combine pipelines. Each messages mutation now
emits a single Observation notification + a single Combine publish instead
of scheduling N Task { @mainactor } callbacks that cascade through the
SwiftUI transaction system.

Key changes:
- ChatMessageManager: custom getter/setter for messages and isThinking
  using access(keyPath:)/withMutation(keyPath:) + CurrentValueSubject
- ChatMessageManager: batchUpdateMessages(_:) for coalesced bulk mutations
- ChatMessageManager: hasPendingConfirmation cached via Combine pipeline
- ChatPaginationState: replace withObservationTracking loop with
  messagesPublisher Combine subscription
- ChatViewModel: use batchUpdateMessages for trim operations
- MessageSendCoordinator: use batchUpdateMessages in resetCancelState
- ConversationActivityStore: use cached hasPendingConfirmation, clean up docs

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
@devin-ai-integration

Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Open in Devin Review

chatgpt-codex-connector[bot]

This comment was marked as resolved.

devin-ai-integration Bot and others added 2 commits April 3, 2026 21:52
Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
GeometryReader creates a synchronous parent-child layout dependency that
forces SwiftUI to re-measure the entire view hierarchy from root to leaf
on every content change. Replace with .onGeometryChange(for:) which
delivers size asynchronously and avoids triggering full-tree measurement
cascades when child content (e.g. LazyVStack messages) changes.

The width is only used for drag constraints, so async delivery is safe.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
@devin-ai-integration devin-ai-integration Bot changed the title perf: eliminate withObservationTracking → Combine bridge cascade causing main-thread hang perf: eliminate withObservationTracking cascade and reduce layout measurement overhead Apr 3, 2026
Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
@devin-ai-integration devin-ai-integration Bot changed the title perf: eliminate withObservationTracking cascade and reduce layout measurement overhead perf: fix main-thread hangs from Observation→Combine bridging and synchronous GeometryReader layout Apr 3, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

…all mutation sites

- _modify accessor now defers publish via Task { @mainactor } so rapid
  subscript mutations coalesce into one notification
- Consolidate 4 independent Combine pipelines into single
  recomputeDerivedValues(from:) function (4×O(n) → 1×O(n))
- Remove dead isThinkingPublisher / _isThinkingSubject code
- Simplify isThinking to plain stored property
- Batch stopGenerating, watchdog recovery, handleMessageContentResponse,
  and updateSurfacePreviewImage mutations via batchUpdateMessages
- Convert ConversationActivityStore observeAssistantActivityLoop and
  observeActiveMessageCountLoop from withObservationTracking to
  messagesPublisher subscriptions
- Cancel pending deferred task in batchUpdateMessages to prevent stale
  double-notification

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
@devin-ai-integration devin-ai-integration Bot changed the title perf: fix main-thread hangs from Observation→Combine bridging and synchronous GeometryReader layout perf: fix main-thread hangs from Observation cascades, synchronous Combine dispatch, and GeometryReader layout Apr 3, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

…ant notifications

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
devin-ai-integration[bot]

This comment was marked as resolved.

…Manager

- Batch handleConversationError: wrap assistant finalization, empty message
  removal, inline error insertion, and processing/queued status resets in
  a single batchUpdateMessages call
- Batch stopGenerating orphaned tool call completion in early-return path
- Add explicit messagesSub?.cancel() in resetMessagePagination for style
  consistency with deinit
- Add deinit to ChatMessageManager cancelling _deferredPublishTask and
  derivedValuesSub per AGENTS.md convention

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
devin-ai-integration[bot]

This comment was marked as resolved.

…omment

- In handleError, capture the blocked ChatMessage into a local variable
  inside the batchUpdateMessages closure *before* removing it from the
  array. The post-batch 'Send Anyway' context extraction now reads from
  the captured message instead of re-searching vm.messages (where the
  message no longer exists after batch removal).
- Fix stale comment in ChatViewModel.swift displayedMessageCount setter
  that still referenced 'async Combine bridge' from the old
  withObservationTracking approach.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
@ashleeradka ashleeradka merged commit ca9a31e into main Apr 3, 2026
7 checks passed
@ashleeradka ashleeradka deleted the devin/1775252289-fix-observation-cascade-hang branch April 3, 2026 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant