Skip to content

[LUM-745] Migrate GatewayConnectionManager to @Observable to fix re-entrant attribute graph hang#25496

Merged
tkheyfets merged 12 commits into
mainfrom
devin/1776178504-lum-745-observable-migration
Apr 16, 2026
Merged

[LUM-745] Migrate GatewayConnectionManager to @Observable to fix re-entrant attribute graph hang#25496
tkheyfets merged 12 commits into
mainfrom
devin/1776178504-lum-745-observable-migration

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Apr 14, 2026

Migrates GatewayConnectionManager from ObservableObject/@Published to the @Observable macro to fix re-entrant AttributeGraph hangs caused by multiple synchronous @Published mutations in performHealthCheck(). With @Observable, property-level tracking eliminates objectWillChange.send() cascades entirely.

Also introduces ObservationValues<Value> — a cancellation-cooperative AsyncSequence wrapper for observationStream() that responds to consuming-task cancellation (unlike raw AsyncStream.next() which ignores it), preventing leaked observation tasks and retained objects when callers replace the observed instance.

Prompt / plan

LUM-745 — App hang: GatewayConnectionManager.isConnected published setter triggers re-entrant AttributeGraph cascade.

Root cause: performHealthCheck() mutates multiple @Published properties synchronously (assistantVersion, lastUpdateOutcome, isUpdateInProgress, updateStatusMessage, isConnected, isConnecting). Each fires objectWillChange.send()AttributeInvalidatingSubscriber.invalidateAttributeAG::Graph::Context::thread_is_updating re-entrancy → hang. The previous fix (PR #24556, no-op guards) only prevented redundant writes; hangs still occurred on genuine state transitions.

@Observable uses property-level tracking with no objectWillChange.send(), so multiple property mutations never cause re-entrant attribute graph cascades.

Apple refs checked (2025-07-14): Observation framework, Migrating from ObservableObject, WWDC23 — Discover Observation

Test plan

  • CI passes (macOS/iOS builds skipped in CI — requires local ./build.sh verification)
  • All 13 observationStream / withObservationTracking call sites verified for correct initial-value delivery, deduplication, and task cancellation cleanup
  • ObservationValues cancellation chain verified: consuming task cancelled → onCancel finishes continuation → next() returns nil → stream dropped → internal task cancelled → all captures released

Link to Devin session: https://app.devin.ai/sessions/662902235a184ea1bc150ef85bc23885
Requested by: @tkheyfets


Open with Devin

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

chatgpt-codex-connector[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@tkheyfets
Copy link
Copy Markdown
Contributor

@codex review

chatgpt-codex-connector[bot]

This comment was marked as resolved.

@tkheyfets
Copy link
Copy Markdown
Contributor

@codex review

chatgpt-codex-connector[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@tkheyfets
Copy link
Copy Markdown
Contributor

@codex review

chatgpt-codex-connector[bot]

This comment was marked as resolved.

@tkheyfets
Copy link
Copy Markdown
Contributor

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

tkheyfets
tkheyfets previously approved these changes Apr 16, 2026
devin-ai-integration Bot and others added 9 commits April 16, 2026 18:45
Closes LUM-745

The @published setter on GatewayConnectionManager caused re-entrant
AttributeGraph cascades when multiple properties mutated in the same
call stack (e.g. performHealthCheck setting assistantVersion,
isUpdateInProgress, and isConnected synchronously). Each @published
setter fires objectWillChange.send() which queues asyncTransaction
calls that trigger re-entrancy detection in AG::Graph::Context,
hanging the main thread.

@observable uses property-level tracking without objectWillChange,
eliminating the re-entrancy entirely.

Migration scope:
- Core: @published -> plain stored properties, @ObservationIgnored
  for non-reactive bookkeeping (tasks, cancellables, handlers)
- SwiftUI views: .onReceive -> .onChange(of:), @ObservedObject -> var
- Non-view Combine subscribers: $isConnected.sink -> Task-based
  withObservationTracking loops (MainWindow, ConversationRestorer,
  AppDelegate+MenuBar, ToolPermissionTesterModel, iOS ClientProvider)
- isConnectedStream: re-implemented with withObservationTracking
- AGENTS.md: moved to migrated list

Apple refs checked (2026-04-14): Observation framework, Managing
model data in your app, Migrating from ObservableObject to Observable

Co-Authored-By: tkheyfets <timur@vellum.ai>
…ionTracking loops

Introduces a shared observationStream() function that converts any
@observable property read into a deduplicated AsyncStream. All 6 call
sites (MainWindow, ConversationRestorer x2, AppDelegate+MenuBar,
ToolPermissionTesterModel, iOS ClientProvider) and isConnectedStream
now use for-await loops over observationStream instead of manually
managing withObservationTracking + withCheckedContinuation + dedup.

Co-Authored-By: tkheyfets <timur@vellum.ai>
- ConversationRestorer: cancel observation tasks in deinit (AnyCancellable
  auto-cancelled but Task does not)
- MainWindow: cancel connectionObservationTask in close() and detachWindow()
- ToolPermissionTesterModel: add deinit to cancel observeConnectionTask
- MainWindowView+Lifecycle: call handleDaemonConnectionChange on appear to
  restore initial-value semantics lost in .onReceive → .onChange migration
- ObservationStream: fix misleading doc comment about actor isolation

Co-Authored-By: tkheyfets <timur@vellum.ai>
…nnectionManager

- SettingsGeneralTab: replace .onReceive(cm.$isUpdateInProgress/$updateStatusMessage)
  with .onChange(of:) — @observable properties don't synthesize Combine publishers
- SettingsStore: replace $isTrustRulesSheetOpen/.assign and $latestModelInfo/.sink
  with observationStream-based tasks, with proper cancellation

Co-Authored-By: tkheyfets <timur@vellum.ai>
…sStore deinit

- SettingsGeneralTab: seed isServiceGroupUpdateInProgress and updateStatusMessage
  from connectionManager in .onAppear to restore initial-value delivery lost in
  .onReceive → .onChange migration
- SettingsStore: add deinit cancelling trustRulesObservationTask and
  modelInfoObservationTask

Co-Authored-By: tkheyfets <timur@vellum.ai>
… task cancel

Use withTaskCancellationHandler + CancellableContinuationBox state machine to
ensure the withCheckedContinuation is resumed promptly when the task is
cancelled, even if no property change has occurred. The state machine handles
all race conditions between onChange (fires from the mutating thread),
onCancel (fires from any context), and continuation storage.

Co-Authored-By: tkheyfets <timur@vellum.ai>
…ifecycle

Missed rename from connectionStatusCancellable → connectionStatusTask in the
retire/cleanup path.

Co-Authored-By: tkheyfets <timur@vellum.ai>
…ionStream

ModelInfoMessage (generated type) does not conform to Equatable, so it cannot
use observationStream which requires Equatable & Sendable. Replace with a direct
withObservationTracking loop using CancellableContinuationBox (now public) for
proper cancellation support.

Co-Authored-By: tkheyfets <timur@vellum.ai>
…Info

1. observationStream: After withObservationTracking installs, check if the
   value already changed from lastValue and resume immediately if so. This
   closes the window between the initial read and tracking installation
   where a one-shot state transition could be lost forever.

2. SettingsStore: Apply current latestModelInfo before entering the
   observation loop, matching the old Combine $publisher behavior that
   delivered the current value on subscription.

Co-Authored-By: tkheyfets <timur@vellum.ai>
@devin-ai-integration devin-ai-integration Bot force-pushed the devin/1776178504-lum-745-observable-migration branch from fdccaa0 to fe45e26 Compare April 16, 2026 18:50
devin-ai-integration[bot]

This comment was marked as resolved.

@tkheyfets
Copy link
Copy Markdown
Contributor

@codex review

devin-ai-integration Bot and others added 2 commits April 16, 2026 19:07
AsyncStream.next() for non-throwing streams does not check task
cancellation — it only returns nil when finish() is called. When the
consuming task is cancelled (e.g., rebuildClient() replacing a
GatewayConnectionManager), the stream, its internal observation Task,
and all closure captures leak indefinitely.

Introduce ObservationValues<Value> — a custom AsyncSequence wrapper
whose next() uses withTaskCancellationHandler to finish the underlying
stream when the consuming task is cancelled. This causes the pending
next() to return nil, the for-await loop to exit, onTermination to
fire (cancelling the internal Task), and all captured references to
be released.

Also replace ContinuationFinisher's unsafeBitCast with a closure-based
approach for type safety.

Co-Authored-By: tkheyfets <timur@vellum.ai>
The @observable macro synthesizes main-actor-isolated getters for
properties on @mainactor classes. A @sendable closure cannot access
these because @sendable implies the closure may run on any thread.

Remove @sendable from the getValue parameter — the closure is only
ever called on the caller's actor (the internal Task inherits actor
context via Task.init in Swift 5 mode), so cross-actor safety at the
call site is unnecessary.

Co-Authored-By: tkheyfets <timur@vellum.ai>
chatgpt-codex-connector[bot]

This comment was marked as resolved.

…odelInfo

The custom withObservationTracking loop for latestModelInfo had a race
window: if the property changed between the initial read and tracking
registration, the change was missed (no onChange fires for an already-
occurred write).

Fix: add Equatable conformance to CatalogModel and ModelInfo via
extensions (generated files are not edited), then replace the manual
loop with observationStream() which already handles this race via a
post-tracking value check.

Co-Authored-By: tkheyfets <timur@vellum.ai>
@tkheyfets tkheyfets merged commit 86b0169 into main Apr 16, 2026
7 checks passed
@tkheyfets tkheyfets deleted the devin/1776178504-lum-745-observable-migration branch April 16, 2026 19:21
devin-ai-integration Bot added a commit that referenced this pull request Apr 16, 2026
Co-Authored-By: tkheyfets <timur@vellum.ai>
devin-ai-integration Bot added a commit that referenced this pull request Apr 17, 2026
Migrates MainWindowState, AssistantFeatureFlagStore, VoiceModeManager, and
UpdateManager from ObservableObject to @observable to fix the 2000ms+
MainWindowView.body re-evaluation hang.

Root cause: the four @ObservedObject managers on MainWindowView invalidated
the entire 127-line body (plus three modifier chains and ~20 .onReceive
registrations) on every @published mutation. Speech-cadence writes
(VoiceModeManager.partialTranscription/liveTranscription) and delegate
callbacks that wrote several properties synchronously (e.g.
UpdateManager.didFindValidUpdate) matched the re-entrant AttributeGraph
failure mode fixed by PR #25496 (LUM-745) for GatewayConnectionManager.

Changes:
- Apply @observable to the four managers; remove @published; mark
  non-reactive stored properties (@AppStorage, Task handles, cancellables,
  Sparkle updater controller, locks, delegates) as @ObservationIgnored.
- Replace MainWindowView's four @ObservedObject properties with plain
  stored var properties. SwiftUI tracks the specific properties read in
  body (and its modifier chain) instead of invalidating on every
  publisher send.
- Replace the single internal Combine consumer
  (UpdateManager.$isUpdateAvailable.values.dropFirst()) with
  observationStream({ self.isUpdateAvailable }).dropFirst(), matching the
  drop-initial-value semantics of the previous projection.
- Delete MainWindowState.observeNavigationHistory() — NavigationHistory
  is already @observable, so views reading through
  windowState.navigationHistory.{back,forward}Stack now track naturally
  via the nested @observable chain.
- Drop @ObservedObject from all eight downstream views (ImageLightboxOverlay,
  DynamicWorkspaceWrapper, MainWindowToastOverlay,
  MainWindowVersionMismatchBanner, ConversationSwitcherDrawer,
  SettingsPanel, AssistantUpgradeSection, VoiceTranscriptionView).
- Refresh stale comments in ComposerSection/ComposerView and the AGENTS.md
  observation-bridge notes.

Verification:
- All four classes are @mainactor; every @published mutation is already on
  the main thread. UpdateManager's nonisolated Sparkle delegate callbacks
  all hop via Task { @mainactor in } before touching state.
- No external $property or objectWillChange consumers on the four classes
  (rg confirmed).
- The ~20 .onReceive publishers in MainWindowView's modifier chain are all
  NotificationCenter publishers, unaffected by the migration.
- Tests do not reference @published projections or objectWillChange on
  these types.

Scope: migration + bridge deletion only. MainWindowView body/modifier
extraction is a follow-up PR.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
ashleeradka added a commit that referenced this pull request Apr 17, 2026
* perf: migrate MainWindowView managers to @observable (LUM-841)

Migrates MainWindowState, AssistantFeatureFlagStore, VoiceModeManager, and
UpdateManager from ObservableObject to @observable to fix the 2000ms+
MainWindowView.body re-evaluation hang.

Root cause: the four @ObservedObject managers on MainWindowView invalidated
the entire 127-line body (plus three modifier chains and ~20 .onReceive
registrations) on every @published mutation. Speech-cadence writes
(VoiceModeManager.partialTranscription/liveTranscription) and delegate
callbacks that wrote several properties synchronously (e.g.
UpdateManager.didFindValidUpdate) matched the re-entrant AttributeGraph
failure mode fixed by PR #25496 (LUM-745) for GatewayConnectionManager.

Changes:
- Apply @observable to the four managers; remove @published; mark
  non-reactive stored properties (@AppStorage, Task handles, cancellables,
  Sparkle updater controller, locks, delegates) as @ObservationIgnored.
- Replace MainWindowView's four @ObservedObject properties with plain
  stored var properties. SwiftUI tracks the specific properties read in
  body (and its modifier chain) instead of invalidating on every
  publisher send.
- Replace the single internal Combine consumer
  (UpdateManager.$isUpdateAvailable.values.dropFirst()) with
  observationStream({ self.isUpdateAvailable }).dropFirst(), matching the
  drop-initial-value semantics of the previous projection.
- Delete MainWindowState.observeNavigationHistory() — NavigationHistory
  is already @observable, so views reading through
  windowState.navigationHistory.{back,forward}Stack now track naturally
  via the nested @observable chain.
- Drop @ObservedObject from all eight downstream views (ImageLightboxOverlay,
  DynamicWorkspaceWrapper, MainWindowToastOverlay,
  MainWindowVersionMismatchBanner, ConversationSwitcherDrawer,
  SettingsPanel, AssistantUpgradeSection, VoiceTranscriptionView).
- Refresh stale comments in ComposerSection/ComposerView and the AGENTS.md
  observation-bridge notes.

Verification:
- All four classes are @mainactor; every @published mutation is already on
  the main thread. UpdateManager's nonisolated Sparkle delegate callbacks
  all hop via Task { @mainactor in } before touching state.
- No external $property or objectWillChange consumers on the four classes
  (rg confirmed).
- The ~20 .onReceive publishers in MainWindowView's modifier chain are all
  NotificationCenter publishers, unaffected by the migration.
- Tests do not reference @published projections or objectWillChange on
  these types.

Scope: migration + bridge deletion only. MainWindowView body/modifier
extraction is a follow-up PR.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* fix: use generation counter to signal update-check completion

Codex review on #26152 flagged a latency regression in
`checkForUpdatesAsync`: `observationStream` deduplicates by
`Equatable`, so when Sparkle's `didFindValidUpdate` writes
`isUpdateAvailable = true` on a re-check where it was already `true`
(or `updaterDidNotFindUpdate` writes `false` when it was already
`false`), the stream never yields and the async check stalls until the
5-second timeout. The previous `$isUpdateAvailable.values` Combine
projection emitted on every `@Published` assignment regardless of
equality, so the same scenarios returned instantly.

Add `updateCheckCompletionGeneration: UInt64` — a tracked monotonic
counter that both delegate callbacks bump whenever a check completes.
`checkForUpdatesAsync` now observes the counter instead of
`isUpdateAvailable`, so every completion yields a new value and the
task returns `isUpdateAvailable` promptly. Uses `&+=` for overflow
safety (UInt64.max checks is comfortably beyond process lifetime but
this is still the correct primitive).

`updaterDidNotFindUpdate` bumps the counter in a `defer` so the
deferred-install early return still signals completion to any racing
`checkForUpdatesAsync` call.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* docs: update @observable migration tables in clients/AGENTS.md

Per Devin Review on #26152: move MainWindowState, VoiceModeManager,
UpdateManager, and AssistantFeatureFlagStore from the "intentionally
remaining ObservableObject" table to the "migrated to @observable"
list.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* docs: rewrite @observable docstrings to describe current design

Remove language that contrasted the new design against the prior
ObservableObject implementation or referenced PR iterations
("previously", "no longer", "this counter restores that behavior").
Docstrings now describe the current design's properties directly so
future readers of the file don't need the PR context to make sense of
them.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* fix: use @bindable for MainWindowView.windowState

PanelCoordinator's `nativePanelView` and `fullWindowPanel` both use
`$windowState.pendingMemoryId` / `$windowState.pendingSkillId` to
produce binding projections for LogsAndUsagePanel.  A plain `var`
reference to an `@Observable` type doesn't expose the `$` prefix, so
those call sites broke under the migration.

`@Bindable` is Apple's recommended wrapper for binding sites on
`@Observable` types.  Unlike `@ObservedObject`, it doesn't establish
its own view-invalidation policy — Observation still tracks property
granularly — so the goal of the migration (scoped invalidation through
MainWindowView.body) is preserved.

Refs: https://developer.apple.com/documentation/swiftui/bindable
Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: ashlee@vellum.ai <ashlee@vellum.ai>
tkheyfets added a commit that referenced this pull request Apr 17, 2026
…ctor (#26082)

* Run GatewayConnectionManager health check loop off @mainactor

The 15-second health check loop previously ran as a Task { @mainactor ... }
so the while-loop, Task.sleep scheduling, and error handling all occupied
the main actor. Under memory pressure this periodic work contributes to
app hangs (LUM-914 recorded a ≥2000ms hang attributed to this loop on a
device with ~131MB free RAM).

Change the loop to a detached task at .utility priority and hop to
@mainactor only for the state reads (isUpdateInProgress,
healthCheckInterval, shouldReconnect) and the update-timeout cleanup
that mutates @published properties. performHealthCheck() itself stays
@mainactor so all @published writes and cached-assistant reads remain on
main — no new data races, no changes to resolveConnection() or the rest
of the network stack that triggered the earlier revert of #21695.

Refs: WWDC25 Embracing Swift concurrency
(https://developer.apple.com/videos/play/wwdc2025/268/),
Task.detached(priority:), MainActor.run.

Closes LUM-916

Co-Authored-By: tkheyfets <timur@vellum.ai>

* Update comments to reflect @observable migration from #25496

Co-Authored-By: tkheyfets <timur@vellum.ai>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: tkheyfets <timur@vellum.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant