perf: reduce WindowServer compositing cost and layout measurement overhead#23515
Conversation
…rhead 1. Downsample images to display resolution using CGImageSourceCreateThumbnailAtIndex before rendering in InlineToolCallImageView and AttachmentImageGrid. This avoids uploading full-resolution bitmaps (e.g. 4000x3000 = 48MB RGBA) to the GPU when the display frame is capped at ~580pt. 2. Replace GeometryReader with .onGeometryChange() in InlineAudioAttachmentView's progress bar to eliminate a synchronous layout dependency inside the LazyVStack. 3. Extend .compositingGroup() to completed interleaved messages (previously skipped), flattening deep CALayer trees for the WindowServer. Resolves LUM-713 Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Re-key the .task(id:) in AttachmentImageGrid to include displayScale so downsampled images are regenerated when the window moves between displays with different pixel densities (e.g. 1x → 2x Retina). Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
1. Fix lightbox regression: ImageLightboxState.displayImage now decodes from base64Data on demand when fullResImage is nil, so non-lazy attachments show full-resolution in the lightbox even though the inline preview is downsampled. 2. Add downsampleForDisplay(data:targetSize:scale:) overload that feeds raw compressed bytes directly to CGImageSourceCreateWithData, avoiding the NSImage→TIFF→ImageIO round-trip that allocates a full-resolution bitmap. All decode paths with raw Data available now prefer this path. Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
…access Move base64→NSImage decoding from the displayImage computed property (called during every view body evaluation) into a one-shot async task spawned by showImageLightbox(). The decoded image is stored in fullResImage and reused on subsequent accesses. The decoding runs on a detached task to keep Data(base64Encoded:) and NSImage(data:) off the main thread entirely. The lightbox initially shows the downsampled inline preview and swaps to full-resolution once the decode completes. Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
Replace NSImage(data:) with CGImageSourceCreateWithData + CGImageSourceCreateImageAtIndex in the detached task, per AGENTS.md guidance to prefer thread-safe CoreGraphics/ImageIO APIs over AppKit types in concurrent contexts. Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>
| lazyAttachmentId: lazyAttachmentId, | ||
| fullResImage: nil, | ||
| isLoadingFullRes: lazyAttachmentId != nil | ||
| isLoadingFullRes: lazyAttachmentId != nil || base64Data != nil |
There was a problem hiding this comment.
🟡 Unnecessary loading spinner shown in lightbox when image is already full-resolution
The change to isLoadingFullRes: lazyAttachmentId != nil || base64Data != nil causes a spurious loading spinner to appear in the lightbox when callers pass an already-decoded full-resolution image alongside base64Data. Specifically, ComposerAttachments.openAttachmentPreview (ComposerAttachments.swift:102-121) decodes the full-res image synchronously via NSImage(data:) and passes both the decoded image and the raw base64Data to showImageLightbox. Before this PR, isLoadingFullRes was lazyAttachmentId != nil (false here), so no spinner appeared. Now isLoadingFullRes is true whenever base64Data is non-nil, triggering the spinner overlay (ImageLightboxOverlay.swift:31-35) on top of the already-visible full-res image while decodeBase64LightboxImage redundantly re-decodes the same data (~50-200ms). The user sees a confusing loading indicator over a fully loaded image.
Prompt for agents
The root cause is that showImageLightbox unconditionally sets isLoadingFullRes=true when base64Data is provided, but some callers (ComposerAttachments.openAttachmentPreview) have already decoded the full-res image and pass it as the image parameter. The fix should distinguish between callers that pass a downsampled thumbnail (need async full-res decode) and callers that pass the already-decoded full-res image (no decode needed).
Possible approaches:
1. Add a parameter like needsFullResDecode: Bool to showImageLightbox so callers can opt out of the async decode when they already have full-res.
2. Have showImageLightbox only set isLoadingFullRes and call decodeBase64LightboxImage when the passed image appears to be downsampled (e.g. check pixel dimensions vs some threshold).
3. Update ComposerAttachments.openAttachmentPreview to not pass base64Data (pass nil) since it already provides the full-res image, though this would prevent the lightbox toolbar from using base64Data for copy/save actions.
Option 1 is cleanest. The new parameter defaults to true for backward compatibility with the chat bubble path (which now passes downsampled images), and ComposerAttachments passes false.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
Devin is archived and cannot be woken up. Please unarchive Devin if you want to continue using it. |
Summary
Four targeted performance improvements for the message list, addressing WindowServer compositing hangs (~10s) and layout measurement overhead (~1s) traced via spin dumps (LUM-713).
Why this is needed
Full-resolution images (e.g., 4000×3000 = 48MB RGBA bitmap) were uploaded as GPU textures even though display size is capped at ~580pt. With multiple images in a conversation, cumulative GPU texture memory caused WindowServer compositing stalls. A synchronous
GeometryReaderinside theLazyVStackcreated a layout dependency that blocked measurement. Interleaved messages (text + tool calls + images) were excluded from.compositingGroup(), leaving deep CALayer trees for the WindowServer to traverse.Changes
Image downsampling via
CGImageSourceCreateThumbnailAtIndex— TwodownsampleForDisplay()overloads:data:overload (preferred): Feeds raw compressed bytes directly toCGImageSourceCreateWithData, decoding only the pixels needed for the target size — no full-resolution bitmap allocation.NSImageoverload (fallback): For pre-decodedNSImageobjects (tool-call cached images,thumbnailImage), extracts TIFF representation and feeds it to ImageIO.InlineToolCallImageViewand allAttachmentImageGriddecode paths. Task key includesdisplayScaleso downsampling re-runs at the correct pixel density when moving between displays.GeometryReader→.onGeometryChange()inInlineAudioAttachmentView.progressBar— Eliminates a synchronous layout dependency inside theLazyVStack. The track width is now measured asynchronously and stored in@State.Extend
.compositingGroup()to completed interleaved messages — Removes thecachedHasInterleavedContentexclusion so all non-streaming messages get layer flattening. Still skipped during streaming to avoid re-compositing on every token delta.Lightbox full-resolution recovery — Since inline previews are now downsampled, the lightbox decodes full-resolution images asynchronously.
decodeBase64LightboxImageuses thread-safe ImageIO APIs (CGImageSourceCreateWithData+CGImageSourceCreateImageAtIndex) on aTask.detachedoff the main thread, then setsfullResImageon completion. The lightbox initially shows the downsampled preview and swaps to full-res once decoded.Why this is safe
GeometryReaderreplacement is a direct 1:1 swap —.onGeometryChange()provides the same width measurement without the synchronous layout dependency..compositingGroup()extension is gated by!message.isStreaming, so active token streaming is unaffected. Only completed messages get the flattened layer.Task.detachedwith cancellation support (lightboxFetchTask) matching the existing lazy-fetch pattern.Alternatives considered and rejected
NSImage.preparingThumbnail(of:)— Only available on iOS/tvOS, not macOS.CGImageSourceCreateThumbnailAtIndexis the macOS equivalent recommended in WWDC18..taskblocks inherit MainActor isolation for cancellation semantics tied to view lifecycle. Moving decode off-MainActor would require manual cancellation tracking. The directdata:→ ImageIO path avoids the main-thread bitmap allocation that was the actual problem, making the actor migration unnecessary for this fix.isStreamingalready gates active token deltas and the remaining async work on completed messages is infrequent, the tradeoff favors flattening. If profiling with Core Animation Instruments shows regressions on interleaved-heavy conversations, the exclusion can be restored.LazyVStackre-measurement — This would address the deeper layout hang tracked in LUM-716, but is a larger architectural change outside the scope of this fix.References
CGImageSourceCreateThumbnailAtIndexfor memory-efficient downsamplingCGImageSourceCreateThumbnailAtIndexdocs.onGeometryChange()docs — async geometry measurement replacingGeometryReader.compositingGroup()docs — flattening render tree into single compositing layerReview & Testing Checklist for Human
Notes
downsampleForDisplay(_:NSImage)fallback path callsimage.tiffRepresentationon the main thread via.task. This is not a regression (pre-existing code calledNSImage(data:)on the main thread), and this fallback is only hit for tool-call cached images andthumbnailImage— the common base64 paths use the directdata:overload.fetchFullResLightboxImagepath (for lazy-loaded attachments fetched from the server) still usesNSImage(data:), but runs on@MainActorwhere AppKit APIs are safe. Only theTask.detachedbase64 decode path usesCGImageSourcefor thread safety.sizeThatFits/placeChildrenstalls during scrolling and streaming) are tracked separately in LUM-716.Link to Devin session: https://app.devin.ai/sessions/1f18acbafb0043a9afa2662c3fdb533c
Requested by: @ashleeradka