fix: voice mode bugs — permission bypass, shutdown race, tap leak, prefix pollution, barge-in conflict#6099
Conversation
1. Permission bypass: 'No, don't do it' was approved because 'do it' matched before 'no'/'don't'. Now treats ambiguous (both match) as denial for safety. 2. Shutdown race: stopSpeaking() during deactivate() fired ttsOnComplete which called startListening(), leaving service in inconsistent state. Now sets state=.off before shutdown. 3. Audio tap leak: if audioEngine.start() failed, the installed tap was never removed, causing a crash on the next startRecording() call. 4. Voice prefix pollution: the voice system instruction prefix was stored in chat messages and used for thread title generation. Now only sent via IPC. 5. Barge-in monitor conflict: startBargeInMonitor() ran after finishTextStream() which could complete synchronously (no ElevenLabs), causing double tap install. Reordered to start barge-in monitor before finishTextStream. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 038145891f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // whole transcript. For queued messages this is set in messageDequeued. | ||
| if !willBeQueued { | ||
| currentTurnUserText = text | ||
| currentTurnUserText = rawText |
There was a problem hiding this comment.
Preserve voice prefix for secret-blocked retries
Store currentTurnUserText as the transport text (or keep a separate transport copy) in voice mode; assigning rawText here means later recovery flows resend a different prompt than the one originally sent to the daemon. In the secret_blocked path, ChatViewModel+MessageHandling snapshots currentTurnUserText into secretBlockedMessageText, and sendAnyway() sends that value directly, so a voice-mode “Send anyway” retry drops the voice instruction prefix and changes response behavior (e.g., spoken brevity/tooling constraints no longer apply).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
🟡 Voice prefix pollution fix missed refinementMessagePreview, which is displayed in UI
The PR systematically changes text → rawText for all user-facing strings (chat messages at lines 275 and 307, thread titles at line 258, and currentTurnUserText at lines 289 and 344) to prevent the voice instruction prefix from polluting the UI. However, refinementMessagePreview at line 315 still uses text (which includes the voice prefix).
Root Cause and Impact
When voice mode is active and a workspace surface is open (activeSurfaceId != nil), messages are routed as workspace refinements. Line 315 sets refinementMessagePreview = text, where text contains the full voice prefix:
[Voice conversation — keep spoken responses brief...]
user's actual message
This refinementMessagePreview is displayed directly in the workspace activity feed UI (clients/macos/vellum-assistant/Features/Surfaces/WorkspaceActivityFeed.swift:17) via Text(preview). Users would see the raw system instruction prefix in the workspace overlay bubble instead of just their spoken message.
Every other UI-facing usage of the message text in sendMessage() was correctly changed to rawText by this PR, but this one instance was missed.
(Refers to line 315)
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
stopSpeaking()duringdeactivate()re-enteredstartListening()via synchronousttsOnCompletecallbackaudioEngine.start()fails — clean up installed tap to prevent crash on next recording[Voice conversation — keep spoken responses brief...]) polluting chat messages and thread titles — now only used for IPC to daemonfinishTextStream()which may complete synchronouslyOriginal prompt
fix: voice mode bugs — permission bypass, shutdown race, tap leak, prefix pollution, barge-in conflict
🤖 Generated with Claude Code