
feat(desktop): add Voice Control dictation#4981

Open
justincrich wants to merge 39 commits into superset-sh:main from justincrich:main

Conversation

@justincrich
Contributor

@justincrich justincrich commented May 28, 2026

Why

Voice Control gives desktop users a first-class dictation path inside Superset instead of relying on OS-level dictation or external tools. The goal is to make voice input feel native to the app: discoverable in Settings, controllable with a user-selected shortcut, constrained to supported text targets, and explicit about recording/processing state.

This is especially useful for the two highest-friction text-entry surfaces in the desktop app:

  • Chat composers, where users often want to dictate longer prompts without losing their current draft or caret position.
  • Terminal panes, where users need a controlled way to insert dictated text without accidentally executing it.

The implementation intentionally keeps the preference local, starts disabled by default, and adds microphone readiness UI so users understand what has to be enabled before dictation works.

What This Does

Voice Control settings and discoverability

  • Adds a local voiceInputEnabled setting with a default disabled state.
  • Surfaces Voice Control in General settings with status copy, an enable/disable switch, and microphone readiness.
  • Adds Voice Control search metadata so settings search returns both the General setting and the Keyboard shortcut entry for voice-related queries.
  • Adds deep links between General settings and Keyboard settings so users can move directly between enabling Voice Control and editing the activation shortcut.
  • Moves Voice Control into the Keyboard shortcuts list as its own category directly under the shortcut search bar, so it participates in shortcut search instead of being a detached block.

Shortcut registration, editing, and persistence

  • Registers a configurable Voice Control activation shortcut.
  • Persists keyboard shortcut overrides and keyboard layout preferences through the existing hotkey store path.
  • Allows the shortcut to be changed, reset, unassigned, and conflict-checked through the Keyboard settings UI.
  • Disables Voice Control shortcut editing when Voice Control itself is off, with a cross-link back to the General Voice Control setting.
  • Adds explicit Fn/Globe handling for macOS-style keyboards:
    • Fn/Globe can be used as a standalone activation shortcut.
    • Fn/Globe combinations remain unsupported because browsers/Electron do not reliably expose them as stable chords.
    • Activation works even when macOS reports the key through event.key, modifier state, or without a stable event.code.
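The three reporting paths above could be folded into one predicate. This is a minimal sketch, not the PR's actual Fn handling: `FnKeyEventLike`, `OTHER_MODIFIERS`, and `isStandaloneFn` are hypothetical names, and the treatment of an empty or `"Unidentified"` key as "no other key identity" is an assumption about how the modifier-state-only case shows up.

```typescript
// Hypothetical structural type: the subset of KeyboardEvent this sketch reads.
interface FnKeyEventLike {
	key: string;
	code: string;
	getModifierState(modifier: string): boolean;
}

// Any of these held together with Fn makes the press a combination, which is unsupported.
const OTHER_MODIFIERS = ["Control", "Alt", "Shift", "Meta"];

function isStandaloneFn(event: FnKeyEventLike): boolean {
	if (OTHER_MODIFIERS.some((m) => event.getModifierState(m))) return false;
	// Path 1 and 2: the key is named directly through event.key or event.code.
	if (event.key === "Fn" || event.code === "Fn") return true;
	// Path 3 (assumption): only the modifier state is set and the key itself is unnamed.
	const keyIsUnnamed = event.key === "" || event.key === "Unidentified";
	return keyIsUnnamed && event.getModifierState("Fn");
}
```

Rejecting the event as soon as another modifier is held keeps the "standalone only" rule in one place, so a press such as Fn+A never reaches the activation path.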

Dictation capture and insertion

  • Adds a renderer-side dictation flow built around MediaRecorder.
  • Adds app-side transcription plumbing through the desktop tRPC router.
  • Inserts transcribed text into the supported focused target instead of blindly pasting globally.
  • Supports chat composers and terminal panes as initial targets.
  • Preserves existing chat drafts and caret-aware insertion behavior.
  • Writes dictated terminal text without auto-executing it.
  • Tracks recent supported focus so activation still works when focus briefly lands on the document body after interacting with the app chrome.
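The two insertion guarantees above (caret-aware chat insertion and non-executing terminal writes) can be illustrated with pure functions. This is a sketch under stated assumptions: `DraftState`, `insertAtCaret`, and `toTerminalSafeText` are hypothetical names, and collapsing line breaks is only one possible way to keep a terminal from running dictated text; the PR may do it differently.

```typescript
// Hypothetical snapshot of a text field: its value plus the current selection.
interface DraftState {
	value: string;
	selectionStart: number;
	selectionEnd: number;
}

// Replaces the current selection (or inserts at the caret) and moves the caret
// to the end of the inserted text, leaving the rest of the draft untouched.
function insertAtCaret(draft: DraftState, text: string): DraftState {
	const before = draft.value.slice(0, draft.selectionStart);
	const after = draft.value.slice(draft.selectionEnd);
	const caret = before.length + text.length;
	return { value: before + text + after, selectionStart: caret, selectionEnd: caret };
}

// Assumption: a carriage return or newline is what would submit a command,
// so every run of line breaks becomes a single space before the text is written.
function toTerminalSafeText(transcript: string): string {
	return transcript.replace(/[\r\n]+/g, " ").trim();
}
```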

User feedback while dictating

  • Adds a minimal Voice Control indicator for recording and processing states.
  • Shows errors when Voice Control is disabled, the microphone is unavailable, no supported target is focused, or transcription fails.
  • Supports press-and-hold behavior: recording starts on shortcut press and ends when the shortcut is released.
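Press-and-hold can be read as a small state machine. The reducer below is a simplified sketch, not the PR's `useVoiceDictation` hook: the `Phase` and `HoldEvent` types and the `nextPhase` function are hypothetical, and it only models three phases.

```typescript
type Phase = "idle" | "listening" | "processing";

type HoldEvent =
	| { type: "keydown"; repeat: boolean }
	| { type: "keyup" }
	| { type: "transcribed" };

function nextPhase(phase: Phase, event: HoldEvent): Phase {
	switch (event.type) {
		case "keydown":
			// Holding a key produces repeated keydown events; only the first press starts recording.
			return phase === "idle" && !event.repeat ? "listening" : phase;
		case "keyup":
			// Releasing the shortcut ends the recording and hands the audio off for transcription.
			return phase === "listening" ? "processing" : phase;
		case "transcribed":
			return phase === "processing" ? "idle" : phase;
	}
}
```

Keeping the transitions in a pure function makes the release behavior easy to test without a microphone or a real keyboard.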

Microphone and permission behavior

  • Reads microphone status without triggering a permission prompt.
  • Prompts only through explicit user action.
  • Keeps normal OS dictation and normal paste behavior working when Voice Control is disabled.
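In Electron's main process, `systemPreferences.getMediaAccessStatus("microphone")` reads the current status without prompting, while `systemPreferences.askForMediaAccess("microphone")` is the call that shows the macOS prompt. The sketch below covers only the read side as a pure mapping; the `MicrophonePermissionStatus` values come from this PR's review discussion, but the mapping itself and the name `toMicrophonePermissionStatus` are assumptions.

```typescript
type MicrophonePermissionStatus = "granted" | "denied" | "promptable" | "unknown";

// `raw` is the string returned by Electron's getMediaAccessStatus.
function toMicrophonePermissionStatus(raw: string): MicrophonePermissionStatus {
	switch (raw) {
		case "granted":
			return "granted";
		case "not-determined":
			// Assumption: the user has not been asked yet, so an explicit action may prompt.
			return "promptable";
		case "denied":
		case "restricted":
			// Assumption: both require a change in System Settings, so the UI treats them alike.
			return "denied";
		default:
			return "unknown";
	}
}
```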

Media

Screen Recording

TODO: Attach a short screen recording showing:

  1. Enabling Voice Control in General settings.
  2. Jumping from General settings to the Voice Control shortcut in Keyboard settings.
  3. Searching voice in Keyboard settings and seeing the Voice Control shortcut below the search bar.
  4. Recording a dictation with press-and-hold activation and inserting text into chat or terminal.

Settings Snapshot

Screenshot 2026-05-28 at 3 11 20 PM
  1. The General > Voice Control section with microphone readiness.
  2. The Keyboard settings search bar with the Voice Control shortcut category directly below it.
Screen.Recording.2026-05-28.at.3.12.09.PM.mov

Validation

  • bun run --cwd apps/desktop typecheck
  • git ls-files -z | xargs -0 bunx @biomejs/biome@2.4.2 check
  • ./scripts/check-desktop-git-env.sh
  • ./scripts/check-git-ref-strings.sh
  • bash ./scripts/check-simple-git-usage.sh
  • bun test apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/useRecordHotkeys.test.ts apps/desktop/src/renderer/hotkeys/hooks/useHotkey/useHotkey.test.tsx apps/desktop/src/renderer/hotkeys/utils/resolveHotkeyFromEvent.test.ts apps/desktop/src/renderer/hotkeys/registry.test.ts apps/desktop/src/renderer/routes/_authenticated/settings/keyboard/voice-shortcut.test.tsx apps/desktop/src/renderer/voice-input/voiceDictationTarget.test.ts apps/desktop/src/renderer/voice-input/useVoiceActivationGuard.test.ts apps/desktop/src/renderer/voice-input/voice-preferences.integration.test.tsx apps/desktop/src/lib/trpc/routers/settings/voice-input.test.ts apps/desktop/src/lib/trpc/routers/permissions.test.ts apps/desktop/src/lib/trpc/routers/permissions/native-permissions.test.ts

Focused voice/hotkey test run result: 126 pass, 0 fail.

Note: the normal bun run lint command cannot complete from my local fork-main checkout because that checkout currently has untracked nested worktrees under .claude/worktrees, and Biome scans those as nested root configurations. The tracked-file Biome check and the lint script's auxiliary checks pass.

Risks and Notes

  • Fn/Globe keyboard reporting varies across macOS keyboards and hardware. This PR handles the forms Electron can expose, but combinations with Fn/Globe are deliberately not supported.
  • Dictation is scoped to known app targets rather than global paste, which is safer but means unsupported surfaces will show a clear target error instead of accepting text.
  • Voice Control starts disabled by default and requires microphone readiness before it can be useful.


Summary by cubic

Adds Voice Control dictation to the desktop app with a configurable shortcut, native settings, and safe insertion into chat and terminal. Default is off; users can enable it in Settings and dictate via press-and-hold with clear recording/processing feedback.

  • New Features

    • Voice Control toggle in Settings > Behavior with microphone readiness.
    • Configurable activation shortcut under a new “Voice Control” category in Keyboard settings; searchable, conflict-checked, reset/unassign supported, and deep-linked from Behavior.
    • macOS Fn/Globe supported as a standalone shortcut; combinations intentionally unsupported.
    • Dictation flow uses MediaRecorder in the renderer and tRPC in the app; transcribes with OpenAI’s gpt-4o-mini-transcribe; press-and-hold to record, release to stop.
    • Inserts transcribed text only into supported, focused targets: chat composers and terminal panes; preserves drafts/caret; terminal insert never executes.
    • Lightweight indicator shows starting/listening/processing/success/error states.
    • Reads microphone status without prompting; prompts only via explicit action.
    • Focus tracking so activation still works across brief focus changes.
    • Electron permission handler allows audio-only media requests from the app window.
  • Migration

    • Local DB adds a voice_input_enabled setting (defaults to false); no manual migration needed.
    • Users must enable Voice Control and grant microphone access.
    • Ensure OpenAI credentials are available (env var or existing Chat credential) for transcription.
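"Defaults to false; no manual migration needed" can be achieved by resolving the default when the setting is read rather than backfilling rows. A minimal sketch, assuming the stored value may be absent for existing rows; `resolveVoiceInputEnabled` is a hypothetical helper name.

```typescript
const DEFAULT_VOICE_INPUT_ENABLED = false;

// Existing rows have no stored value, so null/undefined falls back to the default.
// An explicit `false` is preserved because `??` only replaces null and undefined.
function resolveVoiceInputEnabled(stored: boolean | null | undefined): boolean {
	return stored ?? DEFAULT_VOICE_INPUT_ENABLED;
}
```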

Written for commit 499f1b3. Summary will update on new commits.


Summary by CodeRabbit

  • New Features

    • Added voice dictation capability with OpenAI transcription integration for chat and terminal inputs.
    • Added voice input settings toggle and dedicated keyboard shortcut configuration.
    • Implemented microphone permission status checking and management.
    • Added support for Fn/Globe key recognition in keyboard shortcuts.
  • Tests

    • Added comprehensive test coverage for voice input, microphone permissions, and keyboard shortcuts.
  • Chores

    • Added database migration for voice input settings persistence.


Add droid to HOST_AGENT_PRESETS and BUILTIN_TERMINAL_AGENTS with --auto medium
autonomy, simplify the agent-identity type union, and add branded SVG icons.

Two related performance fixes for the V2 FilesTab:

1. Empty folder on expand (root cause): the lazy-expand detector in
   useFilesTabBridge subscribed to model changes and iterated EVERY
   path in knownPaths on each notification, polling isExpanded() to
   decide whether to fetch children. With large workspaces this was
   O(n) per Pierre subscriber tick, AND racy — if Pierre's internal
   expansion state hadn't flipped by the time the subscriber fired,
   the fetch never started and the folder appeared empty until the
   user pressed Refresh.

   Replaced the polling sweep with a purpose-built
   `unloadedDirCandidatesRef` set that tracks only directories Pierre
   knows about but we haven't loaded yet. The subscriber now iterates
   the candidate set (typically <20 entries) instead of all known
   paths. Candidates are populated as fetchDir discovers child
   directories, added by fs:events for new folders, and cleared on
   workspace switch / doRefresh.

2. Workspace switch lag: FilesTab was issuing its own
   workspace.get.useQuery without staleTime, so every remount fired
   a fresh IPC even though the parent route (V2WorkspacePage) had
   just loaded the same data. Bumped staleTime to 30s so the cache
   serves the duplicate request instantly on remount.
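The candidate-set idea from fix 1 can be sketched in isolation. This is an illustration only, not the code in useFilesTabBridge: the class `UnloadedDirCandidates` and its method names are hypothetical, and `isExpanded` stands in for whatever the tree model exposes.

```typescript
// Tracks directories the tree knows about whose children have not been loaded yet.
class UnloadedDirCandidates {
	private readonly candidates = new Set<string>();

	add(path: string): void {
		this.candidates.add(path);
	}

	// Called on workspace switch or refresh.
	clear(): void {
		this.candidates.clear();
	}

	get size(): number {
		return this.candidates.size;
	}

	// Called from the model subscriber: scans only the candidates (k entries),
	// not every known path (n entries), and removes the ones it returns so
	// each directory is fetched once.
	takeExpanded(isExpanded: (path: string) => boolean): string[] {
		const ready: string[] = [];
		this.candidates.forEach((path) => {
			if (isExpanded(path)) ready.push(path);
		});
		for (const path of ready) this.candidates.delete(path);
		return ready;
	}
}
```

A directory that is not yet expanded when the subscriber fires stays in the set, so a later notification can still pick it up instead of the fetch being skipped for good.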

Files:
- useFilesTabBridge.ts: O(n) → O(k) expansion detection, candidate set
- FilesTab.tsx: staleTime on workspace.get query

Out of scope (deferred): legacy FilesView (separate fix branch exists),
backend listDirectory perf, Pierre Trees upgrade for native onExpand.

Per research into VSCode/JetBrains/Zed/Warp/Cursor: this project already
meets most best-practice items for real-time file tree updates — we use
@parcel/watcher (same library as VSCode), with comprehensive native
excludes for node_modules/.git/dist/etc. The gap is observability, not
architecture.

Adds dev-mode logging at three points in the fs:events pipeline so we
can diagnose any future perceived-lag reports with evidence instead of
speculation:

1. packages/workspace-fs/src/watch.ts — log at parcel callback emission
   (entry point into our pipeline). Gated on
   process.env.SUPERSET_FS_EVENTS_DEBUG=1.

2. packages/host-service/src/events/event-bus.ts — log at sendMessage
   for fs:events payloads (transport boundary). Same env flag.

3. apps/desktop/.../FilesTab/.../useFilesTabBridge.ts — log every
   fs:events handler entry, plus the two surviving early-return paths
   (rootPath-empty belt-and-suspenders; outside-workspace path filter;
   rename fallback when oldKey not in knownPaths). Gated on
   import.meta.env.DEV so logs ship in dev builds only.

To enable Node-side logging during dev:
  SUPERSET_FS_EVENTS_DEBUG=1 SUPERSET_PROFILE=local bun dev --filter=@superset/desktop

Renderer logs are always on in dev builds.
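The Node-side gating can be sketched as a logger factory that reads the flag once. This is a hypothetical helper, not the code added to watch.ts or event-bus.ts: `createFsEventsDebugLog`, the `Env` type, and the log line format are all assumptions; only the flag name `SUPERSET_FS_EVENTS_DEBUG` comes from the commit message.

```typescript
type Env = Record<string, string | undefined>;

// Returns a logger that is a no-op unless SUPERSET_FS_EVENTS_DEBUG is exactly "1".
// The environment and the output sink are injected so the sketch has no Node dependency.
function createFsEventsDebugLog(env: Env, sink: (line: string) => void) {
	const enabled = env["SUPERSET_FS_EVENTS_DEBUG"] === "1";
	return (stage: string, eventCount: number): void => {
		if (!enabled) return;
		sink(`[fs:events] ${stage} count=${eventCount}`);
	};
}
```

In a Node process this would typically be created as `createFsEventsDebugLog(process.env, console.log)`.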

No production behaviour change. No latency tuning — defer until logs
prove throttling (75ms parcel debounce + 200ms ThrottledWorker) is the
dominant source of perceived lag.

# Conflicts:
#	packages/shared/src/agent-identity.ts
#	packages/shared/src/builtin-terminal-agents.ts
#	packages/shared/src/host-agent-presets.ts
#	packages/ui/src/assets/icons/preset-icons/droid-white.svg
#	packages/ui/src/assets/icons/preset-icons/droid.svg

The origin/main merge left duplicate imports, PRESET_ICONS entries,
and re-exports for droidIcon/droidWhiteIcon in the preset-icons index.
Removed the duplicates to fix the esbuild build error.

Builds the desktop app as a real production bundle and installs it to
/Applications, swapping dev .env values (NODE_ENV, public URLs, RELAY_URL,
SUPERSET_WORKSPACE_NAME) for production only during the build, then
restoring the dev .env via an EXIT trap. Avoids the dev-server-URL crash,
the blank-screen RELAY_URL Zod failure, and the wrong-workspace data dir.
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Walkthrough

Introduces desktop voice dictation end-to-end with hotkeys (including Fn/Globe), tRPC/OpenAI transcription, UI integrations (dashboard/chat/terminal), settings with DB migration, and extensive tests. Adds SUPER-869 spec docs and reorders Droid preset icons. Improves FilesTab lazy expansion and adds FS debug logging and a local macOS build script.

Changes

Desktop Voice Input, Hotkeys, and Settings

| Layer | File(s) | Summary |
| --- | --- | --- |
| Voice types, events, terminal targets, dictation target, DB/migration contracts | `apps/desktop/src/shared/constants.ts`, `packages/local-db/drizzle/*`, `packages/local-db/src/schema/schema.ts`, `apps/desktop/src/renderer/voice-input/*` | Defines voice activation/types/events, terminal target registry, dictation target resolution, default enable flag, and DB migration/schema for voice setting. |
| tRPC wiring, voice transcribe router, native permissions, settings API | `apps/desktop/src/lib/trpc/routers/*`, `apps/desktop/src/main/windows/main.ts`, `apps/desktop/package.json` | Adds voiceInput router with OpenAI transcription, refactors microphone permission status, exposes get/set voice setting, wires router, handler, and tests; adds happy-dom for tests. |
| Focus tracking, activation guard, target resolution and integration tests | `apps/desktop/src/renderer/voice-input/*` | Adds focus tracking and activation guard, dictation target tests, and an integration test spanning behavior and keyboard settings with shortcut flows. |
| Dictation hook and indicator component | `apps/desktop/src/renderer/voice-input/hooks/useVoiceDictation/*`, `apps/desktop/src/renderer/voice-input/components/VoiceDictationIndicator/*` | Implements dictation capture via MediaRecorder and a UI indicator; exposes hook/types via barrel export. |
| Dashboard, Chat, Terminal, and TipTap wiring | `apps/desktop/src/renderer/routes/_authenticated/_dashboard/*`, `apps/desktop/src/renderer/components/Chat/*`, `apps/desktop/src/renderer/screens/.../Terminal/Terminal.tsx`, `apps/desktop/src/renderer/components/TiptapPromptEditor/*` | Integrates dictation indicator and voice toggle handling; marks chat/terminal as voice targets; TipTap inserts dictated text via event. |
| Settings UI/routes, shortcut links, settings search, and tests | `apps/desktop/src/renderer/routes/_authenticated/settings/*` | Adds Voice Control section with toggle and microphone readiness, deep-linking between behavior/keyboard pages, settings-search entries, shortcut link constants, and tests. |
| Hotkeys Fn/Globe support, utilities, registry, stores, and tests | `apps/desktop/src/renderer/hotkeys/*` | Adds Fn/Globe handling in display, chord/resolve/record, standalone Fn detection, VOICE_INPUT_TOGGLE registry, persistence via browserLocalStorage, and comprehensive tests. |

SUPER-869 Droid preset specs and preset icon order

| Layer | File(s) | Summary |
| --- | --- | --- |
| SUPER-869 documentation | `.spec/improvements/SUPER-869/*` | Adds BRIEF, SCOPE, and follow-ups docs specifying Droid preset integration details and acceptance criteria. |
| Preset icons reorder for droid | `packages/ui/src/assets/icons/preset-icons/index.ts` | Reorders droid preset icon entries and exports without changing assets. |

FilesTab lazy expansion, FS events debug, and local prod build script

| Layer | File(s) | Summary |
| --- | --- | --- |
| FilesTab query and bridge lazy-expansion | `apps/desktop/src/renderer/routes/.../FilesTab/*` | Adds query staleTime and refactors bridge to track/load unloaded directory candidates with DEV logs. |
| FS watch debug logging | `packages/host-service/src/events/event-bus.ts`, `packages/workspace-fs/src/watch.ts` | Adds DEBUG-gated logs for incoming filesystem event batches. |
| Desktop local production build script | `apps/desktop/scripts/build-local-prod.sh` | Builds and installs a local macOS app with temporary env overrides and safe restoration. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Renderer as Renderer (Dashboard/Chat/Terminal)
  participant Hook as useVoiceDictation
  participant TRPC as tRPC voiceInput.transcribe
  participant OpenAI as OpenAI Transcriptions
  participant Target as Chat/Terminal Target

  User->>Renderer: Press VOICE_INPUT_TOGGLE
  Renderer->>Hook: start(target)
  Hook->>Hook: capture audio (MediaRecorder)
  User-->>Hook: release key (stop)
  Hook->>TRPC: base64 audio + mime
  TRPC->>OpenAI: POST /v1/audio/transcriptions
  OpenAI-->>TRPC: { text }
  TRPC-->>Hook: { text }
  Hook->>Target: insert transcript
  Renderer-->>User: Voice indicator success

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

A rabbit taps the Fn to speak,
Whispered bytes in a tidy streak.
Mic to TRPC, off they glide,
OpenAI returns the guide.
Chat and terminal get their say—
Push-to-talk hops save the day.
Droid waves too—icons in array. 🐇🎙️⌨️


@capy-ai

capy-ai Bot commented May 28, 2026

Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews.

@stage-review

stage-review Bot commented May 28, 2026

@greptile-apps
Contributor

greptile-apps Bot commented May 28, 2026

Greptile Summary

This PR introduces a native Voice Control dictation feature for the desktop app, covering the full end-to-end path from a configurable keyboard shortcut through MediaRecorder audio capture, OpenAI transcription via a new tRPC route, and caret-aware insertion into chat composers and terminal panes.

  • Adds a voiceInputEnabled DB column (migration 0042), tRPC settings endpoints, and a General Settings section with microphone readiness UI and deep links to Keyboard settings.
  • Registers a new VOICE_INPUT_TOGGLE hotkey with explicit Fn/Globe support, press-and-hold stop on key-release, and conflict-check integration; refactors hotkey utilities (chord.ts, fnKey.ts) out of resolveHotkeyFromEvent.ts.
  • Introduces useVoiceDictation, useVoiceActivationGuard, focusTracking, and voiceDictationTarget to manage session lifecycle, focus-fallback logic, and target-specific text insertion with a VOICE_DICTATION_INSERT_EVENT custom event handled by TiptapPromptEditor.

Confidence Score: 4/5

The new feature is off by default and isolated behind the voiceInputEnabled flag, so it poses no risk to users who have not opted in. The one insertion bug only surfaces in an edge case where the TiptapPromptEditor event handler is absent, but it could cause text to land in the wrong input.

The dictation session lifecycle, press-and-hold release tracking, and transcription plumbing are all well-constructed and have solid test coverage. The main concern is getEditableElement in voiceDictationTarget.ts: it returns document.activeElement without verifying the element belongs to the target container, which can misdirect inserted text when the primary event-handler path is unavailable. The rest of the changes — hotkey refactoring, DB migration, permissions rework, and settings UI — are clean and low-risk.

apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts needs the getEditableElement containment fix; apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx should scope its permission polling interval.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts` | Target resolution and text insertion logic; getEditableElement returns document.activeElement without checking it is a descendant of targetElement, risking insertion into an unrelated input when the TiptapPromptEditor event handler is absent. |
| `apps/desktop/src/renderer/voice-input/hooks/useVoiceDictation/useVoiceDictation.ts` | New hook managing MediaRecorder lifecycle, press-and-hold session state, and async transcription. Race conditions between start/stop are handled via refs; cleanup on unmount looks correct. |
| `apps/desktop/src/lib/trpc/routers/voice-input.ts` | New tRPC route that forwards audio to OpenAI's transcription API; correctly validates size bounds, handles error types, and checks OAuth token expiry before use. |
| `apps/desktop/src/renderer/routes/_authenticated/_dashboard/layout.tsx` | Integrates voice dictation into the dashboard layout; hotkey registration with press-and-hold release tracking looks correct; armVoiceReleaseStop correctly cleans up on component unmount via useEffect. |
| `apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx` | Adds Voice Control section with microphone readiness UI and optimistic toggle; permission status polled every 2 s regardless of current state, which is more frequent than necessary for terminal states. |
| `apps/desktop/src/renderer/hotkeys/utils/chord.ts` | New module consolidating chord normalization, AltGr suppression, and terminal-reserved chord logic previously inline in resolveHotkeyFromEvent.ts; logic is unchanged, only extracted. |
| `apps/desktop/src/renderer/hotkeys/utils/resolveHotkeyFromEvent.ts` | Refactored to re-export from the new chord.ts module; Fn/Globe handling added to resolveHotkeyFromEvent; changes are clean and backward-compatible. |
| `apps/desktop/src/renderer/hotkeys/hooks/useHotkey/useHotkey.ts` | Adds a manual window.addEventListener("keydown") path for standalone Fn/Globe shortcuts; optionsRef avoids stale closures in the listener; cleanup via useEffect return looks correct. |
| `packages/local-db/drizzle/0042_add_voice_input_enabled.sql` | Minimal additive migration adding nullable voice_input_enabled column with no default; existing rows stay null and the app applies the DEFAULT_VOICE_INPUT_ENABLED default at read time. |
| `apps/desktop/src/lib/trpc/routers/permissions/native-permissions.ts` | Refactors checkMicrophone to use a richer getMicrophonePermissionStatus helper that maps Electron's four states to the UI-facing MicrophonePermissionStatus type; backward-compatible. |

Sequence Diagram

sequenceDiagram
    participant U as User
    participant KH as useHotkey (VOICE_INPUT_TOGGLE)
    participant G as useVoiceActivationGuard
    participant FT as focusTracking
    participant VD as useVoiceDictation
    participant MR as MediaRecorder
    participant tRPC as tRPC (main process)
    participant OAI as OpenAI Transcriptions API
    participant T as DictationTarget (chat/terminal)

    U->>KH: keydown (shortcut)
    KH->>G: runVoiceActivationHotkeyEvent()
    G->>FT: getFocusedVoiceInputTargetElement()
    FT-->>G: targetElement (via DOM or remembered)
    G-->>KH: "{ status: allowed, target }"
    KH->>KH: armVoiceReleaseStop()
    KH->>T: getFocusedVoiceDictationTarget()
    T-->>KH: "VoiceDictationTarget { insertTranscript }"
    KH->>VD: toggle(target)
    VD->>MR: getUserMedia() then recorder.start(250ms)
    MR-->>VD: "onstart, phase = listening"

    U->>KH: keyup (shortcut released)
    KH->>VD: stop()
    VD->>MR: recorder.stop()
    MR-->>VD: "onstop, phase = processing"
    VD->>VD: blobToBase64(chunks)
    VD->>tRPC: "voiceInput.transcribe({ audioBase64, mimeType })"
    tRPC->>OAI: POST /v1/audio/transcriptions
    OAI-->>tRPC: "{ text }"
    tRPC-->>VD: "{ text }"
    VD->>T: insertTranscript(text)
    T->>T: dispatch VOICE_DICTATION_INSERT_EVENT
    T-->>VD: "handled = true"
    VD-->>U: "phase = success, idle after 1.4s"

Reviews (1): Last reviewed commit: "Merge upstream main into fork main" | Re-trigger Greptile

Comment on lines +23 to +34
	if (activeElement instanceof HTMLInputElement) return activeElement;
	if (activeElement instanceof HTMLTextAreaElement) return activeElement;
	if (activeElement instanceof HTMLElement && activeElement.isContentEditable) {
		return activeElement;
	}
	return targetElement.querySelector(
		"textarea, input, [contenteditable='true']",
	);
}

function insertTextIntoEditable(
	element: HTMLInputElement | HTMLTextAreaElement | HTMLElement,
Contributor

P1 getEditableElement may insert text into the wrong element

document.activeElement is returned without checking that it is a descendant of targetElement. When the TiptapPromptEditor event handler is absent (editor unmounted or not yet attached), the fallback path calls getEditableElement(chatTargetElement). If at that moment document.activeElement is a <textarea> or <input> elsewhere (e.g., an xterm helper textarea that triggered the remembered-target fallback), insertTextIntoEditable will write the dictated transcript into that unrelated element instead of the intended chat input.
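One way to address this is to accept the active element only when the target container contains it, using the standard `Node.contains()` check. The sketch below is not the PR's fix: `NodeLike` and `resolveEditable` are hypothetical names, and the editable check and the `querySelector` fallback are passed in as callbacks so the logic can be shown without a DOM.

```typescript
// Hypothetical structural type: the one DOM method this sketch relies on.
// Real DOM nodes satisfy it through Node.contains().
interface NodeLike {
	contains(other: NodeLike | null): boolean;
}

function resolveEditable<T extends NodeLike>(
	target: NodeLike,
	active: T | null,
	isEditable: (node: T) => boolean,
	queryFallback: () => T | null,
): T | null {
	// Use the focused element only if it is editable AND lives inside the target.
	if (active !== null && isEditable(active) && target.contains(active)) {
		return active;
	}
	// Otherwise fall back to searching within the target itself.
	return queryFallback();
}
```

With this shape, a focused xterm helper textarea outside the chat container is ignored and the lookup falls through to the container's own editable element.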


Comment on lines +49 to +54
	if (!element.isContentEditable) {
		return false;
	}

	element.focus();
	return document.execCommand("insertText", false, text);
Contributor

P2 document.execCommand("insertText") is deprecated and removed from the HTML Living Standard; Chromium/Electron may silently no-op or drop it in a future update. For content-editable elements, prefer document.getSelection() + Range.insertNode() or a direct DOM mutation with a dispatched input event as the fallback.

Suggested change
-	if (!element.isContentEditable) {
-		return false;
-	}
-	element.focus();
-	return document.execCommand("insertText", false, text);
+	if (!element.isContentEditable) {
+		return false;
+	}
+	element.focus();
+	const selection = window.getSelection();
+	if (selection && selection.rangeCount > 0) {
+		const range = selection.getRangeAt(0);
+		range.deleteContents();
+		const textNode = document.createTextNode(text);
+		range.insertNode(textNode);
+		range.setStartAfter(textNode);
+		range.collapse(true);
+		selection.removeAllRanges();
+		selection.addRange(range);
+		element.dispatchEvent(new Event("input", { bubbles: true }));
+		return true;
+	}
+	return document.execCommand("insertText", false, text);

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +261 to +264
});
const requestMicrophone =
electronTrpc.permissions.requestMicrophone.useMutation({
onSettled: () => {
Contributor

P2 Aggressive permission polling while settings are open

permissions.getStatus is refetched every 2 seconds for the entire time the Voice Control section is visible in settings. On macOS this translates to a getMediaAccessStatus IPC call every 2 s, which is unnecessary once the status is "granted" or "denied" — those states won't change without the user acting in System Settings. Consider polling only while status is "promptable" or "unknown", and removing refetchInterval once a terminal state is observed.
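The suggested behavior reduces to a single decision: keep polling only while the status can still change from inside the app. A minimal sketch, assuming the four status values named in this comment; `microphonePollInterval` and `POLL_MS` are hypothetical names. TanStack Query's `refetchInterval` option accepts a number, `false`, or a function, but the callback signature differs between versions, so only the decision is shown here.

```typescript
type MicrophonePermissionStatus = "granted" | "denied" | "promptable" | "unknown";

const POLL_MS = 2000;

// Returns the polling interval in milliseconds, or false to stop polling.
// `undefined` covers the time before the first status response arrives.
function microphonePollInterval(
	status: MicrophonePermissionStatus | undefined,
): number | false {
	// Terminal states only change through System Settings, so polling adds nothing.
	if (status === "granted" || status === "denied") return false;
	return POLL_MS;
}
```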

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx
Line: 261-264

Comment:
**Aggressive permission polling while settings are open**

`permissions.getStatus` is refetched every 2 seconds for the entire time the Voice Control section is visible in settings. On macOS this translates to a `getMediaAccessStatus` IPC call every 2 s, which is unnecessary once the status is `"granted"` or `"denied"` — those states won't change without the user acting in System Settings. Consider polling only while `status` is `"promptable"` or `"unknown"`, and removing `refetchInterval` once a terminal state is observed.

How can I resolve this? If you propose a fix, please make it concise.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (8)
apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts (1)

19-31: ⚡ Quick win

Consider adding a null check for document.activeElement.

The function queries document.activeElement but doesn't verify it's non-null before the instance checks. While activeElement is rarely null in practice, TypeScript's strict null checking and the DOM spec allow it.

🛡️ Proposed defensive null check
 function getEditableElement(
 	targetElement: HTMLElement,
 ): HTMLInputElement | HTMLTextAreaElement | HTMLElement | null {
 	const activeElement = document.activeElement;
+	if (!activeElement) {
+		return targetElement.querySelector(
+			"textarea, input, [contenteditable='true']",
+		);
+	}
 	if (activeElement instanceof HTMLInputElement) return activeElement;
 	if (activeElement instanceof HTMLTextAreaElement) return activeElement;
 	if (activeElement instanceof HTMLElement && activeElement.isContentEditable) {
 		return activeElement;
 	}
 	return targetElement.querySelector(
 		"textarea, input, [contenteditable='true']",
 	);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts` around lines
19 - 31, getEditableElement currently reads document.activeElement without
guarding against null; add a defensive null check for activeElement before
performing instanceof or isContentEditable checks. Update the function
(getEditableElement) to first assign const activeElement =
document.activeElement; if (!activeElement) return
targetElement.querySelector("textarea, input, [contenteditable='true']");
otherwise proceed with the existing instanceof HTMLInputElement /
HTMLTextAreaElement / isContentEditable checks so TypeScript strict-null checks
are satisfied and runtime nulls are handled.
apps/desktop/src/lib/trpc/routers/voice-input.ts (1)

75-138: 💤 Low value

Consider using UNAUTHORIZED for OpenAI authentication errors.

Lines 121-125 map all OpenAI errors (including 401/403 authentication failures) to BAD_REQUEST. For consistency with line 79 (which uses PRECONDITION_FAILED for missing API key), authentication failures from OpenAI could use UNAUTHORIZED to better signal the error category to clients.

♻️ Optional: distinguish auth errors
 if (!response.ok) {
+	const errorMessage = await readOpenAIError(response);
+	if (response.status === 401 || response.status === 403) {
+		throw new TRPCError({
+			code: "UNAUTHORIZED",
+			message: errorMessage,
+		});
+	}
 	throw new TRPCError({
 		code: "BAD_REQUEST",
-		message: await readOpenAIError(response),
+		message: errorMessage,
 	});
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/lib/trpc/routers/voice-input.ts` around lines 75 - 138, The
current voice dictation mutation maps all non-OK OpenAI responses to a
BAD_REQUEST TRPCError; update the error mapping in the mutation that calls
OPENAI_TRANSCRIPTION_URL (inside the .mutation handler) to detect authentication
failures (response.status === 401 || response.status === 403) and throw a
TRPCError with code "UNAUTHORIZED" (using the same message from await
readOpenAIError(response)); keep other non-OK responses as BAD_REQUEST. Ensure
you only change the error branch after the fetch and before parsing
response.json().
apps/desktop/src/main/windows/main.ts (1)

130-146: ⚡ Quick win

Keep the permission policy as audio-only (current voice input doesn’t request video)

  • The only renderer getUserMedia usage for voice dictation requests audio only (no video), so denying audio+video together won’t affect the current microphone flow.
  • Add a brief rationale comment to document that audio && !video is an intentional privacy/security choice.
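
As a sketch of how the documented policy could read, the check quoted in this comment can be isolated as a pure predicate that carries the rationale. The name `isAudioOnlyRequest` and the `MediaType` alias are hypothetical; the boolean expression itself is the one the review quotes.

```typescript
type MediaType = "audio" | "video";

// Voice dictation only ever requests the microphone, so any request that
// includes video is denied on purpose (privacy/security choice).
function isAudioOnlyRequest(
	mediaTypes: readonly MediaType[] | undefined,
): boolean {
	// Treat a missing list as an empty request, which is denied.
	const types = mediaTypes ?? [];
	return types.includes("audio") && !types.includes("video");
}
```

Inside the existing `permission === "media"` branch this would be called as `callback(isAudioOnlyRequest(details.mediaTypes))`, assuming the handler receives Electron's `details` argument; the logic is unchanged either way.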
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/main/windows/main.ts` around lines 130 - 146, Add a brief
inline comment above the window.webContents.session.setPermissionRequestHandler
media-permission branch to state that the app intentionally allows only audio
and explicitly denies combined audio+video for privacy/security—keep the
existing check in the permission === "media" branch that computes mediaTypes and
calls callback(mediaTypes.includes("audio") && !mediaTypes.includes("video"));
do not change logic, only add the explanatory comment near the mediaTypes
handling to document that current getUserMedia use is audio-only (voice
dictation) and video is intentionally disallowed.
apps/desktop/src/lib/trpc/routers/settings/voice-input.test.ts (1)

49-53: ⚡ Quick win

Make migration path resolution independent of process.cwd().

Using process.cwd() here can break when tests run from apps/desktop (or any non-repo-root cwd). Resolve from the test file location instead to avoid environment-dependent failures.

Proposed fix
+import { dirname } from "node:path";
+import { fileURLToPath } from "node:url";
 import { readFileSync } from "node:fs";
 import { resolve } from "node:path";
@@
 function applyVoiceInputMigration() {
+	const currentDir = dirname(fileURLToPath(import.meta.url));
 	const migrationSql = readFileSync(
 		resolve(
-			process.cwd(),
-			"packages/local-db/drizzle/0042_add_voice_input_enabled.sql",
+			currentDir,
+			"../../../../../../../packages/local-db/drizzle/0042_add_voice_input_enabled.sql",
 		),
 		"utf8",
 	);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/lib/trpc/routers/settings/voice-input.test.ts` around lines
49 - 53, The test currently resolves the migration SQL using process.cwd() when
reading migrationSql, which makes the path environment-dependent; change the
resolution to be relative to the test file instead (use __dirname or convert
import.meta.url to a file path) and update the resolve(...) call that wraps
readFileSync so it builds the path from the test module location to
packages/local-db/drizzle/0042_add_voice_input_enabled.sql; keep the
readFileSync(migrationSql) usage but replace process.cwd() with a path computed
from the test file's directory so tests run correctly regardless of the current
working directory.
apps/desktop/src/renderer/routes/_authenticated/_dashboard/layout.tsx (1)

131-132: 💤 Low value

Verify event listener capture phase consistency.

The keyup listener uses capture phase (true) while the blur listener does not. This is likely intentional since keyup needs to intercept before other handlers, and blur follows standard focus event handling. Confirm this mixed approach aligns with the intended press-and-hold behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/renderer/routes/_authenticated/_dashboard/layout.tsx` around
lines 131 - 132, The two event listeners use inconsistent capture settings:
window.addEventListener("keyup", stopFromRelease, true) uses capture while
window.addEventListener("blur", stopFromRelease) does not; confirm intended
press-and-hold behavior and make the phase consistent. If you need to intercept
blur during capture like keyup, add the third arg true to the blur listener
(referencing stopFromRelease), otherwise remove the capture flag from the keyup
listener so both use the bubble phase; update whichever call (the
addEventListener for "keyup" or "blur") so both use the same capture boolean and
add a brief inline comment explaining why capture was chosen for
stopFromRelease.
apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx (1)

257-261: 💤 Low value

Verify polling interval for permission status.

The 2-second refetchInterval for microphone permission status might be aggressive. While gated by showVoiceInput, confirm this polling frequency is acceptable for permission checks, especially on systems where these checks may incur non-trivial overhead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx`
around lines 257 - 261, The permission polling in
electronTrpc.permissions.getStatus.useQuery is currently set to refetchInterval:
2000 while only gated by showVoiceInput; reduce the polling frequency or switch
to event-driven checks to avoid unnecessary overhead — e.g., change
refetchInterval to a less aggressive value (like 10_000+ ms) or remove periodic
polling and trigger a refetch when the component mounts, gains focus, or when
showVoiceInput toggles; update the options passed to useQuery accordingly so
microphone permission checks are less frequent and only run when needed.
apps/desktop/src/renderer/hotkeys/stores/keyboardLayoutStore.ts (1)

52-69: ⚡ Quick win

Import/subscribe failures escape the retry path.

onError only handles failures emitted by an established subscription. If await import("renderer/lib/trpc-client") (or the synchronous subscribe call) rejects/throws, the promise from startKeyboardLayoutSync() rejects and is swallowed by void, so there's no retry and map stays null indefinitely. Wrapping the body in try/catch that schedules the same backoff retry would make startup resilient.

♻️ Proposed guard
 async function startKeyboardLayoutSync(): Promise<void> {
-	const { electronTrpcClient } = await import("renderer/lib/trpc-client");
-	electronTrpcClient.keyboardLayout.changes.subscribe(undefined, {
-		onData: (data) => {
-			retryAttempt = 0;
-			applySnapshot(data);
-		},
-		onError: (err) => {
-			console.error("[keyboardLayoutStore] subscription error:", err);
-			const idx = Math.min(retryAttempt, RETRY_BACKOFF_MS.length - 1);
-			const delay = RETRY_BACKOFF_MS[idx] ?? 10_000;
-			retryAttempt++;
-			setTimeout(() => {
-				void startKeyboardLayoutSync();
-			}, delay);
-		},
-	});
+	const scheduleRetry = (err: unknown) => {
+		console.error("[keyboardLayoutStore] subscription error:", err);
+		const idx = Math.min(retryAttempt, RETRY_BACKOFF_MS.length - 1);
+		const delay = RETRY_BACKOFF_MS[idx] ?? 10_000;
+		retryAttempt++;
+		setTimeout(() => {
+			void startKeyboardLayoutSync();
+		}, delay);
+	};
+	try {
+		const { electronTrpcClient } = await import("renderer/lib/trpc-client");
+		electronTrpcClient.keyboardLayout.changes.subscribe(undefined, {
+			onData: (data) => {
+				retryAttempt = 0;
+				applySnapshot(data);
+			},
+			onError: scheduleRetry,
+		});
+	} catch (err) {
+		scheduleRetry(err);
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/renderer/hotkeys/stores/keyboardLayoutStore.ts` around lines
52 - 69, startKeyboardLayoutSync currently only retries on subscription onError
but will fail-fast if the dynamic import or the synchronous subscribe call
throws; wrap the entire body of startKeyboardLayoutSync in a try/catch so any
thrown/rejected error is caught, log the error, compute the same backoff delay
using retryAttempt and RETRY_BACKOFF_MS, increment retryAttempt, and schedule a
retry by calling startKeyboardLayoutSync after that delay; keep the existing
onError retry path for subscription errors and ensure applySnapshot and
electronTrpcClient.keyboardLayout.changes.subscribe remain unchanged inside the
try block.
apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/useRecordHotkeys.ts (1)

108-120: ⚡ Quick win

Drop FnLock from the unsupported-Fn gate (it’s not exposed via KeyboardEvent)

getUnsupportedShortcutReason treats FnLock the same as transient Fn, but KeyboardEvent.getModifierState("FnLock") isn’t supported/exposed in browsers (including macOS Chromium), so the “FnLock blocks recording every shortcut” case is unlikely. If fnActive ever does evaluate true on some platform, this check runs before the Backspace/Delete unassign branch, which would then be blocked too.

Consider gating on getModifierState("Fn") only (or removing the FnLock check).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/useRecordHotkeys.ts`
around lines 108 - 120, The gate that sets fnActive should stop checking for
"FnLock" because KeyboardEvent.getModifierState("FnLock") isn’t exposed; update
the fnActive assignment to only use event.getModifierState?.("Fn") === true
(remove the "FnLock" check) so the existing logic in
getUnsupportedShortcutReason (the fnActive, hasNonFnKey,
isFnShortcutToken(key|code), UNSUPPORTED_FN_SHORTCUT_REASON branch) continues to
work correctly and does not incorrectly block the Backspace/Delete unassign
branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.spec/improvements/SUPER-869/BRIEF.md:
- Around line 47-53: The fenced code block containing keys id, label,
description, command and promptCommand lacks a language specifier; update the
opening backticks for that block (the triple-tick before the lines with id:
"droid" ...) to include a language (for example ```yaml or ```typescript) so the
snippet is markdown-compliant and gets proper syntax highlighting.
- Around line 32-41: The fenced code block containing the keys presetId, label,
description, command, args, promptTransport, promptArgs, and env should include
a language specifier; update the opening fence from ``` to ```typescript (or
another appropriate language) so the block becomes ```typescript and enables
proper syntax highlighting for the snippet in BRIEF.md.

In
`@apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx`:
- Around line 408-423: Update the conditional className/markup for the <p
id="voice-input-status"> so that when setVoiceInputEnabled.isError is true the
rendered element includes the explicit "select-text cursor-text" classes (e.g.,
by appending these classes to the error branch of the className expression or
wrapping the error string in a span with those classes); keep the existing
non-error classes ("text-xs text-destructive" vs "text-xs
text-muted-foreground") for other states and only apply select-text cursor-text
to the error message rendered by setVoiceInputEnabled.isError.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 32d53db8-1f94-4c16-ac23-49b291f7b344

📥 Commits

Reviewing files that changed from the base of the PR and between dbf96b5 and 499f1b3.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (74)
  • .spec/improvements/SUPER-869/BRIEF.md
  • .spec/improvements/SUPER-869/SCOPE.md
  • .spec/improvements/SUPER-869/follow-ups.md
  • apps/desktop/package.json
  • apps/desktop/scripts/build-local-prod.sh
  • apps/desktop/src/lib/trpc/routers/index.ts
  • apps/desktop/src/lib/trpc/routers/permissions.test.ts
  • apps/desktop/src/lib/trpc/routers/permissions/native-permissions.test.ts
  • apps/desktop/src/lib/trpc/routers/permissions/native-permissions.ts
  • apps/desktop/src/lib/trpc/routers/settings/index.ts
  • apps/desktop/src/lib/trpc/routers/settings/voice-input.test.ts
  • apps/desktop/src/lib/trpc/routers/voice-input.ts
  • apps/desktop/src/main/windows/main.ts
  • apps/desktop/src/renderer/components/Chat/ChatInterface/components/ChatInputFooter/ChatInputFooter.tsx
  • apps/desktop/src/renderer/components/Chat/ChatInterface/components/TiptapPromptEditor/TiptapPromptEditor.tsx
  • apps/desktop/src/renderer/hotkeys/display.ts
  • apps/desktop/src/renderer/hotkeys/hooks/index.ts
  • apps/desktop/src/renderer/hotkeys/hooks/useHotkey/useHotkey.test.tsx
  • apps/desktop/src/renderer/hotkeys/hooks/useHotkey/useHotkey.ts
  • apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/index.ts
  • apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/useRecordHotkeys.test.ts
  • apps/desktop/src/renderer/hotkeys/hooks/useRecordHotkeys/useRecordHotkeys.ts
  • apps/desktop/src/renderer/hotkeys/index.ts
  • apps/desktop/src/renderer/hotkeys/registry.test.ts
  • apps/desktop/src/renderer/hotkeys/registry.ts
  • apps/desktop/src/renderer/hotkeys/stores/browserLocalStorage.ts
  • apps/desktop/src/renderer/hotkeys/stores/hotkeyOverridesStore.ts
  • apps/desktop/src/renderer/hotkeys/stores/keyboardLayoutStore.ts
  • apps/desktop/src/renderer/hotkeys/stores/keyboardPreferencesStore.ts
  • apps/desktop/src/renderer/hotkeys/types.ts
  • apps/desktop/src/renderer/hotkeys/utils/binding.ts
  • apps/desktop/src/renderer/hotkeys/utils/chord.ts
  • apps/desktop/src/renderer/hotkeys/utils/fnKey.ts
  • apps/desktop/src/renderer/hotkeys/utils/index.ts
  • apps/desktop/src/renderer/hotkeys/utils/resolveHotkeyFromEvent.test.ts
  • apps/desktop/src/renderer/hotkeys/utils/resolveHotkeyFromEvent.ts
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/layout.tsx
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/components/WorkspaceSidebar/components/FilesTab/FilesTab.tsx
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/components/WorkspaceSidebar/components/FilesTab/hooks/useFilesTabBridge/useFilesTabBridge.ts
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/ChatPane/components/WorkspaceChatInterface/components/ChatInputFooter/ChatInputFooter.tsx
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.microphone-readiness.test.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.test.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.voice-shortcut-link.test.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/behavior/page.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/keyboard/page.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/keyboard/voice-shortcut.test.tsx
  • apps/desktop/src/renderer/routes/_authenticated/settings/utils/settings-search/settings-search.test.ts
  • apps/desktop/src/renderer/routes/_authenticated/settings/utils/settings-search/settings-search.ts
  • apps/desktop/src/renderer/routes/_authenticated/settings/utils/voice-shortcut-links/index.ts
  • apps/desktop/src/renderer/routes/_authenticated/settings/utils/voice-shortcut-links/voice-shortcut-links.ts
  • apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/TabsContent/Terminal/Terminal.tsx
  • apps/desktop/src/renderer/voice-input/components/VoiceDictationIndicator/VoiceDictationIndicator.tsx
  • apps/desktop/src/renderer/voice-input/components/VoiceDictationIndicator/index.ts
  • apps/desktop/src/renderer/voice-input/events.ts
  • apps/desktop/src/renderer/voice-input/focusTracking.ts
  • apps/desktop/src/renderer/voice-input/hooks/useVoiceDictation/index.ts
  • apps/desktop/src/renderer/voice-input/hooks/useVoiceDictation/useVoiceDictation.ts
  • apps/desktop/src/renderer/voice-input/terminalVoiceTargets.ts
  • apps/desktop/src/renderer/voice-input/types.ts
  • apps/desktop/src/renderer/voice-input/useVoiceActivationGuard.test.ts
  • apps/desktop/src/renderer/voice-input/useVoiceActivationGuard.ts
  • apps/desktop/src/renderer/voice-input/voice-preferences.integration.test.tsx
  • apps/desktop/src/renderer/voice-input/voiceDictationTarget.test.ts
  • apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts
  • apps/desktop/src/shared/constants.ts
  • packages/host-service/src/events/event-bus.ts
  • packages/local-db/drizzle/0042_add_voice_input_enabled.sql
  • packages/local-db/drizzle/meta/0042_snapshot.json
  • packages/local-db/drizzle/meta/_journal.json
  • packages/local-db/src/schema/schema.ts
  • packages/ui/src/assets/icons/preset-icons/index.ts
  • packages/workspace-fs/src/watch.ts

Comment thread .spec/improvements/SUPER-869/BRIEF.md Outdated
Comment thread .spec/improvements/SUPER-869/BRIEF.md Outdated
Comment on lines +408 to +423
<p
	id="voice-input-status"
	className={
		setVoiceInputEnabled.isError
			? "text-xs text-destructive"
			: "text-xs text-muted-foreground"
	}
>
	{setVoiceInputEnabled.isError
		? "Voice preference could not be saved"
		: isVoiceInputLoading
			? "Loading voice preference"
			: voiceInputEnabled
				? "Voice control is enabled"
				: "Voice control is disabled"}
</p>

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add select-text cursor-text classes to error message.

The error text "Voice preference could not be saved" violates the coding guideline requiring error text to be selectable with explicit select-text cursor-text classes in apps/desktop/**/*.{tsx,jsx} files.

🔧 Proposed fix
 <p
   id="voice-input-status"
-  className={
+  className={cn(
+    "select-text cursor-text",
     setVoiceInputEnabled.isError
       ? "text-xs text-destructive"
       : "text-xs text-muted-foreground"
-  }
+  )}
 >

As per coding guidelines: Error text must be selectable by users with explicit select-text cursor-text classes; renderer sets user-select: none on body.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 <p
 	id="voice-input-status"
-	className={
+	className={cn(
+		"select-text cursor-text",
 		setVoiceInputEnabled.isError
 			? "text-xs text-destructive"
 			: "text-xs text-muted-foreground"
-	}
+	)}
 >
 	{setVoiceInputEnabled.isError
 		? "Voice preference could not be saved"
 		: isVoiceInputLoading
 			? "Loading voice preference"
 			: voiceInputEnabled
 				? "Voice control is enabled"
 				: "Voice control is disabled"}
 </p>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/desktop/src/renderer/routes/_authenticated/settings/behavior/components/BehaviorSettings/BehaviorSettings.tsx`
around lines 408 - 423, Update the conditional className/markup for the <p
id="voice-input-status"> so that when setVoiceInputEnabled.isError is true the
rendered element includes the explicit "select-text cursor-text" classes (e.g.,
by appending these classes to the error branch of the className expression or
wrapping the error string in a span with those classes); keep the existing
non-error classes ("text-xs text-destructive" vs "text-xs
text-muted-foreground") for other states and only apply select-text cursor-text
to the error message rendered by setVoiceInputEnabled.isError.

@cubic-dev-ai cubic-dev-ai Bot left a comment

6 issues found across 75 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/desktop/src/renderer/hotkeys/stores/browserLocalStorage.ts">

<violation number="1" location="apps/desktop/src/renderer/hotkeys/stores/browserLocalStorage.ts:6">
P2: Local storage operations are not exception-safe; `window.localStorage` access should be wrapped in try/catch to prevent DOMException from crashing zustand persist flows.</violation>
</file>
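
A minimal sketch of an exception-safe wrapper for the violation above. The `StringStorage` interface and `createSafeStorage` are hypothetical names, not code from the PR; the backend is resolved lazily because reading `window.localStorage` can itself throw when storage is blocked.

```typescript
// Minimal string storage shape (assumed to match what the persist layer calls).
interface StringStorage {
	getItem(name: string): string | null;
	setItem(name: string, value: string): void;
	removeItem(name: string): void;
}

// Wraps a storage backend so that thrown errors degrade to no-ops.
function createSafeStorage(getBackend: () => StringStorage): StringStorage {
	return {
		getItem(name) {
			try {
				return getBackend().getItem(name);
			} catch {
				return null; // Treat an unreadable store as empty.
			}
		},
		setItem(name, value) {
			try {
				getBackend().setItem(name, value);
			} catch {
				// Quota exceeded or storage blocked: drop the write.
			}
		},
		removeItem(name) {
			try {
				getBackend().removeItem(name);
			} catch {
				// Nothing to clean up if storage is unavailable.
			}
		},
	};
}
```

In the renderer this would be constructed as `createSafeStorage(() => window.localStorage)`. Silently dropping failed writes is one possible policy; logging them may be preferable.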

<file name="apps/desktop/src/renderer/hotkeys/utils/chord.ts">

<violation number="1" location="apps/desktop/src/renderer/hotkeys/utils/chord.ts:63">
P1: AltGraph text-entry events are normalized into plain-key chords, which can cause false-positive hotkey matches on international keyboards.</violation>
</file>

<file name="apps/desktop/src/lib/trpc/routers/voice-input.ts">

<violation number="1" location="apps/desktop/src/lib/trpc/routers/voice-input.ts:71">
P1: audioBase64 input is decoded into memory before size limits are enforced, enabling potential memory exhaustion via an oversized payload.</violation>
</file>

<file name="apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts">

<violation number="1" location="apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts:23">
P1: Only use `document.activeElement` when it belongs to the current voice target container. Without a containment check, dictation can be inserted into whichever unrelated input is currently focused.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant User
    participant UI as Desktop UI
    participant Hotkey as Hotkey System
    participant Guard as Voice Activation Guard
    participant Settings as Settings Service
    participant Mic as MediaRecorder
    participant TRPC as tRPC Router
    participant OpenAI as OpenAI API
    participant Target as Voice Dictation Target

    Note over User,Target: Voice Control Dictation Flow

    User->>UI: Enable Voice Control in Settings
    UI->>Settings: setVoiceInputEnabled(true)
    Settings-->>UI: persisted

    User->>UI: Press activation shortcut (e.g. ⌘⇧V)
    UI->>Hotkey: VOICE_INPUT_TOGGLE fires
    Hotkey->>Guard: runVoiceActivationHotkeyEvent()
    
    alt Voice Control disabled
        Guard-->>UI: { status: "disabled" }
        UI->>UI: Show error: "Voice Control is off"
    else No supported target focused
        Guard->>Guard: evaluateVoiceActivationGuard()
        Guard-->>UI: { status: "unsupported-target" }
        UI->>UI: Show error: "Focus chat or terminal"
    else Activation allowed
        Guard->>Guard: check target (chat/terminal)
        Guard-->>UI: { status: "allowed", target }
        UI->>UI: Arm release-to-stop handler
    end

    Note over UI,Target: Dictation Recording (press-and-hold)

    UI->>Mic: start(target)
    Mic->>Mic: getUserMedia({ audio })
    Mic-->>UI: MediaStream
    UI->>Mic: new MediaRecorder(stream)
    Mic-->>UI: Recording chunks

    Note over UI,Target: Shortcut Released

    User->>UI: Release activation key
    UI->>Mic: stop()
    Mic->>UI: onstop fires with audio chunks

    UI->>UI: Build audio Blob from chunks
    UI->>TRPC: transcribe({ audioBase64, mimeType })

    Note over TRPC,OpenAI: App-Side Transcription via tRPC

    TRPC->>TRPC: resolveOpenAIApiKey()
    alt No API key found
        TRPC-->>UI: TRPCError PRECONDITION_FAILED
        UI->>UI: Show error: "Connect OpenAI in Settings"
    else API key exists
        TRPC->>OpenAI: POST /v1/audio/transcriptions
        Note over TRPC,OpenAI: FormData: file, model=gpt-4o-mini-transcribe
        OpenAI-->>TRPC: { text: "transcribed text" }
        TRPC-->>UI: { text }
    end

    Note over UI,Target: Insert Transcription into Target

    UI->>Target: insertTranscript(text)
    
    alt Target is chat
        Target->>Target: dispatch CustomEvent on [data-voice-input-target]
        Note over Target: Chat components listen and insert via editor.chain().insertContent()
        Target-->>UI: true
    else Target is terminal
        Target->>Target: terminalRuntimeRegistry.writeInput()
        Target-->>UI: true
    end

    UI->>UI: Show success indicator (1.4s)
    UI->>UI: Hide indicator, return to idle
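
The guard branch of the diagram above can be read as a small discriminated union. The sketch below is illustrative only: `evaluateGuardSketch` and `guardMessage` are hypothetical names, and the real `evaluateVoiceActivationGuard()` may take different inputs. The status values and error strings are taken from the diagram.

```typescript
type VoiceTarget = "chat" | "terminal";

// One variant per branch of the "alt" block in the diagram.
type GuardResult =
  | { status: "disabled" }
  | { status: "unsupported-target" }
  | { status: "allowed"; target: VoiceTarget };

// Hypothetical stand-in for the guard: assumes the caller already knows
// whether Voice Control is enabled and which supported target has focus.
function evaluateGuardSketch(
  enabled: boolean,
  focusedTarget: VoiceTarget | null,
): GuardResult {
  if (!enabled) return { status: "disabled" };
  if (focusedTarget === null) return { status: "unsupported-target" };
  return { status: "allowed", target: focusedTarget };
}

// Maps a guard result to the error copy shown in the diagram, or null
// when activation is allowed and recording can start.
function guardMessage(result: GuardResult): string | null {
  switch (result.status) {
    case "disabled":
      return "Voice Control is off";
    case "unsupported-target":
      return "Focus chat or terminal";
    case "allowed":
      return null;
  }
}
```

Modelling the result this way means the `target` field is only reachable after checking `status === "allowed"`, which matches the diagram's ordering of guard check before `start(target)`.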

const key = normalizeToken(event.code);
if (isIgnorableKey(key)) return null;
// AltGr is reported by Chromium as ctrlKey+altKey on Windows/Linux.
const altGraph = event.getModifierState?.("AltGraph") === true;

P1: AltGraph text-entry events are normalized into plain-key chords, which can cause false-positive hotkey matches on international keyboards.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/desktop/src/renderer/hotkeys/utils/chord.ts, line 63:

<comment>AltGraph text-entry events are normalized into plain-key chords, which can cause false-positive hotkey matches on international keyboards.</comment>

<file context>
@@ -0,0 +1,92 @@
+	const key = normalizeToken(event.code);
+	if (isIgnorableKey(key)) return null;
+	// AltGr is reported by Chromium as ctrlKey+altKey on Windows/Linux.
+	const altGraph = event.getModifierState?.("AltGraph") === true;
+	const mods: string[] = [];
+	if (event.metaKey) mods.push("meta");
</file context>
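
One possible shape for a fix, sketched under assumptions: treat a keystroke as text entry (and produce no chord) when AltGraph is active and the event's `key` is a single printable character. `KeyEventLike`, `isAltGraphTextEntry`, and `chordFromEvent` are hypothetical names, and lower-casing `event.code` merely stands in for the PR's `normalizeToken`, whose behavior is not shown here.

```typescript
// Structural subset of KeyboardEvent so the sketch runs without a DOM.
interface KeyEventLike {
  code: string;
  key: string;
  ctrlKey: boolean;
  altKey: boolean;
  shiftKey: boolean;
  metaKey: boolean;
  getModifierState?: (modifier: string) => boolean;
}

// True when AltGr is held and the keystroke yields a single character,
// e.g. AltGr+Q producing "@" on a German layout.
function isAltGraphTextEntry(event: KeyEventLike): boolean {
  const altGraph = event.getModifierState?.("AltGraph") === true;
  return altGraph && event.key.length === 1;
}

// Returns a chord string such as "ctrl+alt+keyq", or null when the
// event should be left alone as ordinary text input.
function chordFromEvent(event: KeyEventLike): string | null {
  if (isAltGraphTextEntry(event)) return null;
  const mods: string[] = [];
  if (event.metaKey) mods.push("meta");
  if (event.ctrlKey) mods.push("ctrl");
  if (event.altKey) mods.push("alt");
  if (event.shiftKey) mods.push("shift");
  // Assumption: lower-casing stands in for the PR's normalizeToken().
  return [...mods, event.code.toLowerCase()].join("+");
}
```

Whether AltGr combinations should ever be bindable as shortcuts is a product decision; this sketch simply opts them out whenever they produce text.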

transcribe: publicProcedure
.input(
z.object({
audioBase64: z.string().min(1),

P1: audioBase64 input is decoded into memory before size limits are enforced, enabling potential memory exhaustion via an oversized payload.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/desktop/src/lib/trpc/routers/voice-input.ts, line 71:

<comment>audioBase64 input is decoded into memory before size limits are enforced, enabling potential memory exhaustion via an oversized payload.</comment>

<file context>
@@ -0,0 +1,140 @@
+		transcribe: publicProcedure
+			.input(
+				z.object({
+					audioBase64: z.string().min(1),
+					mimeType: z.string().min(1).max(120),
+				}),
</file context>
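
A cheap way to enforce the limit before decoding is to bound the base64 string length, since every 3 bytes encode to 4 characters. The sketch below is a suggestion, not the PR's code: `MAX_AUDIO_BYTES`, `maxBase64Length`, and `assertAudioWithinLimit` are hypothetical, and the 25 MiB cap is an assumed value chosen for illustration.

```typescript
// Assumed cap for illustration; the right value depends on what the
// transcription endpoint and the app are willing to accept.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;

// Longest padded base64 string that can decode to at most maxBytes.
function maxBase64Length(maxBytes: number): number {
  return 4 * Math.ceil(maxBytes / 3);
}

// Rejects oversized payloads by string length alone, so nothing is
// decoded into a buffer until the size is known to be acceptable.
function assertAudioWithinLimit(
  audioBase64: string,
  maxBytes: number = MAX_AUDIO_BYTES,
): void {
  if (audioBase64.length > maxBase64Length(maxBytes)) {
    throw new RangeError(`audio payload exceeds ${maxBytes} bytes`);
  }
}
```

The same bound could also be expressed directly in the input schema, for example by adding `.max(maxBase64Length(MAX_AUDIO_BYTES))` to the `audioBase64` string validator, so oversized requests fail validation before the handler runs.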

targetElement: HTMLElement,
): HTMLInputElement | HTMLTextAreaElement | HTMLElement | null {
const activeElement = document.activeElement;
if (activeElement instanceof HTMLInputElement) return activeElement;

P1: Only use document.activeElement when it belongs to the current voice target container. Without a containment check, dictation can be inserted into whichever unrelated input is currently focused.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/desktop/src/renderer/voice-input/voiceDictationTarget.ts, line 23:

<comment>Only use `document.activeElement` when it belongs to the current voice target container. Without a containment check, dictation can be inserted into whichever unrelated input is currently focused.</comment>

<file context>
@@ -0,0 +1,145 @@
+	targetElement: HTMLElement,
+): HTMLInputElement | HTMLTextAreaElement | HTMLElement | null {
+	const activeElement = document.activeElement;
+	if (activeElement instanceof HTMLInputElement) return activeElement;
+	if (activeElement instanceof HTMLTextAreaElement) return activeElement;
+	if (activeElement instanceof HTMLElement && activeElement.isContentEditable) {
</file context>
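
The containment check being requested could look roughly like this. It is a sketch with hypothetical names (`NodeLike`, `resolveActiveWithin`); in the renderer the arguments would presumably be `targetElement` and `document.activeElement`, and the check relies on the standard `Node.contains()` method, which also returns true when both arguments are the same node.

```typescript
// Structural subset of the DOM Node interface so the sketch runs
// without a browser environment.
interface NodeLike {
  contains(other: NodeLike | null): boolean;
}

// Returns the active element only when it sits inside the voice target
// container; otherwise returns null so the caller can fall back to a
// lookup scoped to the container instead of an unrelated focused input.
function resolveActiveWithin<T extends NodeLike>(
  container: NodeLike,
  active: T | null,
): T | null {
  if (active === null) return null;
  return container.contains(active) ? active : null;
}
```

With this in place, the `instanceof HTMLInputElement` and related checks from the snippet above would run only on an element that is known to belong to the target.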

@@ -0,0 +1,16 @@
import type { StateStorage } from "zustand/middleware";

P2: Local storage operations are not exception-safe; window.localStorage access should be wrapped in try/catch to prevent DOMException from crashing zustand persist flows.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/desktop/src/renderer/hotkeys/stores/browserLocalStorage.ts, line 6:

<comment>Local storage operations are not exception-safe; `window.localStorage` access should be wrapped in try/catch to prevent DOMException from crashing zustand persist flows.</comment>

<file context>
@@ -0,0 +1,16 @@
+export const browserLocalStorage: StateStorage = {
+	getItem: (name) => {
+		if (typeof window === "undefined") return null;
+		return window.localStorage.getItem(name);
+	},
+	removeItem: (name) => {
</file context>
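
An exception-safe variant might wrap every storage call so that a thrown `DOMException` degrades to a no-op. The sketch below is an assumption about one reasonable shape: `StorageLike` mirrors the synchronous `getItem`/`setItem`/`removeItem` methods used in the snippet above, and `createSafeStorage` is a hypothetical helper that is not part of the PR or of zustand.

```typescript
// Minimal structural type so the sketch runs without zustand or a DOM.
interface StorageLike {
  getItem(name: string): string | null;
  setItem(name: string, value: string): void;
  removeItem(name: string): void;
}

// Hypothetical helper: resolves the backing storage lazily and swallows
// any exception (including one thrown while resolving it).
function createSafeStorage(
  resolve: () => StorageLike | undefined,
): StorageLike {
  function attempt<T>(fallback: T, op: (storage: StorageLike) => T): T {
    try {
      const storage = resolve();
      return storage ? op(storage) : fallback;
    } catch {
      // Storage is unavailable or over quota: behave as if it were empty.
      return fallback;
    }
  }
  return {
    getItem: (name) => attempt<string | null>(null, (s) => s.getItem(name)),
    setItem: (name, value) => attempt<void>(undefined, (s) => s.setItem(name, value)),
    removeItem: (name) => attempt<void>(undefined, (s) => s.removeItem(name)),
  };
}
```

In the renderer this could be wired as `createSafeStorage(() => (typeof window === "undefined" ? undefined : window.localStorage))`. Silently dropping writes is a trade-off; logging inside the `catch` may be preferable if lost hotkey preferences need to be diagnosable.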

Comment thread apps/desktop/scripts/build-local-prod.sh Outdated
Comment thread apps/desktop/scripts/build-local-prod.sh Outdated
@jylee1-jp

RAI v2 — Summary

PR-level: BLOCK (0 approved, 1 block, 2 abstain, 0 defer)

Archetype classification

General PR (general) — kind: explicit, confidence: 95%

Large feature PR (+6682/-203, 71 files) adding Voice Control dictation end-to-end: new tRPC routes, DB migration, settings UI, MediaRecorder/OpenAI transcription pipeline, keyboard shortcut registration, and focus tracking. New business logic, API surfaces, and dependencies disqualify small-refactoring. Commit history also reveals unrelated concerns merged in — file navigator lag fix (perf commit + useFilesTabBridge changes) and Factory Droid preset addition — making this a mixed-concern PR; the voice control, performance fix, and preset feature should ideally be split into separate PRs for cleaner review and rollback granularity.

Per-strategy verdicts

preconditions — BLOCK

Files: (no files)
Irreversibility: reversible
Clear conditions:

  • The diff contains multiple unrelated concerns — file navigator lag fix (commits c40da1b, bf6d115, useFilesTabBridge.ts +68/-11, FilesTab.tsx +9/-3), Factory Droid preset addition (b0f27a5, preset-icons/index.ts), fs:events observability (packages/workspace-fs/src/watch.ts, packages/host-service/src/events/event-bus.ts), and build-local-prod.sh (074cad6) — none of which appear in the PR description. Either add a dedicated section for each bundled concern (Why/Scope/Approach) or split them into separate PRs so downstream strategies can evaluate each change against stated intent and assess rollback scope independently. (resolve via: pr_description, pr_comment)
  • The Media > Screen Recording section is explicitly marked 'TODO'. Attach the recording (or an external link) showing the enable flow, settings deep-link navigation, shortcut search placement, and a press-and-hold dictation insertion. A UX-heavy feature of this scope — spanning settings UI, keyboard shortcut editing, and two distinct text-insertion targets — cannot be adequately reviewed for correctness without it. (resolve via: pr_description, pr_comment)
Findings:
  • [high] Multiple unrelated concerns bundled without description. File navigator lag fix (commits c40da1b, bf6d115), Factory Droid preset (b0f27a5), fs:events observability, and build-local-prod.sh (074cad6) appear in the diff but are not mentioned in the PR description. Downstream strategies have no stated intent to validate these changes against. (PR commit list; useFilesTabBridge.ts, FilesTab.tsx, preset-icons/index.ts, packages/host-service/src/events/event-bus.ts, packages/workspace-fs/src/watch.ts)
  • [medium] Screen recording is explicitly TODO. UX flows (press-and-hold activation, settings deep-link navigation, shortcut search result placement, terminal text insertion without execution) cannot be verified from description text alone. (PR description, Media > Screen Recording)
  • [low] Full lint check (bun run lint) could not complete from author's checkout; a tracked-file Biome check was substituted. CI lint pass status should be confirmed before merge. (PR description, Validation section)

bug-detection — ABSTAIN

Files: apps/desktop/package.json, apps/desktop/src/lib/trpc/routers/index.ts, apps/desktop/src/lib/trpc/routers/permissions.test.ts, apps/desktop/src/lib/trpc/routers/permissions/native-permissions.test.ts, apps/desktop/src/lib/trpc/routers/permissions/native-permissions.ts and 66 more
Reason: Strategy execution failed: [strategy-executor:bug-detection] Claude CLI timed out after 600s

audio-api-security — ABSTAIN (ad_hoc)

Files: apps/desktop/package.json, apps/desktop/src/lib/trpc/routers/index.ts, apps/desktop/src/lib/trpc/routers/permissions.test.ts, apps/desktop/src/lib/trpc/routers/permissions/native-permissions.test.ts, apps/desktop/src/lib/trpc/routers/permissions/native-permissions.ts and 66 more
Reason: Verdict parsing failed: schema validation failed: [
{
"code": "invalid_value",
"values": [
"pr_description",
"commit_message",
"pr_comment"
],
"path": [
"author_judgment_requests",
0,
"resolution_via",
1
],
"message": "Invalid option: expected one of "pr_description"|"commit_message"|"pr_comment""
},
{
"code": "invalid_value",
"values": [
"pr_description",
"commit_message",
"pr_comment"
],
"path": [
"author_judgment_requests",
1,
"resolution_via",
1
],
"message": "Invalid option: expected one of "pr_description"|"commit_message"|"pr_comment""
},
{
"code": "invalid_value",
"values": [
"pr_description",
"commit_message",
"pr_comment"
],
"path": [
"author_judgment_requests",
2,
"resolution_via",
1
],
"message": "Invalid option: expected one of "pr_description"|"commit_message"|"pr_comment""
},
{
"code": "invalid_value",
"values": [
"pr_description",
"commit_message",
"pr_comment"
],
"path": [
"author_judgment_requests",
3,
"resolution_via",
1
],
"message": "Invalid option: expected one of "pr_description"|"commit_message"|"pr_comment""
}
]

Token usage

| Model | Input | Output | Cache create | Cache read |
| --- | --- | --- | --- | --- |
| total | 18 | 14,327 | 96,772 | 194,467 |
| sonnet | 18 | 14,327 | 96,772 | 194,467 |

@jylee1-jp

RAI v2 — Author judgment pathway

2 outstanding requests across 2 categories.

Clarifications

  • The diff includes file navigator lag fix, Factory Droid preset, fs:events observability, and build-local-prod.sh changes that are not described in the PR body. Are these intentionally bundled with this voice control PR? If yes, please add a section covering Why/Scope for each bundled concern. If no, please split them into separate PRs so each can be reviewed and rolled back independently.
    From: preconditions
    Resolve via: pr_description, pr_comment
    Affects verdicts: preconditions

Resolutions

  • The Screen Recording section is marked TODO. Please attach the recording showing: (1) enabling Voice Control in General settings, (2) navigating via deep-link to the Keyboard shortcut row, (3) the Voice Control shortcut appearing below the search bar when searching 'voice', and (4) a dictation insertion into chat or terminal via press-and-hold.
    From: preconditions
    Resolve via: pr_description, pr_comment
    Affects verdicts: preconditions
