feat: add voice bridge -- talk to AI agents via speech #416

marcusquinn · 2026-02-07T03:22:49Z

Summary

Adds a standalone voice bridge that lets you talk to your AI coding agent via speech
~6-8 second round-trip on Apple Silicon: Mic → Silero VAD → Whisper MLX (1.4s) → OpenCode run --attach (4-6s) → Edge TTS (0.4s) → Speaker
Full tool execution capability -- the agent can edit files, run commands, create PRs, and confirm what it did, all by voice
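
The round-trip figure follows from the per-stage timings quoted above. A minimal sketch of that arithmetic, assuming VAD and playback overhead are negligible; the `STAGES` table and `round_trip` helper are illustrative names, not taken from voice-bridge.py:

```python
# Per-stage latency ranges in seconds, as quoted in the summary above.
# The dictionary itself is an illustration, not code from the PR.
STAGES = {
    "whisper-mlx STT": (1.4, 1.4),
    "opencode run --attach": (4.0, 6.0),
    "edge-tts": (0.4, 0.4),
}


def round_trip(stages):
    """Return (best, worst) total latency in seconds, rounded to 0.1 s."""
    best = sum(low for low, _ in stages.values())
    worst = sum(high for _, high in stages.values())
    return round(best, 1), round(worst, 1)


print(round_trip(STAGES))
```

The totals land at 5.8 s and 7.8 s, which is where the ~6-8 second claim comes from; the agent call dominates the budget.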

New Files

.agents/scripts/voice-bridge.py -- Main Python voice bridge (~800 lines)
.agents/scripts/voice-helper.sh -- Shell wrapper with talk/status/benchmark/devices/voices commands

Features

Swappable engines: whisper-mlx / faster-whisper for STT, edge-tts / macos-say / facebookMMS for TTS
Voice exit phrases: Substring matching catches natural endings ("that's all", "all for now", "goodbye", etc.)
STT sanity checking: LLM corrects transcription errors before acting (e.g. "test.txte" → "test.txt")
Natural confirmations: "Ok, I can do that" before tasks, "That's done, we've..." after completion
Session handback: Conversation transcript output on exit so the calling agent knows what happened
TTY detection: Graceful degradation when launched from AI tool's Bash (Esc key unavailable, voice exit still works)
OpenCode integration: Uses opencode run --attach to a running opencode serve for low-latency responses
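
The exit-phrase feature is described as substring matching. A minimal sketch of that idea, assuming the transcript is lower-cased before comparison; `EXIT_PHRASES` and `is_exit_phrase` are hypothetical names, and the phrase list contains only the three examples quoted above:

```python
# Hypothetical phrase list: only the examples named in the PR description.
EXIT_PHRASES = ("that's all", "all for now", "goodbye")


def is_exit_phrase(transcript, phrases=EXIT_PHRASES):
    """True if any exit phrase occurs anywhere in the normalized transcript."""
    # Normalize case and curly apostrophes, which STT output may contain.
    text = transcript.lower().replace("\u2019", "'").strip()
    return any(phrase in text for phrase in phrases)
```

Substring matching tolerates filler words around the phrase ("OK, that's all, thanks"), at the cost of possible false positives when a phrase appears inside an ordinary request.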

Usage

voice-helper.sh talk                    # Start voice conversation
voice-helper.sh talk whisper-mlx edge-tts en-GB-SoniaNeural  # Explicit config
voice-helper.sh benchmark               # Test component speeds

Updated Docs

README.md: New "Voice Bridge" section as primary voice feature, S2S pipeline moved to "Advanced"
AGENTS.md: Voice entry updated to reference voice bridge
subagent-index.toon: Added voice-bridge.py and voice-helper.sh entries
speech-to-speech.md: Added Voice Bridge section with quick start guide
.gitignore: Added mlx_models/ (runtime model cache)

Testing

Tested live in OpenCode TUI sessions:

Voice conversation with task execution (file creation, cleanup)
STT correction verified (transcription errors corrected by LLM)
Exit phrase detection verified (substring matching works with natural speech)
TTY guard verified (graceful degradation, no crash in subprocess mode)
Session handback transcript output verified on clean exit
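
The TTY guard can be illustrated with a small check. This is a sketch under the assumption that the bridge decides whether Esc-key handling is possible by testing whether stdin is a terminal; `keyboard_exit_available` is a hypothetical name:

```python
import io
import sys


def keyboard_exit_available(stream=None):
    """True when the stream is an interactive terminal that can deliver key presses."""
    stream = sys.stdin if stream is None else stream
    try:
        return stream.isatty()
    except (AttributeError, ValueError):
        # Missing or closed stdin: treat as non-interactive.
        return False


# A subprocess launched by an AI tool usually has no terminal on stdin;
# an in-memory stream stands in for that case here.
print(keyboard_exit_available(io.StringIO("")))
```

When the check returns False, the caller would skip key polling and rely on voice exit phrases alone.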

Summary by CodeRabbit

Release Notes

New Features
- Voice bridge for direct speech interaction with AI agents
- Swappable speech-to-text and text-to-speech engines with multiple options
- Voice exit phrases, transcript corrections, and session handback support
- ~6-8 second round-trip latency on Apple Silicon
Documentation
- Updated guides with voice bridge setup and usage examples
- Added feature tables and voice interaction workflows

Standalone voice interface that connects microphone to OpenCode via Silero VAD, Whisper MLX STT, and Edge TTS. ~6-8s round-trip on Apple Silicon with full tool execution capability.

Features:
- Swappable STT (whisper-mlx, faster-whisper) and TTS (edge-tts, macos-say, facebookMMS) engines
- Voice exit phrases with substring matching for natural endings
- STT sanity checking via LLM prompt (corrects transcription errors)
- Session handback transcript on exit for calling agent context
- TTY detection with graceful degradation in TUI subprocess
- Natural conversational confirmations for task execution

New files:
- .agents/scripts/voice-bridge.py (main Python bridge, ~800 lines)
- .agents/scripts/voice-helper.sh (shell wrapper with setup/talk/status/benchmark/voices/devices commands)

Updated: README.md, AGENTS.md, subagent-index.toon, speech-to-speech.md, .gitignore

gemini-code-assist · 2026-02-07T03:23:10Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the user interaction model by introducing a robust voice interface for AI coding agents. The primary purpose is to enable a more natural and efficient way for users to command and receive feedback from their AI, moving beyond traditional text-based interactions. This change aims to accelerate development workflows by allowing hands-free operation and real-time conversational guidance for complex coding tasks, ultimately making the AI agent more accessible and responsive.

Highlights

Voice Bridge Introduction: A new standalone voice bridge (voice-bridge.py and voice-helper.sh) has been added, enabling users to interact with AI coding agents via speech.
Performance: The system achieves a ~6-8 second round-trip on Apple Silicon, from microphone input to speaker output, facilitating fast conversational interaction.
Agent Capabilities: The AI agent gains full tool execution capability, allowing it to edit files, run commands, create PRs, and confirm actions, all controlled by voice.
Swappable Engines: The voice bridge supports swappable Speech-to-Text (STT) engines (e.g., whisper-mlx, faster-whisper) and Text-to-Speech (TTS) engines (e.g., edge-tts, macos-say, facebookMMS).
Enhanced Interaction: Features like voice exit phrases, LLM-based STT sanity checking (correcting transcription errors), session handback (transcript on exit), and graceful degradation for TTY detection improve user experience.
OpenCode Integration: The bridge integrates with OpenCode using opencode run --attach for low-latency responses, automatically starting opencode serve if not already running.

Changelog

.agents/AGENTS.md
- Updated the 'Voice' section to reference the new voice bridge (voice-helper.sh talk).
.agents/scripts/voice-bridge.py
- Added a new Python script implementing the core logic for the voice bridge, including VAD, STT, LLM integration, and TTS.
- Includes support for multiple STT (WhisperMLX, FasterWhisper, MacOSDictation) and TTS (EdgeTTS, MacOSSay, FacebookMMS) engines.
- Manages audio input/output, speech detection, transcription, LLM querying, and speech synthesis.
- Implements features like voice exit phrases, STT sanity checking, and session transcript handback.
.agents/scripts/voice-helper.sh
- Added a new shell script to simplify interaction with the voice bridge.
- Provides commands for starting voice conversations (talk), listing audio devices (devices), listing TTS voices (voices), checking component status (status), and benchmarking performance (benchmark).
- Includes dependency checks and automatic installation for Python packages and ensures the OpenCode server is running.
.agents/subagent-index.toon
- Updated the 'Voice AI' entry to include voice-bridge in its capabilities.
- Added new entries for voice-helper.sh and voice-bridge.py with their respective commands and descriptions.
.agents/tools/voice/speech-to-speech.md
- Introduced a new 'Voice Bridge (Recommended)' section, detailing its quick start, architecture, round-trip performance, and features.
- The existing speech-to-speech pipeline is now positioned as an 'Advanced' option for more complex use cases.
.gitignore
- Added mlx_models/ to the ignore list to prevent committing ML model caches downloaded at runtime.
README.md
- Refactored the 'Voice Integration' section into 'Voice Bridge - Talk to Your AI Agent', providing a prominent quick start guide and feature overview for the new voice bridge.
- The detailed information about the traditional Speech-to-Speech pipeline has been moved to an 'Advanced' section.

Activity

Initial implementation of the voice bridge feature, including core Python script and helper shell script.
Documentation updates across README.md, AGENTS.md, subagent-index.toon, and speech-to-speech.md to reflect the new voice bridge and its usage.
Testing was conducted live in OpenCode TUI sessions, verifying voice conversation with task execution, STT correction, exit phrase detection, TTY guard, and session handback transcript output.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

coderabbitai · 2026-02-07T03:23:14Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 12 seconds before requesting another review.


Walkthrough

This PR introduces a comprehensive Voice Bridge system enabling real-time voice interactions with AI agents. The implementation adds a Python orchestration layer with pluggable STT/TTS/VAD components, a shell script wrapper for environment setup and command-line access, and documentation updates explaining the new voice interface.

Changes

Cohort / File(s)	Summary
Voice Bridge Core Implementation `.agents/scripts/voice-bridge.py`, `.agents/scripts/voice-helper.sh`	New voice bridge system with Silero VAD, pluggable STT engines (Whisper MLX, Faster Whisper, macOS Dictation), TTS engines (Edge TTS, macOS Say, Facebook MMS), OpenCode LLM integration, multi-threaded audio capture, transcription, and playback coordination with CLI argument parsing and device enumeration.
Voice Bridge Documentation & Registry `.agents/AGENTS.md`, `.agents/subagent-index.toon`, `.agents/tools/voice/speech-to-speech.md`	Added voice bridge references and new documentation sections describing the simplified voice interaction architecture, command examples, latency estimates, and features; updated registry entries for new scripts and tools/voice description.
Project Configuration `.gitignore`, `README.md`	Added mlx_models/ directory to ignore patterns; comprehensively rewrote voice section of README with Voice Bridge as primary interface, updated Quick Start commands, added performance metrics (~6-8 second round-trip), and introduced feature tables alongside advanced pipeline documentation.

Sequence Diagram

sequenceDiagram
    participant User
    participant VoiceBridge as Voice Bridge
    participant VAD as Silero VAD
    participant STT as STT Engine
    participant OpenCode as OpenCode LLM
    participant TTS as TTS Engine
    participant Speaker

    User->>VoiceBridge: Start voice session
    VoiceBridge->>VoiceBridge: Initialize VAD, STT, TTS
    
    loop Voice Interaction Loop
        VoiceBridge->>VAD: Stream audio chunks
        VAD-->>VoiceBridge: Speech detected?
        
        alt Speech Detected
            VoiceBridge->>STT: Transcribe audio_int16
            STT-->>VoiceBridge: Transcript text
            VoiceBridge->>VoiceBridge: Accumulate transcript history
            
            VoiceBridge->>OpenCode: Query with transcript
            OpenCode-->>VoiceBridge: Response text
            
            VoiceBridge->>TTS: Synthesize response
            TTS-->>VoiceBridge: Audio stream ready
            
            VoiceBridge->>Speaker: Play audio output
            Speaker-->>VoiceBridge: Playback complete
        else Silence Detected
            VoiceBridge->>VoiceBridge: Check exit phrase / timeout
        end
    end
    
    User->>VoiceBridge: Exit (Esc/Ctrl+C)
    VoiceBridge-->>User: Emit transcript handback
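
The loop in the diagram can be sketched with the engines passed in as plain callables, which is one way to get the "swappable engine" behaviour. This is an illustration, not the structure of voice-bridge.py: `run_session` and its parameters are hypothetical, and it assumes the exit-phrase check runs on the transcribed text.

```python
from typing import Callable, Iterable, List, Tuple


def run_session(
    utterances: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    query_agent: Callable[[str], str],
    speak: Callable[[str], None],
    is_exit: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Run listen -> transcribe -> query -> speak and return the transcript."""
    transcript: List[Tuple[str, str]] = []
    for audio in utterances:  # each item is one VAD-delimited utterance
        text = transcribe(audio)
        if not text:
            continue  # nothing intelligible was said
        transcript.append(("user", text))
        if is_exit(text):
            break  # voice exit phrase ends the session
        reply = query_agent(text)
        transcript.append(("agent", reply))
        speak(reply)
    return transcript  # handed back to the caller on exit
```

Swapping an STT or TTS engine then means passing a different `transcribe` or `speak` callable; the loop itself does not change.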

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The Python voice bridge introduces 834 lines of heterogeneous logic spanning VAD detection, multiple STT/TTS engine implementations, OpenCode server communication, multi-threaded audio processing, and CLI orchestration. The shell script adds 423 lines of system-level dependency management, server lifecycle control, and command dispatching. The breadth of new public APIs (14+ functions/classes), integration density, and need to validate correct orchestration of multiple components across platform-specific implementations (macOS vs. general) requires careful review of each subsystem.

Poem

🎤 Speak your code dreams to the silicon,
A voice bridge sings 'tween human and machine—
Whisper, transcribe, reason, and respond,
Round-trip through latency's now-silent stream. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 37.25% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the main addition: a voice bridge feature enabling speech-based interaction with AI agents. It is concise, specific, and clearly reflects the primary change across all modified files.


github-actions · 2026-02-07T03:23:25Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 25 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 03:23:19 UTC 2026: Code review monitoring started
Sat Feb 7 03:23:20 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 25
Sat Feb 7 03:23:20 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 03:23:22 UTC 2026: Codacy analysis completed with auto-fixes
Sat Feb 7 03:23:23 UTC 2026: Applied 1 automatic fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 25
VULNERABILITIES: 0

Generated on: Sat Feb 7 03:23:24 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request introduces a significant new feature: a voice bridge to allow users to interact with AI agents via speech. The implementation includes a comprehensive Python script for the bridge logic, a helper shell script for user interaction, and updates to documentation.

The core Python script is well-structured with separate classes for different components like VAD, STT, and TTS. However, I've identified a few areas for improvement:

There's a critical issue with the SIGINT handler that prevents graceful shutdown.
The voice barge-in feature is currently non-functional due to the audio callback logic.
There are several instances of unused imports and dead code that should be cleaned up for better maintainability.
The benchmark command in the helper script uses a deprecated and insecure method for creating temporary files.

I've left specific comments with suggestions to address these points. Overall, this is a great addition, and with these fixes, it will be more robust and maintainable.

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-bridge.py

+        if self.is_speaking:
+            return


The voice barge-in feature appears to be non-functional. These lines cause the _audio_callback to return immediately during TTS playback, preventing any voice input from being processed. This makes voice-triggered barge-in impossible, and the associated logic (e.g., self.barge_in, _barge_in_frames) is effectively dead code. The comment on line 411 is also misleading, as using headphones won't enable barge-in with the current code.

To fix this, you would need a more sophisticated approach, likely involving acoustic echo cancellation (AEC). Given the complexity, I recommend either removing the non-functional barge-in code and comments or disabling this check and clearly documenting that it requires headphones and may have false positives.
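
A stripped-down model of the callback shows why the barge-in path cannot fire. `MicGate` is a hypothetical stand-in for the bridge's state, reduced to the early return quoted above:

```python
class MicGate:
    """Hypothetical stand-in for the bridge's audio callback state."""

    def __init__(self):
        self.is_speaking = False   # True while TTS audio is playing
        self.frames = []           # audio queued for VAD/STT
        self.barge_in_frames = []  # would hold speech heard during playback

    def audio_callback(self, chunk):
        if self.is_speaking:
            return  # early return: nothing below runs during TTS playback
        self.frames.append(chunk)

    def barge_in_detected(self):
        # Can only be True if something appended to barge_in_frames.
        return bool(self.barge_in_frames)
```

With `is_speaking` set, every chunk is dropped before any buffer is touched, so a check on `barge_in_frames` stays False for the whole of playback.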

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-bridge.py

+        return
+
+    # Handle Ctrl+C gracefully
+    signal.signal(signal.SIGINT, lambda s, f: sys.exit(0))


The custom SIGINT handler calls sys.exit(0), which terminates the process immediately. This will prevent the finally block in the run method from executing, so the graceful shutdown logic, including _print_handback(), will be skipped. This can lead to loss of the session transcript. Removing this line will allow the default KeyboardInterrupt to be raised, which is handled correctly in the run method.
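
The shutdown pattern this comment recommends can be sketched as follows, assuming no custom SIGINT handler is installed so that Ctrl+C surfaces as `KeyboardInterrupt`. The `run` signature here is hypothetical; in the PR, `run` is a method and the handback function is `_print_handback()`.

```python
def run(loop_step, print_handback):
    """Run until interrupted; emit the handback transcript however the loop ends."""
    try:
        while True:
            loop_step()
    except KeyboardInterrupt:
        pass  # Ctrl+C arrives as KeyboardInterrupt by default
    finally:
        print_handback()  # session transcript for the calling agent
```

Keeping the cleanup in one `finally` block means there is a single exit path to reason about, rather than one in the signal handler and another in the loop.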

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-helper.sh

+# edge-tts
+try:
+    import asyncio, edge_tts, tempfile, os
+    async def t():
+        c = edge_tts.Communicate(text, 'en-US-GuyNeural')
+        f = tempfile.mktemp(suffix='.mp3')
+        await c.save(f)
+        os.unlink(f)
+    start = time.time()
+    asyncio.run(t())
+    print(f'  edge-tts:        {time.time()-start:.3f}s')
+except Exception as e:
+    print(f'  edge-tts:        FAILED ({e})')


The Python code for the edge-tts benchmark uses tempfile.mktemp, which is insecure and has been deprecated. It's vulnerable to a race condition where another process could create a file with the same name between the time mktemp returns the name and your script attempts to use it. You should use tempfile.NamedTemporaryFile within a with block to create temporary files securely.

Suggested change

-# edge-tts
-try:
-    import asyncio, edge_tts, tempfile, os
-    async def t():
-        c = edge_tts.Communicate(text, 'en-US-GuyNeural')
-        f = tempfile.mktemp(suffix='.mp3')
-        await c.save(f)
-        os.unlink(f)
-    start = time.time()
-    asyncio.run(t())
-    print(f'  edge-tts:        {time.time()-start:.3f}s')
-except Exception as e:
-    print(f'  edge-tts:        FAILED ({e})')
+# edge-tts
+try:
+    import asyncio, edge_tts, tempfile
+    async def t():
+        c = edge_tts.Communicate(text, 'en-US-GuyNeural')
+        with tempfile.NamedTemporaryFile(suffix='.mp3', delete=True) as f:
+            await c.save(f.name)
+    start = time.time()
+    asyncio.run(t())
+    print(f'  edge-tts:        {time.time()-start:.3f}s')
+except Exception as e:
+    print(f'  edge-tts:        FAILED ({e})')

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-helper.sh

+# macos say
+try:
+    import subprocess, tempfile, os
+    f = tempfile.mktemp(suffix='.aiff')
+    start = time.time()
+    subprocess.run(['say', '-v', 'Samantha', '-o', f, text], check=True, capture_output=True)
+    print(f'  macos-say:       {time.time()-start:.3f}s')
+    os.unlink(f)
+except Exception as e:
+    print(f'  macos-say:       FAILED ({e})')


The Python code for the macos-say benchmark uses tempfile.mktemp, which is insecure and has been deprecated due to race condition vulnerabilities. Please use tempfile.NamedTemporaryFile to ensure temporary files are created securely.

Suggested change

-# macos say
-try:
-    import subprocess, tempfile, os
-    f = tempfile.mktemp(suffix='.aiff')
-    start = time.time()
-    subprocess.run(['say', '-v', 'Samantha', '-o', f, text], check=True, capture_output=True)
-    print(f'  macos-say:       {time.time()-start:.3f}s')
-    os.unlink(f)
-except Exception as e:
-    print(f'  macos-say:       FAILED ({e})')
+# macos say
+try:
+    import subprocess, tempfile
+    with tempfile.NamedTemporaryFile(suffix='.aiff', delete=True) as f:
+        start = time.time()
+        subprocess.run(['say', '-v', 'Samantha', '-o', f.name, text], check=True, capture_output=True)
+        print(f'  macos-say:       {time.time()-start:.3f}s')
+except Exception as e:
+    print(f'  macos-say:       FAILED ({e})')
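
For comparison, a different way to get guaranteed cleanup when an external command writes the file is to give it a path inside a `tempfile.TemporaryDirectory`. This is an alternative sketch, not the change suggested above; `render_to_temp` is a hypothetical helper, and a small Python one-liner stands in for `say` so the example runs on any platform.

```python
import os
import subprocess
import sys
import tempfile


def render_to_temp(command_for, suffix):
    """Run a command that writes to a temp path; return (file_was_written, path)."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "out" + suffix)
        # On macOS the command could be the PR's:
        #   ['say', '-v', 'Samantha', '-o', out_path, text]
        subprocess.run(command_for(out_path), check=True, capture_output=True)
        written = os.path.exists(out_path)
    # The directory and its contents are removed here, even if run() raised.
    return written, out_path


def fake_tts(path):
    """Stand-in command: create an empty file at the given path."""
    return [sys.executable, "-c", "import sys; open(sys.argv[1], 'wb').close()", path]
```

Because the directory is private to the process and removed by the context manager, there is no predictable file name to race on and no `os.unlink` call to skip on failure.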

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-bridge.py

+import io
+import json
+import logging
+import os
+import signal
+import subprocess
+import sys
+import tempfile
+import threading
+import time
+import wave
+from collections import deque
+from pathlib import Path
+from queue import Empty, Queue


Several modules are imported but not used in the file: io, json, wave, pathlib.Path, queue.Empty, and queue.Queue. Removing these unused imports will improve code cleanliness and reduce the script's memory footprint.
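
A rough way to reproduce this kind of finding is to compare imported names against the names a module actually references. The sketch below uses the standard `ast` module; `unused_imports` is a hypothetical helper and deliberately simplified (it ignores `__all__`, star imports, and names used only inside strings), so a real linter remains the better tool.

```python
import ast


def unused_imports(source):
    """Return imported names that are never referenced as a bare name."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import x as y` binds `y`.
                imported.add(alias.asname or alias.name.split(".")[0])
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


sample = "import io\nimport os\nfrom queue import Empty, Queue\nprint(os.getcwd())\n"
print(unused_imports(sample))
```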

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-bridge.py

+    def _start_server(self):
+        """Start opencode serve in background."""
+        log.info(f"Starting opencode serve on port {self.server_port}...")
+        proc = subprocess.Popen(
+            ["opencode", "serve", "--port", str(self.server_port)],
+            stdout=subprocess.DEVNULL,
+            stderr=subprocess.DEVNULL,
+            cwd=self.cwd,
+        )
+        # Wait for server to be ready
+        for _ in range(20):
+            time.sleep(0.5)
+            try:
+                import urllib.request
+
+                req = urllib.request.Request(self.server_url, method="HEAD")
+                urllib.request.urlopen(req, timeout=1)
+                self.use_attach = True
+                log.info("OpenCode server started")
+                return proc
+            except Exception:
+                continue
+        log.warning("OpenCode server failed to start, using standalone mode")
+        return proc


The _start_server method is defined but never called, making it dead code. If it's not intended for future use, it should be removed to improve maintainability.

gemini-code-assist · 2026-02-07T03:25:38Z

.agents/scripts/voice-bridge.py

+            for line in lines:
+                stripped = line.strip()
+                if stripped.startswith("> Build+"):
+                    continue
+                if stripped.startswith("$") and "aidevops" in stripped:
+                    continue
+                if stripped.startswith("aidevops v"):
+                    continue
+                if not stripped:
+                    continue
+                clean_lines.append(stripped)
+            response = " ".join(clean_lines)


The response from opencode is cleaned by stripping specific hardcoded prefixes. This approach is fragile and might break if the output format of opencode changes. Consider checking if opencode can provide a more structured output format (e.g., JSON) to make parsing more robust. If not, adding a comment to explain why this manual parsing is necessary would be helpful for future maintenance.
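
If the manual parsing stays, isolating it in one small function with the noise patterns named up front makes the fragility easier to see and to test. A sketch based on the loop quoted above; `NOISE_PREFIXES` and `clean_response` are hypothetical names:

```python
# Prefixes taken from the PR's cleanup loop; they depend on opencode's
# current output format and may need updating if that format changes.
NOISE_PREFIXES = ("> Build+", "aidevops v")


def clean_response(raw):
    """Drop banner and command-echo lines, then join the rest for TTS."""
    kept = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(NOISE_PREFIXES):
            continue
        if stripped.startswith("$") and "aidevops" in stripped:
            continue  # echoed shell command
        kept.append(stripped)
    return " ".join(kept)
```

A unit test that feeds it a captured sample of real output would flag a format change before it reaches the speaker.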

- Remove unused imports (io, json, wave, Path, Empty, Queue, signal)
- Remove dead _start_server method (server started by voice-helper.sh)
- Remove SIGINT handler that prevented graceful shutdown with handback
- Remove non-functional barge-in code and BARGE_IN_FRAMES constant
- Fix tempfile.mktemp (deprecated/insecure) to NamedTemporaryFile in benchmark
- Add comment about fragile opencode output parsing

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.agents/subagent-index.toon (1)
81-122: ⚠️ Potential issue | 🟡 Minor

TOON header count is stale: scripts[39] but there are 41 entries.

After adding voice-helper.sh and voice-bridge.py, the actual script count in lines 82–122 is 41, not 39. Update the header to match.
-<!--TOON:scripts[39]{name,purpose}:
+<!--TOON:scripts[41]{name,purpose}:
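
A stale count like this can be caught mechanically. The sketch below rests on assumptions about the file layout that the PR does not confirm: each block starts with a header of the form shown in the diff, holds one entry per non-empty line, and ends at a line containing only `-->`. `check_toon_counts` is a hypothetical helper.

```python
import re

# Matches headers such as: <!--TOON:scripts[39]{name,purpose}:
HEADER = re.compile(r"<!--TOON:(\w+)\[(\d+)\]\{[^}]*\}:")


def check_toon_counts(text):
    """Return (block_name, declared, actual) for each block whose count is off."""
    lines = text.splitlines()
    mismatches = []
    i = 0
    while i < len(lines):
        match = HEADER.match(lines[i].strip())
        i += 1
        if not match:
            continue
        actual = 0
        # Assumed layout: entries run until a closing "-->" line.
        while i < len(lines) and lines[i].strip() != "-->":
            if lines[i].strip():
                actual += 1
            i += 1
        declared = int(match.group(2))
        if declared != actual:
            mismatches.append((match.group(1), declared, actual))
    return mismatches
```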

🤖 Fix all issues with AI agents

In @.agents/scripts/voice-bridge.py:
- Around line 17-35: Remove the unused imports flagged by static analysis:
delete io, json, wave, Path (from pathlib), Empty, and Queue from the
top-of-file import block in .agents/scripts/voice-bridge.py; update the import
lines that currently include these symbols (e.g., remove Path from "from pathlib
import Path" and remove "Empty, Queue" from "from queue import Empty, Queue")
and run tests/lint to ensure nothing else in the file references these symbols
before committing.
- Around line 283-306: Delete the dead helper method _start_server from the
OpenCodeBridge class: it's never invoked (OpenCodeBridge.__init__ only calls
_check_server() and server startup is handled by
voice-helper.sh:ensure_opencode_server), so remove the entire _start_server
method definition to eliminate unused code and related imports/variables if they
become unused after its removal.

In @.agents/scripts/voice-helper.sh:
- Around line 265-343: The cmd_benchmark function leaks temp files when the
macOS "say" subprocess raises because tempfile.mktemp() is used and os.unlink(f)
is skipped; replace the tempfile.mktemp() usage and ensure the file is always
removed by wrapping the subprocess.run(['say', ...]) call in a try/finally that
unlinks the temp path (or switch to tempfile.NamedTemporaryFile(delete=True) and
write/read via its .name), updating the macos-say block so that the temporary
file is always cleaned up even on failure.

In @.agents/tools/voice/speech-to-speech.md:
- Around line 289-313: The phrase "The full S2S pipeline below is for advanced
use cases" in the "Voice Bridge (Recommended)" section is incorrect; update that
sentence to reference "above" instead of "below" or move the entire "Voice
Bridge (Recommended)" section so it appears before the S2S pipeline;
specifically edit the sentence that reads "The full S2S pipeline below is for
advanced use cases" in the Voice Bridge header block so it either says "above"
or reflow the document so the S2S pipeline content appears after the Voice
Bridge section.

🧹 Nitpick comments (3)

.agents/scripts/voice-helper.sh (1)
127-153: ensure_opencode_server silently succeeds on timeout — downstream will run without a server.

When the server fails to start within the 20-attempt window, the function logs a warning but returns 0. cmd_talk then proceeds assuming the server is available, and the Python bridge will fall back to standalone mode — so this is technically handled downstream. However, the warning message "continuing anyway..." could be clearer that it's falling back to cold-start mode, which is significantly slower (~30s vs ~6s).
Suggested clarity improvement
-    print_warning "OpenCode server slow to start, continuing anyway..."
+    print_warning "OpenCode server not ready after 10s — voice bridge will use cold-start mode (~30s per query)"
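
The underlying point is that the caller should be able to tell "ready" from "timed out". A sketch of a polling helper that reports the outcome, written in Python for illustration even though `ensure_opencode_server` is a shell function; `wait_until_ready` and its `probe` parameter are hypothetical:

```python
import time


def wait_until_ready(probe, attempts=20, delay=0.5, sleep=time.sleep):
    """Poll `probe()` up to `attempts` times; return True once it succeeds."""
    for _ in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused or timed out: server not up yet
        sleep(delay)
    return False  # caller decides: fall back to cold-start mode or abort
```

With the defaults this waits up to 10 seconds (20 attempts at 0.5 s), matching the window in the warning text above, and the boolean lets the caller log which mode it is about to use.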
.agents/scripts/voice-bridge.py (2)
140-181: EdgeTTS speak(): the NamedTemporaryFile pattern is safe but the noqa comment is a no-op.

Line 144's # noqa: F401 suppresses a rule that isn't enabled in the project's Ruff config (confirmed by Ruff's RUF100 warning). The import itself is intentionally used as a fail-fast availability check — that's fine, but the noqa directive should be removed.

The temp file handling (lines 161–179) is correct: the with block closes the file handle before communicate.save() writes to it, and os.unlink in the finally block ensures cleanup.
Remove stale noqa
-        import edge_tts  # noqa: F401 - verify import
+        import edge_tts  # verify availability at init time
522-545: Barge-in code is effectively dead — mic is muted during TTS playback.

Lines 409–413 return early from _audio_callback when self.is_speaking is True, so _barge_in_frames is never populated. The check at line 539 (if self.barge_in and self._barge_in_frames) will always be False.

The comment at lines 409–411 explains this is intentional (no echo cancellation → must mute mic), but the remaining barge-in infrastructure at lines 522–545 and the instance variables at lines 398–400 are dead code. Consider either removing the barge-in scaffolding or adding a # TODO: enable when echo cancellation is available comment so future maintainers know it's deliberately dormant.


github-actions · 2026-02-07T03:29:07Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 25 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 03:29:00 UTC 2026: Code review monitoring started
Sat Feb 7 03:29:00 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 25
Sat Feb 7 03:29:00 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 03:29:02 UTC 2026: Codacy analysis completed with auto-fixes
Sat Feb 7 03:29:03 UTC 2026: Applied 1 automatic fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 25
VULNERABILITIES: 0

Generated on: Sat Feb 7 03:29:05 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

- TOON scripts[39] → scripts[41] after adding voice-bridge.py and voice-helper.sh
- Fix 'below' → 'above' in speech-to-speech.md Voice Bridge section

github-actions · 2026-02-07T03:48:11Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 25 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 03:48:04 UTC 2026: Code review monitoring started
Sat Feb 7 03:48:05 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 25
Sat Feb 7 03:48:05 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 03:48:07 UTC 2026: Codacy analysis completed with auto-fixes
Sat Feb 7 03:48:08 UTC 2026: Applied 1 automatic fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 25
VULNERABILITIES: 0

Generated on: Sat Feb 7 03:48:10 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-07T03:48:32Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

gemini-code-assist bot reviewed Feb 7, 2026


coderabbitai bot requested changes Feb 7, 2026


fix: update TOON script count and fix directional reference in docs

7e13395


coderabbitai bot approved these changes Feb 7, 2026


marcusquinn merged commit e79c828 into main Feb 7, 2026
9 of 11 checks passed

coderabbitai bot mentioned this pull request Feb 8, 2026

feat: add hyprwhspr speech-to-text subagent for Linux (t027) #575

Merged
