feat: Add Vosk STT plugin for offline speech-to-text #4466
Conversation
Hi @davidzhao 👋
📝 Walkthrough

This PR adds a new Vosk-based LiveKit Agents plugin (livekit-plugins-vosk) with packaging, optional dependency wiring, model management, a streaming STT implementation (with speaker diarization support), documentation, and automatic plugin registration.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant STT as STT Instance
    participant Stream as SpeechStream
    participant Audio as Audio Processing
    participant Recognizer as KaldiRecognizer
    participant Parser as Result Parser
    App->>STT: stream()
    activate STT
    STT->>Stream: create SpeechStream
    deactivate STT
    activate Stream
    loop Frame Flow
        App->>Stream: push_frame(AudioFrame)
        activate Audio
        Stream->>Audio: convert_frame_to_pcm16()
        Audio-->>Stream: pcm16 bytes
        Stream->>Audio: resample if needed
        Audio-->>Stream: resampled bytes
        deactivate Audio
        Stream->>Recognizer: accept_waveform(pcm_data)
        activate Recognizer
        Recognizer-->>Stream: JSON result
        deactivate Recognizer
        Stream->>Parser: parse_vosk_result(json)
        activate Parser
        Parser-->>Stream: SpeechEvent
        deactivate Parser
        Stream-->>App: emit SpeechEvent
    end
    App->>Stream: flush()
    Stream->>Recognizer: get_result()
    Recognizer-->>Stream: final JSON
    Stream->>Parser: parse_vosk_result(json, is_final=true)
    Parser-->>Stream: final SpeechEvent
    Stream-->>App: emit final SpeechEvent
    deactivate Stream
```
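To make the diagram concrete, here is a brief, hedged usage sketch of the streaming path. It assumes the plugin is importable as `livekit.plugins.vosk` (matching the package layout in this PR), that `vosk.STT` takes the constructor arguments quoted later in this review, and that the stream follows the standard livekit-agents STT interface (`push_frame`, `flush`, `end_input`, async iteration); the model path and the frame source are placeholders, not part of the PR.

```python
import asyncio

from livekit import rtc
from livekit.agents import stt
from livekit.plugins import vosk  # assumed import path per this PR's package layout


async def transcribe(frames: list[rtc.AudioFrame]) -> None:
    # The Vosk model directory must already be downloaded (placeholder path).
    vosk_stt = vosk.STT(model_path="/models/vosk-model-small-en-us-0.15")

    stream = vosk_stt.stream()

    async def feed() -> None:
        for frame in frames:
            stream.push_frame(frame)  # the "Frame Flow" loop in the diagram
        stream.flush()                # request a final result
        stream.end_input()            # no more audio; lets the stream close

    feed_task = asyncio.create_task(feed())

    async for event in stream:        # SpeechEvents emitted by the stream
        if event.type == stt.SpeechEventType.FINAL_TRANSCRIPT:
            print(event.alternatives[0].text)

    await feed_task
```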
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
📜 Recent review details — Configuration used: Organization UI | Review profile: CHILL | Plan: Pro
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
📚 Learning: 2026-01-16T07:44:56.353Z — Applied to files:
🔇 Additional comments (2)
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py`:
- Around line 51-52: The type annotation on validate_model_path uses the Python
3.10 union operator (str | Path) which breaks Python 3.9; update the function
signature for validate_model_path to use typing.Union[str, Path] and add the
corresponding import from typing (Union) at the top of the module so the
annotation is compatible with Python 3.9.
In `@livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py`:
- Around lines 410-428: In `_convert_audio_buffer_to_pcm16`, the current
resampling logic only uses the first element of AudioResampler.push()
(frames[0]), which drops any additional resampled frames; update the function to
iterate through all frames returned by resampler.push(merged_frame), concatenate
them (or merge/append into a single combined frame replacing merged_frame)
before calling _convert_frame_to_pcm16, ensuring no resampled audio is lost—use
the same approach as the earlier resampling handling that loops over frames to
build the complete merged_frame.
- Around line 82-129: The constructor (__init__) currently calls
validate_model_path for both acoustic models and speaker models, but Vosk
speaker models use a flat layout and will fail that strict check; modify
initialization to use a lighter validation for speaker models: either create a
new validate_speaker_model_path that only verifies that the provided
speaker_model_path exists and is a directory/file (or None), or conditionally
call a simpler check when setting self._speaker_model_path instead of
validate_model_path; update the assignment in __init__ (the line assigning
self._speaker_model_path and the STTOptions speaker_model_path handling) to use
the new lighter validator so diarization initialization no longer raises
FileNotFoundError for valid Vosk speaker model packages.
In `@livekit-plugins/livekit-plugins-vosk/README.md`:
- Line 23: Replace the bare URL in the README sentence "Vosk requires
pre-downloaded models. Download from: https://alphacephei.com/vosk/models" with
a Markdown link to satisfy MD034; edit that line so the URL is formatted as link
text (e.g., "Download the Vosk models" linking to
https://alphacephei.com/vosk/models) to remove the bare URL while preserving the
same destination and meaning.
- Line 168: Replace the bare URL "See https://alphacephei.com/vosk/models for
the complete list." in the supported-languages section of README.md with a
Markdown link (e.g., "See the VOSK models page
(https://alphacephei.com/vosk/models)" replaced by "See the VOSK models page:
[VOSK models](https://alphacephei.com/vosk/models)") to satisfy MD034; update
the exact line containing that sentence so the link is in Markdown form rather
than a raw URL.
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-vosk/pyproject.toml (1)
14-24: Missing Python version classifiers.

The classifiers only list Python 3.9 and 3.10, but the `requires-python = ">=3.9.0"` constraint and livekit-agents support suggest Python 3.11, 3.12, and 3.13 should also be listed for consistency with the parent project.

Suggested classifiers update

```diff
 classifiers = [
     "Intended Audience :: Developers",
     "License :: OSI Approved :: Apache Software License",
     "Topic :: Multimedia :: Sound/Audio",
     "Topic :: Multimedia :: Video",
     "Topic :: Scientific/Engineering :: Artificial Intelligence",
     "Programming Language :: Python :: 3",
     "Programming Language :: Python :: 3.9",
     "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
     "Programming Language :: Python :: 3 :: Only",
 ]
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (11)
- CONTRIBUTING.md
- livekit-agents/pyproject.toml
- livekit-plugins/livekit-plugins-vosk/README.md
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/__init__.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/log.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/py.typed
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/version.py
- livekit-plugins/livekit-plugins-vosk/pyproject.toml
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/log.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/version.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py
- livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/__init__.py
🧠 Learnings (2)
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Follow the Plugin System pattern where plugins in livekit-plugins/ are separate packages registered via the Plugin base class
Applied to files:
- livekit-plugins/livekit-plugins-vosk/pyproject.toml
- CONTRIBUTING.md
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns
Applied to files:
livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/__init__.py (1)
livekit-agents/livekit/agents/plugin.py (2)
- Plugin (13-56)
- register_plugin (31-36)
🪛 markdownlint-cli2 (0.18.1)
livekit-plugins/livekit-plugins-vosk/README.md
23-23: Bare URL used
(MD034, no-bare-urls)
168-168: Bare URL used
(MD034, no-bare-urls)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.13)
- GitHub Check: type-check (3.9)
- GitHub Check: unit-tests
🔇 Additional comments (18)
pyproject.toml (1)

55-55: LGTM! The workspace member addition follows the established pattern for plugin registration.

CONTRIBUTING.md (1)

96-97: LGTM! Adding the Vosk plugin to the mypy typechecking command ensures CI coverage for the new plugin.

livekit-agents/pyproject.toml (1)

111-111: Verify version alignment with other plugins. The Vosk optional dependency uses `>=1.3.10` while most other plugins in this file use `>=1.3.12`. Based on the PR commit message mentioning "update vosk plugin version to align with monorepo synchronization," verify this version is intentional or if it should be `>=1.3.12` for consistency.

livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/version.py (1)

1-15: LGTM! Standard version module following the established plugin pattern with proper license header.

livekit-plugins/livekit-plugins-vosk/pyproject.toml (1)

25-29: Verify Vosk version requirement. The PR description states `vosk>=0.3.50` as a dependency, but this file declares `vosk>=0.3.45`. Please verify which version is the intended minimum requirement to ensure documentation and code are aligned.

livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/log.py (1)

15-17: LGTM! Standard logging module setup following Python best practices with `__name__` for proper logger hierarchy.

livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py (1)

20-49: Model identifiers and default cache directory are well organized. Nice centralization of constants for reuse.

livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py (10)

48-69: Thread-safe model caching looks solid. Cache lock and lazy loading are straightforward.
71-78: STTOptions keeps configuration tidy. Clear defaults and focused surface area.
131-142: Prewarm path is clean and safe. Graceful logging on failure is fine for a best-effort warmup.
143-149: Model/provider accessors are clear.
151-204: Batch recognition flow is clean and offloaded appropriately.
205-215: Stream constructor wiring looks correct.
218-234: Stream initialization matches parent expectations.
235-387: Streaming loop and flush handling are well structured.
389-407: PCM16 conversion and downmixing are handled cleanly.
431-519: Result parsing maps Vosk output to SpeechEvent cleanly.

livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/__init__.py (1)

20-45: Plugin registration and exports align with the LiveKit plugin pattern.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
```python
def validate_model_path(model_path: str | Path) -> Path:
    """
```
🧩 Analysis chain
🌐 Web query:
Python 3.9 PEP 604 union operator str | Path support
💡 Result:
Python 3.9 does not support PEP 604’s union operator (X | Y) at runtime—it was introduced in Python 3.10. So str | Path is only supported on 3.10+. [1], [2], [3]
What to use on Python 3.9
Use typing.Union (or Optional when None is involved):
```python
from pathlib import Path
from typing import Union

StrOrPath = Union[str, Path]
```

Optional[T] is equivalent to Union[T, None] (and T | None in 3.10+). [4]
About from __future__ import annotations on 3.9
It can defer evaluation of annotations (so the code may import/define without immediately erroring), but Python 3.9 still can’t evaluate str | Path as a type union when annotations are later resolved (e.g., via typing.get_type_hints). The portable solution remains Union[str, Path]. [3], [4]
Sources: PEP 604 (Python-Version 3.10) [1]; Python 3.10 “What’s New” (PEP 604) [2]; typing spec note “Before Python 3.10 … did not support |” [3]; Python 3.9 typing.Union/Optional docs [4].
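A small, self-contained illustration of the point above (not taken from the PR): under `from __future__ import annotations`, the `str | Path` annotation parses on Python 3.9 but still fails once it is evaluated, while `Union[str, Path]` works everywhere.

```python
# Standalone demo of the 3.9 behaviour described above; nothing here is PR code.
from __future__ import annotations

import sys
import typing
from pathlib import Path
from typing import Union


def pep604_style(model_path: str | Path) -> Path:
    """Defines fine on 3.9 because the annotation stays a string."""
    return Path(model_path)


def union_style(model_path: Union[str, Path]) -> Path:
    """Portable across Python 3.9+."""
    return Path(model_path)


if __name__ == "__main__":
    print(typing.get_type_hints(union_style))  # works on every supported version

    if sys.version_info < (3, 10):
        try:
            typing.get_type_hints(pep604_style)
        except TypeError as exc:  # `|` between plain classes needs 3.10+
            print(f"PEP 604 annotation fails when evaluated on 3.9: {exc}")
    else:
        print(typing.get_type_hints(pep604_style))
```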
🏁 Script executed:

```shell
fd "models.py" --path "*/vosk/*" -x cat -n {}
```

Repository: livekit/agents
Length of output: 273

🏁 Script executed:

```shell
find . -name "models.py" -path "*/vosk/*" | head -5
```

Repository: livekit/agents
Length of output: 127

🏁 Script executed:

```shell
cat -n ./livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py | sed -n '45,60p'
```

Repository: livekit/agents
Length of output: 514

🏁 Script executed:

```shell
cat -n ./livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py | head -30
```

Repository: livekit/agents
Length of output: 1204
Use Union[str, Path] instead of str | Path for Python 3.9 compatibility.
PEP 604's union operator (|) was introduced in Python 3.10; Python 3.9 does not support this syntax at runtime. The coding guidelines require Python 3.9+ compatibility.
Suggested fix
```diff
-from pathlib import Path
+from pathlib import Path
+from typing import Union
@@
-def validate_model_path(model_path: str | Path) -> Path:
+def validate_model_path(model_path: Union[str, Path]) -> Path:
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
def validate_model_path(model_path: Union[str, Path]) -> Path:
    """
```
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py` around
lines 51 - 52, The type annotation on validate_model_path uses the Python 3.10
union operator (str | Path) which breaks Python 3.9; update the function
signature for validate_model_path to use typing.Union[str, Path] and add the
corresponding import from typing (Union) at the top of the module so the
annotation is compatible with Python 3.9.
```python
    def __init__(
        self,
        *,
        model_path: str,
        language: str = "en",
        sample_rate: int = 16000,
        enable_words: bool = True,
        max_alternatives: int = 0,
        speaker_model_path: str | None = None,
    ):
        """
        Create a new instance of Vosk STT.

        Args:
            model_path: Path to the Vosk model directory. Download models from
                https://alphacephei.com/vosk/models
            language: Language code for metadata (e.g., "en", "es", "fr")
            sample_rate: Audio sample rate in Hz. Vosk typically uses 16000.
            enable_words: Whether to include word-level timestamps in results
            max_alternatives: Number of alternative transcriptions to return (0 = disabled)
            speaker_model_path: Optional path to speaker identification model for diarization
        """
        # Validate model path exists
        self._model_path = validate_model_path(model_path)
        self._speaker_model_path = (
            validate_model_path(speaker_model_path) if speaker_model_path else None
        )

        super().__init__(
            capabilities=stt.STTCapabilities(
                streaming=True,
                interim_results=True,
                diarization=bool(speaker_model_path),
                aligned_transcript="word" if enable_words else False,
                offline_recognize=True,
            )
        )

        self._opts = STTOptions(
            model_path=str(self._model_path),
            sample_rate=sample_rate,
            language=language,
            enable_words=enable_words,
            max_alternatives=max_alternatives,
            speaker_model_path=str(self._speaker_model_path) if self._speaker_model_path else None,
        )

        self._label = f"vosk-{language}"
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Search for the validate_model_path function definition
find . -name "*.py" -type f | xargs grep -l "def validate_model_path" | head -5
```

Repository: livekit/agents
Length of output: 127

🏁 Script executed:

```shell
#!/bin/bash
# Look for imports and usage of validate_model_path in the STT file
grep -n "validate_model_path" livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py
```

Repository: livekit/agents
Length of output: 251

🏁 Script executed:

```shell
#!/bin/bash
# Find files related to validation or model handling
fd -e py . livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/ | head -20
```

Repository: livekit/agents
Length of output: 394

🏁 Script executed:

```shell
#!/bin/bash
# Read the models.py file to see validate_model_path implementation
cat -n livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/models.py | head -100
```

Repository: livekit/agents
Length of output: 3264
🌐 Web query:
Vosk speaker model vosk-model-spk-0.4 directory structure contents files
💡 Result:
vosk-model-spk-0.4 (speaker ID model) unzips to a single folder (often renamed to model-spk/) that contains no subdirectories—just these files at the top level:
```
model-spk/
    final.ext.raw
    mean.vec
    mfcc.conf
    README.txt
    transform.mat
```
This model is published as the “Speaker identification model” vosk-model-spk-0.4 on the official Vosk models page. [1] The above on-disk file set is shown directly in an example project’s model-spk folder listing. [2]
Sources: [1] [2]
Speaker model validation fails with validate_model_path.
The validate_model_path function enforces acoustic model structure (checking for am/final.mdl and conf/model.conf). However, Vosk speaker models (vosk-model-spk-0.4) have a flat directory structure with files like final.ext.raw, mean.vec, and mfcc.conf—no am/ or conf/ subdirectories. Passing a speaker model path to validate_model_path will raise FileNotFoundError at init, breaking diarization.
Implement a dedicated validation function for speaker models that only checks for path existence and directory status, or conditionally apply the lighter validation when speaker_model_path is provided.
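A hedged sketch of what that lighter check could look like (the function name `validate_speaker_model_path` and the exact error types are illustrative, not taken from the PR):

```python
from pathlib import Path
from typing import Union


def validate_speaker_model_path(model_path: Union[str, Path]) -> Path:
    """Validate a Vosk speaker model directory (flat layout, e.g. vosk-model-spk-0.4).

    Speaker models ship files such as final.ext.raw, mean.vec and mfcc.conf at the
    top level, with no am/ or conf/ subdirectories, so only existence and
    directory status are checked here.
    """
    path = Path(model_path).expanduser().resolve()
    if not path.exists():
        raise FileNotFoundError(f"Vosk speaker model path does not exist: {path}")
    if not path.is_dir():
        raise NotADirectoryError(f"Vosk speaker model path is not a directory: {path}")
    return path
```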
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py` around
lines 82 - 129, The constructor (__init__) currently calls validate_model_path
for both acoustic models and speaker models, but Vosk speaker models use a flat
layout and will fail that strict check; modify initialization to use a lighter
validation for speaker models: either create a new validate_speaker_model_path
that only verifies that the provided speaker_model_path exists and is a
directory/file (or None), or conditionally call a simpler check when setting
self._speaker_model_path instead of validate_model_path; update the assignment
in __init__ (the line assigning self._speaker_model_path and the STTOptions
speaker_model_path handling) to use the new lighter validator so diarization
initialization no longer raises FileNotFoundError for valid Vosk speaker model
packages.
```python
def _convert_audio_buffer_to_pcm16(buffer: AudioBuffer, target_sample_rate: int) -> bytes:
    """
    Convert AudioBuffer to PCM16 format for Vosk.
    """
    # Merge all frames in the buffer
    merged_frame = buffer.merge()  # type: ignore

    # Resample if needed
    if merged_frame.sample_rate != target_sample_rate:
        resampler = rtc.AudioResampler(
            merged_frame.sample_rate,
            target_sample_rate,
            quality=rtc.AudioResamplerQuality.HIGH,
        )
        frames = resampler.push(merged_frame)
        if frames:
            merged_frame = frames[0]

    return _convert_frame_to_pcm16(merged_frame)
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# First, find the file and understand the imports
find . -name "stt.py" -path "*/vosk/*" | head -5
```

Repository: livekit/agents
Length of output: 124

🏁 Script executed:

```shell
#!/bin/bash
# Search for AudioResampler definition and usage
rg -n "class AudioResampler|def push" -S -A 3 | head -50
```

Repository: livekit/agents
Length of output: 3884

🏁 Script executed:

```shell
#!/bin/bash
# Look for type stubs (.pyi files) that might define AudioResampler
fd -e pyi | xargs rg -l "AudioResampler" 2>/dev/null | head -10
```

Repository: livekit/agents
Length of output: 40

🏁 Script executed:

```shell
#!/bin/bash
# Check imports in the stt.py file to see where rtc comes from
head -50 livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py | rg -n "import|from"
```

Repository: livekit/agents
Length of output: 514

🏁 Script executed:

```shell
#!/bin/bash
# Search for AudioResampler usage throughout the codebase
rg "AudioResampler" -n -B 2 -A 5
```

Repository: livekit/agents
Length of output: 35142

🏁 Script executed:

```shell
#!/bin/bash
# Check if there are Python stubs or type hints in the codebase
find . -name "*.pyi" | xargs grep -l "AudioResampler" 2>/dev/null
```

Repository: livekit/agents
Length of output: 42

🏁 Script executed:

```shell
#!/bin/bash
# Look at the full context of the function in question
sed -n '410,428p' livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py
```

Repository: livekit/agents
Length of output: 706

🏁 Script executed:

```shell
#!/bin/bash
# Check if there are any other usages of resampler.push() pattern in codebase
rg "resampler\.push\(" -n -B 2 -A 5
```

Repository: livekit/agents
Length of output: 17680
Handle all frames returned by AudioResampler.push() to prevent audio loss.
AudioResampler.push() returns a list of frames. Using only frames[0] drops remaining resampled audio. The same file handles this correctly at lines 320-325 by iterating through all frames. Concatenate all resampled frames:
Suggested fix
```diff
-        frames = resampler.push(merged_frame)
-        if frames:
-            merged_frame = frames[0]
-
-    return _convert_frame_to_pcm16(merged_frame)
+        frames = resampler.push(merged_frame)
+        if frames:
+            return b"".join(_convert_frame_to_pcm16(f) for f in frames)
+
+    return _convert_frame_to_pcm16(merged_frame)
```

🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-vosk/livekit/plugins/vosk/stt.py` around
lines 410 - 428, In `_convert_audio_buffer_to_pcm16`, the current resampling
logic only uses the first element of AudioResampler.push() (frames[0]), which
drops any additional resampled frames; update the function to iterate through
all frames returned by resampler.push(merged_frame), concatenate them (or
merge/append into a single combined frame replacing merged_frame) before calling
_convert_frame_to_pcm16, ensuring no resampled audio is lost—use the same
approach as the earlier resampling handling that loops over frames to build the
complete merged_frame.
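Beyond the committable suggestion above, one hedged variant pulls the resampling into a helper and also drains the resampler before converting. `rtc`, `_convert_frame_to_pcm16`, and the HIGH-quality resampler arguments come from the file under review; the `flush()` call and the helper name `_resample_to_pcm16` are assumptions, not part of the PR.

```python
def _resample_to_pcm16(merged_frame: rtc.AudioFrame, target_sample_rate: int) -> bytes:
    """Sketch: convert a merged frame to PCM16, keeping every resampled frame."""
    if merged_frame.sample_rate == target_sample_rate:
        return _convert_frame_to_pcm16(merged_frame)

    resampler = rtc.AudioResampler(
        merged_frame.sample_rate,
        target_sample_rate,
        quality=rtc.AudioResamplerQuality.HIGH,
    )
    frames = resampler.push(merged_frame)
    # Assumption: the resampler exposes flush() to drain any buffered samples.
    frames += resampler.flush()
    return b"".join(_convert_frame_to_pcm16(f) for f in frames)
```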
```markdown
## Download Models

Vosk requires pre-downloaded models. Download from: https://alphacephei.com/vosk/models
```
Replace bare model-download URL with a Markdown link.
Markdownlint MD034 disallows bare URLs in prose.
♻️ Suggested fix
```diff
-Vosk requires pre-downloaded models. Download from: https://alphacephei.com/vosk/models
+Vosk requires pre-downloaded models. Download from: [Vosk models](https://alphacephei.com/vosk/models)
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```markdown
Vosk requires pre-downloaded models. Download from: [Vosk models](https://alphacephei.com/vosk/models)
```
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
23-23: Bare URL used
(MD034, no-bare-urls)
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-vosk/README.md` at line 23, Replace the bare
URL in the README sentence "Vosk requires pre-downloaded models. Download from:
https://alphacephei.com/vosk/models" with a Markdown link to satisfy MD034; edit
that line so the URL is formatted as link text (e.g., "Download the Vosk models"
linking to https://alphacephei.com/vosk/models) to remove the bare URL while
preserving the same destination and meaning.
```markdown
- Swedish
- And more...

See https://alphacephei.com/vosk/models for the complete list.
```
Replace bare URL with a Markdown link in the supported-languages section.
This keeps the README compliant with MD034.
♻️ Suggested fix
```diff
-See https://alphacephei.com/vosk/models for the complete list.
+See the [Vosk models list](https://alphacephei.com/vosk/models) for the complete list.
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```markdown
See the [Vosk models list](https://alphacephei.com/vosk/models) for the complete list.
```
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
168-168: Bare URL used
(MD034, no-bare-urls)
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-vosk/README.md` at line 168, Replace the bare
URL "See https://alphacephei.com/vosk/models for the complete list." in the
supported-languages section of README.md with a Markdown link (e.g., "See the
VOSK models page (https://alphacephei.com/vosk/models)" replaced by "See the
VOSK models page: [VOSK models](https://alphacephei.com/vosk/models)") to
satisfy MD034; update the exact line containing that sentence so the link is in
Markdown form rather than a raw URL.
This PR introduces livekit-plugins-vosk, a new offline Speech-to-Text (STT) plugin for the LiveKit Agents framework. This plugin enables fully local, privacy-focused speech recognition without relying on external API dependencies.
Dependencies

- vosk>=0.3.50
- numpy>=1.26

Supported Languages

Vosk supports 20+ languages:

See https://alphacephei.com/vosk/models for the complete list.
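For readers who want to try the plugin, a hedged quick-start sketch for batch recognition follows. The package would presumably be installed with `pip install livekit-plugins-vosk`; the import path follows this PR's layout, the `recognize()` call follows the generic livekit-agents STT interface, and the model path plus the silent audio frame are placeholders.

```python
import asyncio

from livekit import rtc
from livekit.plugins import vosk  # assumed import path per this PR's package layout


async def main() -> None:
    vosk_stt = vosk.STT(
        model_path="/models/vosk-model-small-en-us-0.15",  # placeholder path
        language="en",
        sample_rate=16000,
    )

    # 100 ms of silence standing in for real audio (16 kHz, mono, 16-bit PCM).
    frame = rtc.AudioFrame(
        data=b"\x00\x00" * 1600,
        sample_rate=16000,
        num_channels=1,
        samples_per_channel=1600,
    )

    event = await vosk_stt.recognize(buffer=[frame])
    print(event.alternatives[0].text if event.alternatives else "(no speech detected)")


if __name__ == "__main__":
    asyncio.run(main())
```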
Summary by CodeRabbit
New Features
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.