feat(speechmatics): add max_speakers parameter for speaker diarization #3524

nsepehr · 2025-09-29T06:02:47Z

Summary

This PR adds support for the max_speakers parameter to the Speechmatics STT plugin, allowing developers to limit the number of unique speakers detected during diarization.

Problem

Currently, the Speechmatics STT plugin offers no way to set the maximum number of speakers when diarization is enabled. The deprecated transcription_config parameter accepts a speaker_diarization_config with max_speakers, but the plugin drops that value when it processes the configuration.

Solution

- Added max_speakers as a direct parameter to the STT __init__ method
- Updated the STTOptions dataclass to include the max_speakers field
- Modified _process_config to include max_speakers in the speaker_diarization_config when sending to the Speechmatics API
- Added handling for extracting max_speakers from the deprecated transcription_config parameter for backward compatibility
- Updated documentation to explain the new parameter
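
The flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: SketchSTTOptions, build_diarization_config and max_speakers_from_deprecated are hypothetical names, and the deprecated transcription_config is modelled here as a plain dict for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchSTTOptions:
    """Hypothetical stand-in for the plugin's STTOptions dataclass."""

    enable_diarization: bool = False
    diarization_sensitivity: float = 0.5
    max_speakers: Optional[int] = None  # None means "do not send a limit"


def build_diarization_config(opts: SketchSTTOptions) -> dict[str, Any]:
    """Build a speaker_diarization_config payload, omitting unset fields."""
    if not opts.enable_diarization:
        return {}
    config: dict[str, Any] = {"speaker_sensitivity": opts.diarization_sensitivity}
    if opts.max_speakers is not None:
        config["max_speakers"] = opts.max_speakers
    return config


def max_speakers_from_deprecated(transcription_config: dict[str, Any]) -> Optional[int]:
    """Read max_speakers from a deprecated-style config (assumed dict shape)."""
    diarization = transcription_config.get("speaker_diarization_config") or {}
    return diarization.get("max_speakers")
```

Leaving max_speakers out of the payload when it is None keeps the request identical to what was sent before the parameter existed, which is one way to keep the change backward compatible.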

Use Case

This parameter is particularly useful for scenarios where the number of participants is known in advance, such as:

- Two-person interviews or conversations
- Small group discussions with a fixed number of participants
- Customer service calls (agent and customer)
- Educational settings with known speaker counts

Testing

- Tested locally with a multi-speaker agent implementation
- Verified that the parameter is correctly passed to the Speechmatics API configuration
- Confirmed backward compatibility with the deprecated transcription_config parameter

Example Usage

from livekit.plugins import speechmatics

stt = speechmatics.STT(
    language="en",
    enable_diarization=True,
    max_speakers=2,  # cap diarization at two unique speakers
    diarization_sensitivity=0.5,
    speaker_active_format="@[{speaker_id}]: {text}",
)

Breaking Changes

None - this is a backward-compatible addition.

CLAassistant · 2025-09-29T06:02:54Z

All committers have signed the CLA.

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

- Added max_speakers parameter to STT __init__ method
- Updated STTOptions dataclass to include max_speakers field
- Modified _process_config to include max_speakers in speaker_diarization_config
- Added handling for extracting max_speakers from deprecated transcription_config
- Updated documentation to explain the new parameter
- Fixed compatibility with livekit-agents 1.2.6 (removed diarization from STTCapabilities)
- Updated minimum livekit-agents version to 1.2.6

This parameter allows limiting the number of unique speakers detected during diarization, which is useful for scenarios with a known number of participants (e.g., 2-person interviews, small group meetings with fixed participants).

…peechmatics/stt.py Co-authored-by: Long Chen <[email protected]>

Refactored _process_config to build all configuration parameters upfront and pass them to TranscriptionConfig constructor, rather than creating an instance and mutating it afterward. This addresses review feedback to avoid assigning dict values directly to dataclass fields and instead use proper dataclass initialization patterns. Also applied ruff formatting fixes.
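
The "build everything upfront, then construct" pattern that this commit message describes might look like the sketch below. SketchTranscriptionConfig and make_config are hypothetical stand-ins written for illustration; the real TranscriptionConfig from the Speechmatics SDK has more fields and may differ in detail.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchTranscriptionConfig:
    """Hypothetical stand-in for the SDK's TranscriptionConfig."""

    language: str = "en"
    diarization: Optional[str] = None
    speaker_diarization_config: Optional[dict[str, Any]] = None


def make_config(
    language: str, diarize: bool, max_speakers: Optional[int]
) -> SketchTranscriptionConfig:
    # Collect every constructor argument first...
    kwargs: dict[str, Any] = {"language": language}
    if diarize:
        diarization_config: dict[str, Any] = {}
        if max_speakers is not None:
            diarization_config["max_speakers"] = max_speakers
        kwargs["diarization"] = "speaker"
        kwargs["speaker_diarization_config"] = diarization_config
    # ...then create the instance once, instead of mutating it afterward.
    return SketchTranscriptionConfig(**kwargs)
```

Constructing the object in one step means an unknown or misspelled field fails immediately with a TypeError, whereas assigning attributes after construction would silently accept it.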

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

- Simplify TranscriptionConfig initialization to use direct mutation
- Add mypy configuration for speechmatics module
- Fix type incompatibility issues with deprecated parameters
- Add type: ignore comments for untyped imports and decorators
- Remove unnecessary type: ignore comments where types are properly handled

Incorporates essential type checking improvements while maintaining max_speakers:

- Import and use SpeakerDiarizationConfig dataclass instead of dict
- Fix additional_vocab to use dict format as per type annotation
- Improve handling of deprecated transcription_config parameter
- Add proper type conversion for AudioEncoding
- Simplify import statement in utils.py
- Apply ruff formatting

nsepehr

Updated with requested changes and merged with #3599

longcw · 2025-10-08T06:23:14Z

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

+                speaker_sensitivity=self._stt_options.diarization_sensitivity,
+                prefer_current_speaker=self._stt_options.prefer_current_speaker,
+                # TODO: speakers field is not supported by SpeakerDiarizationConfig yet
+                # speakers={s.label: s.speaker_identifiers for s in self._stt_options.known_speakers},


Can you use the dict for now, with a type: ignore to skip the type check?

The SpeakerDiarizationConfig IS a dataclass provided by the Speechmatics SDK. We're using it correctly here. Whether you pass the dataclass or a dict, they serialize to the same JSON structure. The dataclass approach is cleaner and type-safe. I don't think we should change that.

I think we should keep supporting the speakers option by using a dict for now.

OK, updated it to use a dict. Can you review it now, please?
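
The trade-off discussed in this thread can be illustrated with a small sketch. SketchSpeakerDiarizationConfig and to_payload are hypothetical and written only for this example; the real SDK class and its serialization may differ. The speaker label and identifier values are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional, Union


@dataclass
class SketchSpeakerDiarizationConfig:
    """Hypothetical stand-in; like the SDK class in the thread, it has no speakers field."""

    max_speakers: Optional[int] = None
    speaker_sensitivity: Optional[float] = None
    prefer_current_speaker: Optional[bool] = None


def to_payload(config: Union[SketchSpeakerDiarizationConfig, dict[str, Any]]) -> str:
    """Serialize either form to JSON, dropping fields that are unset (None)."""
    raw = config if isinstance(config, dict) else asdict(config)
    return json.dumps({k: v for k, v in raw.items() if v is not None}, sort_keys=True)


typed = SketchSpeakerDiarizationConfig(max_speakers=2, speaker_sensitivity=0.5)
untyped = {"max_speakers": 2, "speaker_sensitivity": 0.5}

# A dict can carry a field the typed class does not declare yet.
with_speakers = {**untyped, "speakers": {"speaker_a": ["placeholder-identifier"]}}
```

In this sketch both forms produce the same JSON for the shared fields, while only the dict can carry the extra speakers entry. That is the reason the thread settles on a dict with a type: ignore until the typed class gains the field.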

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

The Speechmatics API expects additional_vocab as a list of objects with 'content' and 'sounds_like' fields, not a dict. Updated to match API requirements with type: ignore since SDK type annotation is misleading.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

nsepehr

Updated additional_vocab to match the format the API expects.
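
As a rough illustration of the list-of-objects shape mentioned above, the sketch below converts a simple mapping into entries with 'content' and an optional 'sounds_like' list. The helper name to_additional_vocab and its input shape are assumptions made for this example, not part of the plugin.

```python
from typing import Any, Optional


def to_additional_vocab(
    entries: dict[str, Optional[list[str]]],
) -> list[dict[str, Any]]:
    """Convert {term: sounds_like or None} into a list of vocab objects."""
    vocab: list[dict[str, Any]] = []
    for content, sounds_like in entries.items():
        item: dict[str, Any] = {"content": content}
        if sounds_like:  # omit the key when there are no alternatives
            item["sounds_like"] = list(sounds_like)
        vocab.append(item)
    return vocab


vocab = to_additional_vocab({"LiveKit": ["live kit"], "Speechmatics": None})
```

Each term becomes its own object, so terms without pronunciation hints simply carry 'content' alone.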

nsepehr · 2025-10-10T06:07:42Z

@longcw can you please review this so it can be landed for the next release 🙏

- Replace typed SpeakerDiarizationConfig with dict + type: ignore
- Add support for the speakers field from known_speakers
- Remove unused SpeakerDiarizationConfig import
- Maintain backward compatibility while allowing API evolution

livekit#3524) Co-authored-by: Long Chen <[email protected]> Co-authored-by: Claude <[email protected]>

* staging: (114 commits)
  Added min_confidence_threshold for deepgram flux.
  livekit-agents 1.2.15 (livekit#3658)
  fix livekit#3650 cartesia version backward compatibility (livekit#3651)
  Unprompted STT Reconnection at startup (livekit#3649)
  enable zero retention mode in elevenlabs (livekit#3647)
  fix: heartbeat (livekit#3648)
  feat: Integrate streaming endpoints for Sarvam APIs (livekit#3498)
  turn_detection: reduce max_endpointing_delay to 3s (livekit#3640)
  fix: exclude temperature parameter for gpt-5 and similar models (livekit#3573)
  add backwards compatibility for google's realtime model (livekit#3630)
  Align Google STT plugin with official documentation (livekit#3628)
  feat(speechmatics): add max_speakers parameter for speaker diarization (livekit#3524)
  fix(deepgram): send CloseStream message before closing TTS WebSocket (livekit#3608)
  chore: Remove duplicate docstring for `preemptive_generation` parameter in AgentSession (livekit#3624)
  Add RTZR(ReturnZero) STT Plugin for LiveKit Agents (livekit#3376)
  feat(telemetry/utils): add ttft reporting to LangFuse (livekit#3594)
  catch delete_room errors and disable delete_room_on_close by default (livekit#3600)
  lift google realtime api out of beta (livekit#3614)
  fix: lock pyav to <16 due to build issue (livekit#3593)
  Updating Cartesia Version (livekit#3570)
  ...

nsepehr force-pushed the feat/speechmatics-max-speakers branch 2 times, most recently from a3a7974 to c395ae5 Compare September 29, 2025 06:18

longcw reviewed Sep 29, 2025

nsepehr force-pushed the feat/speechmatics-max-speakers branch from c395ae5 to 400f516 Compare September 30, 2025 05:25

nsepehr force-pushed the feat/speechmatics-max-speakers branch from 400f516 to 8400a67 Compare September 30, 2025 05:32

nsepehr and others added 3 commits October 1, 2025 17:29

Update livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/s…

04ae148

…peechmatics/stt.py Co-authored-by: Long Chen <[email protected]>

Merge branch 'main' into feat/speechmatics-max-speakers

4b66311

nsepehr commented Oct 5, 2025

longcw reviewed Oct 6, 2025

nsepehr force-pushed the feat/speechmatics-max-speakers branch 3 times, most recently from 5b7e278 to da4d474 Compare October 8, 2025 03:47

nsepehr force-pushed the feat/speechmatics-max-speakers branch 2 times, most recently from 175a1db to 92433d0 Compare October 8, 2025 04:16

nsepehr force-pushed the feat/speechmatics-max-speakers branch from 92433d0 to fa814f6 Compare October 8, 2025 04:24

nsepehr commented Oct 8, 2025

nsepehr mentioned this pull request Oct 8, 2025

[FEATURE] Add support for known speakers in SpeakerDiarizationConfig speechmatics/speechmatics-python-sdk#44

Open

8 tasks

longcw reviewed Oct 8, 2025

nsepehr commented Oct 8, 2025

longcw approved these changes Oct 10, 2025

davidzhao merged commit bf30045 into livekit:main Oct 12, 2025
9 checks passed

longcw mentioned this pull request Oct 16, 2025

[draft] fix speechmatics type checking #3599

Closed

akshaym1shra pushed a commit to akshaym1shra/agents that referenced this pull request Oct 28, 2025

feat(speechmatics): add max_speakers parameter for speaker diarization (

3b05436

livekit#3524) Co-authored-by: Long Chen <[email protected]> Co-authored-by: Claude <[email protected]>

akshaym1shra pushed a commit to akshaym1shra/agents that referenced this pull request Oct 28, 2025

feat(speechmatics): add max_speakers parameter for speaker diarization (

ef32d83

livekit#3524) Co-authored-by: Long Chen <[email protected]> Co-authored-by: Claude <[email protected]>

akshaym1shra pushed a commit to akshaym1shra/agents that referenced this pull request Nov 3, 2025

feat(speechmatics): add max_speakers parameter for speaker diarization (

8aa5c20

livekit#3524) Co-authored-by: Long Chen <[email protected]> Co-authored-by: Claude <[email protected]>

4 participants
