Add MagpieTTS backend with all bugfixes #1251

Open
vmendelev wants to merge 3 commits into main from pr3/magpie-tts-backend

Conversation

@vmendelev
Collaborator

Summary

  • Add MagpieTTS backend implementing the refactored InferenceBackend interface
  • Includes get_config_class() returning MagpieTTSConfig for YAML-based configuration
  • All bugfixes from the feature branch consolidated into one clean implementation
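The config contract described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `InferenceBackend` here is a stand-in for the real base class in `backends/base.py`, the `from_dict` helper is hypothetical, and the `MagpieTTSConfig` fields (`model`, `codec_model`, `batch_size`) merely echo the flags and options mentioned in this PR; the actual field names and defaults may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields
from typing import Any, Dict, Type


@dataclass
class MagpieTTSConfig:
    # Hypothetical fields; the real config class may define others.
    model: str = ""
    codec_model: str = ""
    batch_size: int = 1


class InferenceBackend(ABC):
    """Stand-in for the abstract base class in backends/base.py."""

    def __init__(self, config: Any) -> None:
        self.config = config

    @classmethod
    @abstractmethod
    def get_config_class(cls) -> Type[Any]:
        """Return the dataclass that describes this backend's options."""

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "InferenceBackend":
        # Hypothetical helper: validate a YAML-derived dict against the
        # declared config class before building the backend.
        config_cls = cls.get_config_class()
        known = {f.name for f in fields(config_cls)}
        unknown = sorted(set(raw) - known)
        if unknown:
            raise ValueError(f"Unknown config keys for {cls.__name__}: {unknown}")
        return cls(config_cls(**raw))


class MagpieTTSBackend(InferenceBackend):
    @classmethod
    def get_config_class(cls) -> Type[Any]:
        return MagpieTTSConfig
```

Because the server only talks to `get_config_class()`, it can validate a YAML-derived dict without knowing anything about MagpieTTS itself.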

Bugfixes included

  • Checkpoint + hparams loading as alternative to .nemo file
  • Dummy wav creation when no context audio is provided
  • Decoder cache reset per request batch to avoid cross-request leakage
  • HF resolve URL caching via huggingface_hub to avoid 429s
  • KV cache disabled to prevent shape mismatches under batched requests
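As an illustration of the HF resolve URL fix, one possible approach is to parse the URL into its repo, revision, and filename parts and hand them to `huggingface_hub.hf_hub_download`, which stores files in the local Hugging Face cache so repeated startups do not re-request the same URL. The helper names below (`parse_hf_resolve_url`, `resolve_model_path`) are hypothetical, and the sketch only handles the `<org>/<repo>/resolve/<revision>/<file>` layout for model repos with slash-free revisions; the backend's actual logic may differ.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlparse


class HFFileRef(NamedTuple):
    repo_id: str
    revision: str
    filename: str


def parse_hf_resolve_url(url: str) -> Optional[HFFileRef]:
    """Split a Hugging Face resolve URL into its parts, or return None."""
    parsed = urlparse(url)
    if parsed.netloc != "huggingface.co":
        return None
    parts = [unquote(p) for p in parsed.path.split("/") if p]
    # Assumed layout: <org>/<repo>/resolve/<revision>/<path inside repo>
    if len(parts) < 5 or parts[2] != "resolve":
        return None
    return HFFileRef("/".join(parts[:2]), parts[3], "/".join(parts[4:]))


def resolve_model_path(path_or_url: str) -> str:
    """Return a local path, downloading through the HF cache if needed."""
    ref = parse_hf_resolve_url(path_or_url)
    if ref is None:
        # Not a resolve URL: treat it as an ordinary local path.
        return path_or_url
    # Imported lazily so plain local paths work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=ref.repo_id, filename=ref.filename, revision=ref.revision
    )
```

Local paths pass through unchanged, so the same option can accept either form.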

Depends on

Files

  • recipes/multimodal/server/backends/magpie_tts_backend.py — full backend implementation

Test plan

  • Start server with --backend magpie_tts --model /path --codec_model /path
  • Verify inference produces audio output
  • Verify checkpoint + hparams mode works

🤖 Generated with Claude Code

Introduces the unified NeMo inference server with a pluggable backend
architecture. All backend-specific logic lives in backend modules, not
the server. Backends declare their config via get_config_class() and
can register additional routes via get_extra_routes().

- recipes/multimodal/server/unified_server.py: Generic FastAPI server
  with request batching and OpenAI-compatible /v1/chat/completions
- recipes/multimodal/server/backends/base.py: Abstract InferenceBackend
  with get_config_class(), get_extra_routes() for future extensibility
- recipes/multimodal/server/backends/__init__.py: Lazy-loading registry
- nemo_skills/inference/server/serve_unified.py: CLI entrypoint with
  YAML config support and backward-compatible CLI args

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vmendelev and others added 2 commits February 18, 2026 09:52
Replace ~20 hard-coded backend-specific CLI arguments with a generic
parse_extra_args() that converts unknown flags to a config dict. This
makes serve_unified.py truly backend-agnostic — new backends no longer
need to edit the server entrypoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
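
The idea of turning unknown flags into a config dict can be sketched as below. The real `parse_extra_args()` signature and coercion rules are not shown in this PR, so the behavior here is an assumption (`--key value`, `--key=value`, and bare `--flag` forms, with simple bool/int/float coercion), and `parse_cli` is a hypothetical wrapper around `argparse.parse_known_args`.

```python
import argparse
from typing import Any, Dict, List, Tuple


def _coerce(value: str) -> Any:
    """Best-effort conversion of a CLI string to bool, int, or float."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_extra_args(extra: List[str]) -> Dict[str, Any]:
    """Convert leftover CLI tokens into a backend config dict."""
    config: Dict[str, Any] = {}
    i = 0
    while i < len(extra):
        token = extra[i]
        if not token.startswith("--"):
            raise ValueError(f"Unexpected positional argument: {token!r}")
        key, sep, inline = token[2:].partition("=")
        key = key.replace("-", "_")
        if sep:
            # --key=value
            config[key] = _coerce(inline)
        elif i + 1 < len(extra) and not extra[i + 1].startswith("--"):
            # --key value
            config[key] = _coerce(extra[i + 1])
            i += 1
        else:
            # Bare --flag is treated as a boolean switch.
            config[key] = True
        i += 1
    return config


def parse_cli(argv: List[str]) -> Tuple[str, Dict[str, Any]]:
    """Parse the generic flags; pass everything else to the backend."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend", required=True)
    known, unknown = parser.parse_known_args(argv)
    return known.backend, parse_extra_args(unknown)
```

With this shape, a new backend option needs no change to the entrypoint; it only has to exist on the backend's config class.
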
Implements MagpieTTSBackend with get_config_class() for the refactored
unified server. Includes all bugfixes from the feature branch:
- Checkpoint + hparams loading (alternative to .nemo)
- Dummy wav for missing context audio
- Decoder cache reset per request batch
- HF resolve URL caching via huggingface_hub
- KV cache disabled to avoid shape mismatches
- Batch size configurable via config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vmendelev force-pushed the pr3/magpie-tts-backend branch from f4bc5cd to 1a9f4fa on February 18, 2026 17:54
@vmendelev force-pushed the pr2/unified-server-refactor branch 4 times, most recently from 6e7b703 to ac7771f on March 2, 2026 10:18
Base automatically changed from pr2/unified-server-refactor to main on March 2, 2026 22:11