
chore[notask]: register diar_streaming_sortformer_4spk-v2.1 GGUFs#2138

Merged
GustavoA1604 merged 3 commits into tetherto:main from ishanvohra2:chore/add-parakeet-softformer-model
May 20, 2026
Conversation

@ishanvohra2

Contributor

🎯 What problem does this PR solve?

  • The streaming variant of NVIDIA's Sortformer diarizer (nvidia/diar_streaming_sortformer_4spk-v2.1) is not yet exposed through the qvac model registry, so consumers of @qvac/transcription-parakeet can't pull it via the registry.
  • Only the offline v1 GGUF and an external ONNX v2 entry exist today.

📝 How does it solve it?

  • Adds three new entries to packages/registry-server/data/models.prod.json for nvidia/diar_streaming_sortformer_4spk-v2.1, one per quantization tier converted with packages/transcription-parakeet/scripts/convert-nemo-to-gguf.py:
    • f16 → ~251 MiB
    • q8_0 → ~134 MiB
    • q4_0 → ~72 MiB
  • All three use engine: "@qvac/transcription-parakeet", point at s3:///qvac_models_compiled/ggml/parakeet/2026-05-20/diar_streaming_sortformer_4spk-v2.1.<quant>.gguf, link back to the HF model card, and tag transcription / parakeet / sortformer / streaming / 4spk / ggml / en.
  • Sets licenseId to nvidia-open-model-license, because v2.1 moved away from the CC-BY-4.0 license used by v1 (per the HF model card).
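
As a rough illustration of the three entries described above, the sketch below builds one record per quantization tier. Only `engine`, `licenseId`, the tag values, and the S3 path pattern come from this description. The remaining field names (`name`, `source`, `modelCard`, `quantization`, `tags`) and the Hugging Face URL form are assumptions made for the example and may not match the actual schema of models.prod.json.

```python
import json

MODEL_ID = "nvidia/diar_streaming_sortformer_4spk-v2.1"

# Path pattern copied from the PR description (bucket left exactly as written there).
S3_TEMPLATE = (
    "s3:///qvac_models_compiled/ggml/parakeet/2026-05-20/"
    "diar_streaming_sortformer_4spk-v2.1.{quant}.gguf"
)

QUANTS = ["f16", "q8_0", "q4_0"]
TAGS = ["transcription", "parakeet", "sortformer", "streaming", "4spk", "ggml", "en"]


def build_entry(quant: str) -> dict:
    """Build one hypothetical registry record for a quantization tier."""
    return {
        "name": f"{MODEL_ID.split('/')[-1]}.{quant}",       # hypothetical field
        "engine": "@qvac/transcription-parakeet",           # from the PR description
        "source": S3_TEMPLATE.format(quant=quant),          # hypothetical field name
        "modelCard": f"https://huggingface.co/{MODEL_ID}",  # hypothetical field name
        "licenseId": "nvidia-open-model-license",           # from the PR description
        "quantization": quant,                              # hypothetical field
        "tags": list(TAGS),                                 # copy so entries do not share a list
    }


entries = [build_entry(q) for q in QUANTS]
print(json.dumps(entries, indent=2))
```

Generating the records from a single template keeps the three tiers identical except for the quantization suffix, which is the property the description relies on.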

🧪 How was it tested?

  • Ran scripts/convert-nemo-to-gguf.py against the upstream .nemo checkpoint for each quant; converter auto-detected model_type=sortformer and reported num_spks=4, tf_layers=18, tf_d_model=192, layers=17, use_bias=True for all three outputs.
  • JSON validated locally (python -c "import json; json.load(open('packages/registry-server/data/models.prod.json'))").
  • GGUF artifacts still need to be uploaded to the listed S3 path before the registry entries resolve.
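
The `json.load` check above confirms only that the file parses. A slightly stricter variant, sketched here as a suggestion and not something this PR ran, also rejects duplicate keys inside an object, which plain `json.load` silently collapses. The helper names `reject_duplicate_keys` and `load_strict` are made up for this example, and the sample strings stand in for the real file contents.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def load_strict(text: str):
    """Parse JSON text, failing on syntax errors and on duplicate keys."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


# In-memory stand-ins; in practice the text would be read from
# packages/registry-server/data/models.prod.json.
good = '[{"engine": "@qvac/transcription-parakeet", "licenseId": "nvidia-open-model-license"}]'
bad = '[{"engine": "a", "engine": "b"}]'

print(load_strict(good)[0]["engine"])
try:
    load_strict(bad)
except ValueError as err:
    print("rejected:", err)
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a single `except ValueError` covers both malformed JSON and repeated keys.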

@ishanvohra2 ishanvohra2 requested review from a team as code owners May 20, 2026 10:40
GustavoA1604 previously approved these changes May 20, 2026
@ishanvohra2 ishanvohra2 requested a review from yuranich May 20, 2026 11:21
Comment thread packages/registry-server/data/models.prod.json Outdated
yuranich previously approved these changes May 20, 2026
Zbig9000 previously approved these changes May 20, 2026
@GustavoA1604

Contributor

/review

@github-actions

Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (2/1)



---
*This comment is automatically updated when reviews change.*

@yuranich yuranich added the verified Authorize secrets / label-gate in PR workflows label May 20, 2026
@GustavoA1604 GustavoA1604 merged commit c3c83cb into tetherto:main May 20, 2026
18 of 23 checks passed
Proletter pushed a commit that referenced this pull request May 24, 2026

* chore: Add parakeet softformer model to qvac registry

* remove deprecated parameter

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
@ishanvohra2 ishanvohra2 deleted the chore/add-parakeet-softformer-model branch May 25, 2026 05:18

Labels

tier1 verified Authorize secrets / label-gate in PR workflows

4 participants