This repository was archived by the owner on Apr 20, 2026. It is now read-only.

Use mtp3 for 1k8k, 1k1k gb200 fp8 disagg low latency configs #143

Merged: trevor-m merged 1 commit into main from trevor-m/mtp3 on Feb 5, 2026
trevor-m (Collaborator) commented Feb 5, 2026

Summary by CodeRabbit

  • Updates
    • Updated the GB200 FP8 low-latency MTP recipes (1K1K and 1K8K deployment profiles) to use three speculative steps and four draft tokens. The 1K1K recipe also sets mem-fraction-static to 0.95 and max-running-requests to 128.

coderabbitai bot (Contributor) commented Feb 5, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Two GB200 FP8 low-latency MTP recipe configurations are updated with renamed versions, increased speculative inference parameters (num-steps and num-draft-tokens), and adjusted memory allocation settings in the 1k1k variant.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GB200 FP8 Low-Latency MTP Recipes: recipes/gb200-fp8/1k1k/low-latency-mtp.yaml, recipes/gb200-fp8/1k8k/low-latency-mtp.yaml | Recipe versions renamed to v3; speculative-num-steps increased from 2 to 3; speculative-num-draft-tokens increased from 3 to 4 in both prefill and decode sections. Additionally, the 1k1k recipe adjusts mem-fraction-static to 0.95 and max-running-requests to 128. |
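
The speculative-decoding part of this change can be sketched as a small Python helper. This is only an illustration: the two flag names and their old and new values come from the summary above, but the nested `prefill`/`decode` dictionary layout, the `apply_mtp3` function, and the `MTP3_SPECULATIVE` constant are hypothetical and are not the actual schema of the recipe YAML files.

```python
from copy import deepcopy

# New values reported in the summary (previously 2 steps and 3 draft tokens).
MTP3_SPECULATIVE = {
    "speculative-num-steps": 3,
    "speculative-num-draft-tokens": 4,
}


def apply_mtp3(recipe):
    """Return a copy of `recipe` with the MTP3 values set in both sections.

    Assumes (hypothetically) that a recipe is a dict with "prefill" and
    "decode" sub-dicts holding flag names as keys.
    """
    updated = deepcopy(recipe)
    for section in ("prefill", "decode"):
        # Create the section if it is missing, then overwrite the two flags.
        updated.setdefault(section, {}).update(MTP3_SPECULATIVE)
    return updated


# Hypothetical "before" state with the old speculative settings.
old_recipe = {
    "prefill": {"speculative-num-steps": 2, "speculative-num-draft-tokens": 3},
    "decode": {"speculative-num-steps": 2, "speculative-num-draft-tokens": 3},
}
new_recipe = apply_mtp3(old_recipe)
```

The 1k1k-only settings (mem-fraction-static and max-running-requests) are left out of the sketch because the summary does not say which section they belong to.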

Estimated Code Review Effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested Reviewers

  • ishandhanani
  • kyleliang-nv

Poem

🐰 Three hops for speculative dreams,
Tokens draft in bountiful streams,
Memory fractions tuned just right,
Recipes gleaming, polished bright! ✨

trevor-m merged commit b8d6635 into main on Feb 5, 2026
3 of 5 checks passed