recipes: consolidate gb200 fp8 8k1k overrides #230

Open
weireweire wants to merge 1 commit into ishandhanani:main from weireweire:codex-gb200-fp8-8k1k-override-recipe

@weireweire (Collaborator) commented Mar 30, 2026

Summary

  • add a consolidated recipes/gb200-fp8/8k1k.yaml override recipe
  • group shared config into base and keep variant-specific topology/decode settings in overrides
  • annotate parameters by category, including parallel, disagg, and size-limit sections
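
The base-plus-overrides layout in the bullets above can be pictured with a short sketch. It assumes an override section is applied as a recursive merge on top of `base`; the PR does not say how the recipe loader combines them, so treat `deep_merge` and every key name other than `override_stp_lowlat` as hypothetical.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where override values win and nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for the shared `base` section and one override section.
# The values 4, 8192 and 1024 come from the PR walkthrough; the key names do not.
base = {
    "resources": {"gpus_per_node": 4},
    "benchmark": {"isl": 8192, "osl": 1024},
}
override_stp_lowlat = {"resources": {"decode_nodes": 1}}

# The variant keeps everything from base and adds only its own topology setting.
variant = deep_merge(base, override_stp_lowlat)
```

Under this assumption a variant only needs to state what differs from `base`, which is the point of grouping shared config there.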

Validation

  • validate_config_file('recipes/gb200-fp8/8k1k.yaml')

Summary by CodeRabbit

  • New Features
    • Added a GB200-FP8 benchmark configuration with multiple performance variants optimized for different scenarios: low-latency, maximum-throughput, and mid-curve strategies.

coderabbitai bot commented Mar 30, 2026

Walkthrough

Added a new GB200-FP8 "8k1k" YAML recipe file with a shared base configuration containing Dynamo versioning, model/container settings, GPU topology, benchmark parameters, and sglang backend settings. Includes multiple override sections for different inference modes with tailored parallelism, scheduling, and speculative decoding parameters.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GB200-FP8 8k1k Recipe: `recipes/gb200-fp8/8k1k.yaml` | New consolidated YAML recipe with a base configuration (Dynamo versioning, model/container/precision, GPU topology with 4 GPUs per node, sa-bench parameters with 8192 input/1024 output sequence lengths, sglang backend config). Includes the override sections override_stp_lowlat for STP low-latency, zip_override_stp_max_tpt for zipped STP max-throughput variants, and override_lowlat_mtp and override_midcurve_mtp for MTP modes with EAGLE speculative decoding. Each section modifies frontend parallelism, topology, decode kernel/scheduler flags, and environment variables. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • ishandhanani
  • kyleliang-nv

Poem

🐰 Eight-kay, one-kay tokens dance in light,
With GB200's FP8 might,
MTP and STP paths align,
Overrides sparkle, config divine,
Low-latency hops to throughput's height! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: consolidation of GB200 FP8 8k1k recipe overrides into a single YAML file with base configuration and variant overrides. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@recipes/gb200-fp8/8k1k.yaml`:
- Around line 4-7: Update the header comment map to match the actual recipe
keys: replace "override_lowlat" with "override_stp_lowlat" and
"zip_override_stp_curve" with "zip_override_stp_max_tpt", and also adjust
"override_lowlat_mtp" to the corresponding "override_stp_lowlat_mtp" if that key
exists in the recipe; leave "override_midcurve_mtp" as-is but verify it matches
the real key. Ensure the comment lines exactly mirror the real keys
(override_stp_lowlat, override_stp_lowlat_mtp, zip_override_stp_max_tpt,
override_midcurve_mtp) so future edits use the correct names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 16f4af16-8635-406e-8977-a9cf4312acf2

📥 Commits

Reviewing files that changed from the base of the PR and between 50c91ba and e9e5f66.

📒 Files selected for processing (1)
  • recipes/gb200-fp8/8k1k.yaml

Comment on lines +4 to +7
# override_lowlat - STP low-latency
# override_lowlat_mtp - MTP low-latency
# zip_override_stp_curve - STP mid-curve + max-throughput
# override_midcurve_mtp - MTP mid-curve

⚠️ Potential issue | 🟡 Minor

Fix the section map in the header.

The comment still points to override_lowlat and zip_override_stp_curve, but the actual keys are override_stp_lowlat and zip_override_stp_max_tpt. That makes the recipe easy to edit incorrectly.

📝 Proposed fix
-#   override_lowlat       - STP low-latency
+#   override_stp_lowlat   - STP low-latency
 #   override_lowlat_mtp   - MTP low-latency
-#   zip_override_stp_curve - STP mid-curve + max-throughput
+#   zip_override_stp_max_tpt - STP mid-curve + max-throughput
 #   override_midcurve_mtp - MTP mid-curve
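
A mismatch like this could be caught mechanically. The following is a minimal sketch, assuming the header map uses `# <key> - <description>` lines and that override sections are top-level keys starting with `override_` or `zip_override_`. The function `header_map_mismatches` and the sample text are illustrative and not part of the repository.

```python
import re

# A header-map entry such as "#   override_stp_lowlat   - STP low-latency".
HEADER_ENTRY = re.compile(r"^#\s+((?:zip_)?override_\w+)\s+-\s")
# A top-level YAML key such as "override_stp_lowlat:".
TOP_LEVEL_KEY = re.compile(r"^((?:zip_)?override_\w+):")


def header_map_mismatches(text):
    """Return (stale, undocumented): names only in the header map, names only in the keys."""
    documented, actual = set(), set()
    for line in text.splitlines():
        entry = HEADER_ENTRY.match(line)
        if entry:
            documented.add(entry.group(1))
            continue
        key = TOP_LEVEL_KEY.match(line)
        if key:
            actual.add(key.group(1))
    return sorted(documented - actual), sorted(actual - documented)


# Illustrative text: the header from the review comment, followed by key names
# taken from the review's description (section bodies omitted).
sample = """\
# override_lowlat - STP low-latency
# override_lowlat_mtp - MTP low-latency
# zip_override_stp_curve - STP mid-curve + max-throughput
# override_midcurve_mtp - MTP mid-curve
override_stp_lowlat:
override_lowlat_mtp:
zip_override_stp_max_tpt:
override_midcurve_mtp:
"""

stale, undocumented = header_map_mismatches(sample)
```

Both lists come back empty once the header mirrors the real keys, so a check along these lines could run next to the existing config validation.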