fix(cherry-pick): KV Router bindings docs#3337

Merged
saturley-hall merged 1 commit into release/0.5.1 from rupei/release-0.5.1-router-bindings-docs-again
Oct 1, 2025

@PeaBrane PeaBrane commented Oct 1, 2025

Overview:

As titled. The main PR is #3308.

Summary by CodeRabbit

  • New Features
    • Benchmark data synthesizer adds min/max input/output length controls and dataset-based inputs.
    • Deployment manifests add health checks and support for tolerations/affinity in SSH keygen job.
  • Bug Fixes
    • More robust generation streaming: ensures finish_reason is set and errors on empty outputs.
  • Chores
    • Bumped container images and Helm charts to 0.5.1 across backends and examples.
    • Updated build scripts and registry references.
    • Increased default router snapshot threshold to 1,000,000.
  • Documentation
    • Revised guides, examples, and support matrix to reflect 0.5.1, new flags, and updated workflows.

Signed-off-by: PeaBrane <yanrpei@gmail.com>
@PeaBrane PeaBrane requested a review from a team as a code owner October 1, 2025 01:31
@PeaBrane PeaBrane requested a review from a team October 1, 2025 01:31
@PeaBrane PeaBrane requested review from a team as code owners October 1, 2025 01:31
@github-actions github-actions Bot added the fix label Oct 1, 2025
@PeaBrane PeaBrane changed the base branch from main to release/0.5.1 October 1, 2025 01:32
@PeaBrane PeaBrane changed the title fix(cherry-pick): router bindings docs fix(cherry-pick): KV Router bindings docs Oct 1, 2025
coderabbitai Bot commented Oct 1, 2025

Caution

Review failed

Failed to post review comments

Walkthrough

Repository-wide version bump to 0.5.1 across images, charts, and docs; CLI/docs rename from file to dataset for router benchmarks with new ISL/OSL knobs; router snapshot threshold default increased to 1,000,000 in Python/Rust/docs; Docker build path adjusted for TRT-LLM; NATS stream setup simplified; minor handler/operator/env merge changes; tests updated.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| **Image/tag updates to 0.5.1**<br>benchmarks/incluster/benchmark_job.yaml, components/backends/sglang/... (README.md, deploy/.yaml), components/backends/trtllm/... (deploy/README.md, deploy/.yaml), components/backends/vllm/... (deploy/README.md, deploy/.yaml), examples/... (basics/kubernetes/, custom_backend/, multimodal/, deployments/ECS/.json), recipes/... (gpt-oss-120b/, llama-3-70b/), tests/planner/... (perf_test_configs/, profiling_results/\*), deploy/inference-gateway/helm/dynamo-gaie/values.yaml, docs/benchmarks/\*, docs/_includes/install.rst | Replace placeholder image tags with nvcr.io/nvidia/ai-dynamo images at version 0.5.1. No other config/logic changes. |
| **Router snapshot threshold default**<br>components/frontend/src/dynamo/frontend/main.py, lib/bindings/python/rust/llm/entrypoint.rs, lib/llm/src/kv_router.rs, docs/architecture/kv_cache_routing.md | Default `router_snapshot_threshold` updated from 10,000 to 1,000,000 in CLI, Rust defaults, Python binding signature, and docs. |
| **Benchmark data synthesis and CLI**<br>benchmarks/prefix_data_generator/synthesizer.py, benchmarks/router/real_data_benchmark.py, benchmarks/router/README.md | Add ISL/OSL knobs (min/max) and switch from “file” to “dataset” terminology and flows; update function signatures, CLI args, logging, and output naming. |
| **Docker/TRT-LLM build changes**<br>container/Dockerfile.trtllm, container/build.sh, container/Dockerfile.trtllm_prebuilt (removed) | Add `GITHUB_TRTLLM_COMMIT` build arg, env vars (`ENV`, `TENSORRT_LIB_DIR`, `LD_LIBRARY_PATH`), copy wheels stage; propagate commit arg in build script; remove prebuilt Dockerfile. |
| **Operator/Helm/versioning**<br>deploy/cloud/helm/.../Chart.yaml (crds, platform, operator), deploy/helm/chart/Chart.yaml, deploy/cloud/operator/internal/dynamo/component_planner.go, .../component_planner_test.go, .../graph.go, .../graph_test.go, deploy/cloud/helm/platform/components/operator/templates/mpi-run-ssh-keygen-job.yaml, deploy/cloud/operator/Earthfile, deploy/cloud/operator/internal/secrets/docker_test.go | Bump chart versions to 0.5.1; rename env var `PROMETHEUS_PORT` → `PLANNER_PROMETHEUS_PORT` in planner code/tests; change env merge precedence; add test envs/override case; template gains optional tolerations/affinity; `DOCKER_SERVER` registry updated; tests use nvcr.io host. |
| **NATS transport setup**<br>lib/runtime/src/transports/nats.rs | Replace create/get with `get_or_create_stream`; simplify control flow; adjust purge using stream object; add debug logs. |
| **TRT-LLM request handler guard**<br>components/backends/trtllm/src/dynamo/trtllm/request_handlers/handler_base.py | Yield error `finish_reason` when no outputs mid-run; attach `finish_reason="unknown"` on final chunk if missing; add warning. |
| **Docs structure/content**<br>components/backends/trtllm/gpt-oss.md, examples/deployments/ECS/README.md, docs/kubernetes/create_deployment.md, docs/support_matrix.md | Reorganized guides, updated examples (pull secret name, commands), added TRT-LLM Python 3.11 note; terminology updates. |
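
The TRT-LLM request handler guard summarized above has two behaviors: emit an error `finish_reason` when a step arrives with no outputs, and fall back to `finish_reason="unknown"` on the final chunk if none was set. The following is a minimal sketch of that pattern only, not the code in handler_base.py. The function names `guard_stream` and `collect`, the dict-shaped results, and the `outputs`, `finished`, and `token_ids` keys are all assumptions made for illustration.

```python
import asyncio
import logging
from typing import Any, AsyncIterator, Dict, List

logger = logging.getLogger(__name__)


async def guard_stream(
    results: AsyncIterator[Dict[str, Any]],
) -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical guard around a generation stream (illustrative only)."""
    async for res in results:
        outputs = res.get("outputs") or []
        if not outputs:
            # No outputs mid-run: surface an error chunk and stop streaming.
            yield {"token_ids": [], "finish_reason": "error"}
            return

        out = outputs[0]
        chunk: Dict[str, Any] = {"token_ids": out["token_ids"]}
        finish_reason = out.get("finish_reason")

        if res.get("finished") and finish_reason is None:
            # Final chunk without a reason: warn and attach a placeholder.
            logger.warning("final chunk had no finish_reason; using 'unknown'")
            finish_reason = "unknown"

        if finish_reason is not None:
            chunk["finish_reason"] = finish_reason
        yield chunk


def collect(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run guard_stream over a fixed list of results (helper for the sketch)."""

    async def source() -> AsyncIterator[Dict[str, Any]]:
        for item in items:
            yield item

    async def run() -> List[Dict[str, Any]]:
        return [chunk async for chunk in guard_stream(source())]

    return asyncio.run(run())
```

Under these assumptions, a consumer can rely on the last chunk always carrying a `finish_reason`, which is the property the bug-fix note in the summary describes.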

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as real_data_benchmark.py
  participant Synth as Synthesizer
  participant DS as Input Dataset
  participant LoadGen as genai-perf

  User->>CLI: Run with --input-dataset and ISL/OSL knobs
  CLI->>DS: Check dataset path and line count
  alt Synthesis required (min/max ISL/OSL provided or caching criteria)
    CLI->>Synth: synthesize_requests(num, max_isl, min_isl, min_osl, max_osl)
    Synth->>DS: Read requests
    Synth-->>CLI: Filtered/clipped requests (synthetic_trace.jsonl)
    CLI->>LoadGen: Invoke with synthetic_trace.jsonl and schedule
  else Use dataset directly
    CLI->>LoadGen: Invoke with input_dataset and schedule
  end
  LoadGen-->>CLI: Results/artifacts
  CLI-->>User: Benchmark summary
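
The synthesis branch in the diagram above can be sketched as a filter-and-clip pass over the dataset. This is an illustration under stated assumptions, not the logic in synthesizer.py: the helper names `needs_synthesis` and `filter_and_clip` are hypothetical, as are the `input_length` and `output_length` field names, the choice to drop requests below the minimums, and the choice to clip requests above the maximums. The "caching criteria" condition from the diagram is not modeled.

```python
from typing import Any, Dict, Iterable, List, Optional


def needs_synthesis(
    min_isl: Optional[int] = None,
    max_isl: Optional[int] = None,
    min_osl: Optional[int] = None,
    max_osl: Optional[int] = None,
) -> bool:
    """Return True if any ISL/OSL knob is set (assumed trigger for synthesis)."""
    return any(v is not None for v in (min_isl, max_isl, min_osl, max_osl))


def filter_and_clip(
    requests: Iterable[Dict[str, Any]],
    min_isl: Optional[int] = None,
    max_isl: Optional[int] = None,
    min_osl: Optional[int] = None,
    max_osl: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Drop requests below the minimums and clip lengths to the maximums."""
    kept: List[Dict[str, Any]] = []
    for req in requests:
        isl = req["input_length"]   # assumed field name
        osl = req["output_length"]  # assumed field name
        if min_isl is not None and isl < min_isl:
            continue
        if min_osl is not None and osl < min_osl:
            continue
        clipped = dict(req)  # copy so the input dataset is not mutated
        if max_isl is not None:
            clipped["input_length"] = min(isl, max_isl)
        if max_osl is not None:
            clipped["output_length"] = min(osl, max_osl)
        kept.append(clipped)
    return kept
```

With `min_isl=100, max_isl=1024, max_osl=128`, a request of input length 50 would be dropped and a request of lengths 5000/300 would be clipped to 1024/128. When no knob is set, `needs_synthesis()` returns `False`, matching the "use dataset directly" branch.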

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump my paw: 0.5.1!
New tags on crates, the builds all run.
ISL, OSL trim the streams,
A million snaps in routing dreams.
NATS now finds-or-makes its beam—
Carrot raised: ship this release gleam! 🥕🐇

Pre-merge checks

❌ Failed checks (3 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title Check | ⚠️ Warning | The title “fix(cherry-pick): KV Router bindings docs” only references documentation for KV Router bindings, but the pull request’s changes span version bumps, image tag updates, and functional code modifications across many components, not just documentation; thus it does not accurately summarize the main scope of the changeset. | Please update the title to clearly reflect the overall scope, such as version bump to 0.5.1 and associated image tag and code updates, instead of only mentioning the cherry-picked docs change. |
| Description Check | ⚠️ Warning | The pull request description only provides a brief overview and omits the required “Details,” “Where should the reviewer start?,” and “Related Issues” sections defined by the repository’s template, leaving the reviewer without information on what files changed or which issue this addresses. | Please expand the description to follow the template by adding a “Details” section summarizing key changes, a “Where should the reviewer start?” section listing critical files, and a “Related Issues” section with proper issue references. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 38.89% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |


@saturley-hall saturley-hall merged commit 6af195f into release/0.5.1 Oct 1, 2025
10 checks passed
@saturley-hall saturley-hall deleted the rupei/release-0.5.1-router-bindings-docs-again branch October 1, 2025 02:09