@athreesh athreesh commented Aug 19, 2025

This PR consolidates Dynamo's Kubernetes documentation and addresses tech writer feedback to improve clarity and navigation.

📚 Documentation Consolidation

  • Added Grove documentation - New guide for advanced Kubernetes scheduling with PodGangSet/PodClique support
  • Added K8s metrics guide - Complete Prometheus/Grafana setup documentation
  • Streamlined deployment guides - Reorganized dynamo_deploy docs with clearer structure
  • Consistent naming - Standardized all references to "Dynamo Kubernetes Platform"
  • Removed duplicate docs - Cleaned up redundant symlinked component documentation

🏷️ Fixed Heading Issues (Tech Writer Feedback)

Redundant Headings:

  • docs/architecture/architecture.md: "High level architecture and key benefits" → "Key benefits"
  • docs/guides/dynamo_deploy/minikube.md: Removed redundant "Setting Up Minikube" H2, promoted subsections
  • components/backends/sglang/docs/sgl-http-server.md: Demoted duplicate H1 "Introduction" → H2
  • components/backends/vllm/deepseek-r1.md: Demoted duplicate H1 "Instructions" → H2
  • docs/architecture/sla_planner.md: "Architecture" → "Design"
  • components/README.md: "Core Components" → "Core Services"
  • components/backends/sglang/docs/multinode-examples.md: Removed redundant "Multi-node sized models" H2

Ambiguous Quick Start Headings:

  • docs/architecture/sla_planner.md: "Quick Start:" → "To deploy SLA Planner:"
  • components/backends/sglang/README.md: "Quick Start" → "SGLang Quick Start"
  • components/backends/trtllm/README.md: "Quick Start" → "TensorRT-LLM Quick Start"
  • components/backends/vllm/README.md: "Quick Start" → "vLLM Quick Start"
  • docs/components/router/README.md: "Quick Start" → "KV Router Quick Start"
  • docs/guides/dynamo_deploy/model_caching_with_fluid.md: "Quick Start" → "Pre-deployment Steps"
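
The two heading problems listed above (a second H1 in one file, and a bare "Quick Start" heading that is ambiguous in a Sphinx TOC) can be caught mechanically. The following is a minimal sketch, not part of this PR: the function name `find_heading_issues` and the message wording are hypothetical, and it only handles ATX-style (`#`) headings outside fenced code blocks.

```python
import re
from typing import List, Tuple

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
# Opening or closing line of a fenced code block.
FENCE_RE = re.compile(r"^\s*(```|~~~)")


def find_heading_issues(markdown: str) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for extra H1s and bare 'Quick Start' headings."""
    issues: List[Tuple[int, str]] = []
    h1_seen = False
    in_fence = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if FENCE_RE.match(line):
            in_fence = not in_fence  # toggle on every fence line
            continue
        if in_fence:
            continue  # '#' inside code is a comment, not a heading
        match = HEADING_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        title = match.group(2)
        if level == 1:
            if h1_seen:
                issues.append((lineno, f"extra H1 '{title}': demote to H2"))
            h1_seen = True
        # A trailing colon ("Quick Start:") is treated the same as no colon.
        if title.rstrip(":").strip().lower() == "quick start":
            issues.append((lineno, f"ambiguous heading '{title}': name the component"))
    return issues


sample = "# Guide\n## Quick Start\n# Instructions\n"
for lineno, message in find_heading_issues(sample):
    print(lineno, message)
```

A heading such as "vLLM Quick Start" passes the check because only the bare phrase is flagged, which matches the renames in the list above.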

Summary by CodeRabbit

  • New Features

    • Added sample Kubernetes deployment manifests for vLLM, SGLang, and TensorRT-LLM (aggregated, disaggregated, and router/planner variants).
    • Introduced extensive TRT-LLM engine configuration examples (DeepSeek R1, Llama 4 Eagle, Gemma3, Multimodal).
    • Added a basic multimodal example placeholder.
  • Documentation

    • Rebranded to “Dynamo Kubernetes Platform” and unified the deployment guide around DynamoGraphDeployment.
    • Added guides for Grove, metrics with Prometheus/Grafana, Minikube setup, and updated GKE notes.
    • Improved architecture and backend docs; refined headings for clarity.
    • Introduced a Markdown documentation style guide.
  • Chores

    • Removed/cleaned redundant link-only docs.

- Add Grove documentation for advanced scheduling capabilities
- Add K8s metrics setup guide for Prometheus and Grafana
- Consolidate and streamline dynamo_deploy guides
- Improve Dynamo Cloud/Platform documentation clarity
- Update Minikube and GKE setup guides with clearer terminology
- Add multinode deployment improvements with Grove/KAI-Scheduler info

Redundant Headings Fixed:
- docs/architecture/architecture.md: Changed 'High level architecture and key benefits' to 'Key benefits'
- docs/guides/dynamo_deploy/minikube.md: Removed redundant 'Setting Up Minikube' H2, promoted subsections
- components/backends/sglang/docs/sgl-http-server.md: Demoted second H1 'Introduction' to H2
- components/backends/vllm/deepseek-r1.md: Demoted second H1 'Instructions' to H2
- docs/architecture/sla_planner.md: Changed 'Architecture' H2 to 'Design'
- components/README.md: Changed 'Core Components' to 'Core Services'
- components/backends/sglang/docs/multinode-examples.md: Removed redundant 'Multi-node sized models' H2

Ambiguous Quick Start Headings Fixed:
- docs/architecture/sla_planner.md: Changed 'Quick Start:' to 'To deploy SLA Planner:'
- components/backends/sglang/README.md: 'Quick Start' → 'SGLang Quick Start'
- components/backends/trtllm/README.md: 'Quick Start' → 'TensorRT-LLM Quick Start'
- components/backends/vllm/README.md: 'Quick Start' → 'vLLM Quick Start'
- docs/components/router/README.md: 'Quick Start' → 'KV Router Quick Start'
- docs/guides/dynamo_deploy/model_caching_with_fluid.md: 'Quick Start' → 'Pre-deployment Steps'

All changes improve documentation clarity and eliminate redundancy in Sphinx breadcrumbs/TOC.

copy-pr-bot bot commented Aug 19, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions bot added the docs label Aug 19, 2025
@athreesh closed this Aug 19, 2025
@athreesh changed the title from "docs: Docs consolidation only" to "docs: CLOSE ME" Aug 19, 2025

coderabbitai bot commented Aug 19, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a Markdown docs style rule and broad documentation edits (headings/structure). Introduces multiple backup Kubernetes/DynamoGraphDeployment manifests for SGLang, TensorRT-LLM, and vLLM (agg, disagg, router, planner variants). Adds numerous TRT-LLM engine configuration .bak files (DeepSeek R1, Llama 4 Eagle, Gemma3, GPT-OSS, multimodal). Minor doc link/file cleanups and a placeholder example.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Docs style rules | `/.cursor/rules/docs-rules.mdc` | New Markdown editorial/style rules applied per .md files (YAML front matter with globs). |
| Component README heading tweaks | `/components/README.md`, `/components/backends/sglang/README.md`, `/components/backends/trtllm/README.md`, `/components/backends/vllm/README.md`, `/docs/components/router/README.md` | Renamed top-level section headers (e.g., “Core Components”→“Core Services”, “Quick Start”→backend-specific labels). |
| SGLang deploy manifests (.bak) | `/components/backends/sglang/deploy/agg*.yaml.bak`, `/components/backends/sglang/deploy/disagg*.yaml.bak` | New DynamoGraphDeployment YAMLs for agg, agg_router, disagg, disagg_planner with images, resources, and worker commands (decode/prefill). |
| vLLM deploy manifests (.bak) | `/components/backends/vllm/deploy/agg*.yaml.bak`, `/components/backends/vllm/deploy/disagg*.yaml.bak` | New DynamoGraphDeployment YAMLs for agg, agg_router, disagg, disagg_planner (router-mode, planner/Prometheus, prefill/decode workers). |
| TRT-LLM deploy manifests (.bak) | `/components/backends/trtllm/deploy/agg*.yaml.bak`, `/components/backends/trtllm/deploy/disagg*.yaml.bak` | New DynamoGraphDeployment YAMLs for agg, agg_router, disagg with worker args and engine config references. |
| TRT-LLM engine configs — common/simple | `/components/backends/trtllm/engine_configs/{agg.yaml.bak,decode.yaml.bak,prefill.yaml.bak}`, `/.../deepseek_r1/simple/*.{yaml}.bak` | Added baseline and DeepSeek R1 “simple” configs (tp/moe params, kv_cache, cuda_graph, overlap scheduler flags). |
| TRT-LLM engine configs — MTP (DeepSeek R1) | `/components/backends/trtllm/engine_configs/deepseek_r1/mtp/*.{yaml}.bak` | Added MTP prefill/decode/agg configs with speculative MTP settings and fp8 cache. |
| TRT-LLM engine configs — WideEP | `/components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/*.{yaml}.bak` | Added WideEP MOE configs, EPLB settings, multi-node tp/moe=16 variants. |
| TRT-LLM engine configs — Llama4 Eagle | `/components/backends/trtllm/engine_configs/llama4/eagle/*.{yaml}.bak`, `/.../eagle_one_model/*.{yaml}.bak` | Added Eagle speculative decoding configs (one-model and split variants), with draft lengths and model dir. |
| TRT-LLM engine configs — Gemma3 | `/components/backends/trtllm/engine_configs/gemma3/vswa_*.yaml.bak` | Added VSWA agg/decode/prefill configs with attention windows. |
| TRT-LLM engine configs — GPT-OSS | `/components/backends/trtllm/engine_configs/gpt_oss/*.{yaml}.bak` | Added CUTLASS MOE-based prefill/decode configs with UCX transceiver and chunked prefill. |
| TRT-LLM engine configs — Multimodal | `/components/backends/trtllm/engine_configs/multimodal/*.{yaml}.bak`, `/.../multimodal/llama4/*.{yaml}.bak` | Added multimodal prefill/decode configs (generic and Llama4) with chunked prefill and KV cache settings. |
| SGLang docs adjustments | `/components/backends/sglang/docs/*` | Heading level changes; added a pointer file; removed a subheader; one docs mirror emptied. |
| Architecture/docs restructuring | `/docs/architecture/*`, `/docs/guides/dynamo_deploy/*` | Renamed sections, reframed “Dynamo Cloud”→“Dynamo Kubernetes Platform”, unified CRD-based deployment guide, added Grove, metrics, Minikube updates, operator doc link/path fixes. |
| Docs mirrors cleaned | `/docs/components/backends/*/README.md`, `/docs/components/backends/sglang/docs/multinode-examples.md` | Removed single-line redirect content, resulting in empty files. |
| Example placeholder | `/examples/multimodal` | New file with placeholder content “multimodal_v1”. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Frontend
  participant Router as KV Router (optional)
  participant Prefill as Prefill Worker
  participant Decode as Decode Worker

  Client->>Frontend: HTTP request
  alt Router mode enabled
    Frontend->>Router: Route request (kv)
    Router->>Prefill: Prefill (optional)
    Router->>Decode: Decode/Generate
  else Aggregated
    Frontend->>Decode: Generate
  end
  Decode-->>Frontend: Tokens/Response
  Frontend-->>Client: Stream/Response
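
The message order in the diagram above can be traced with a small simulation. This is an illustrative sketch only: `prefill`, `decode`, and `handle_request` are hypothetical stand-ins written for this example, not Dynamo APIs, and the "tokens" are placeholders. It models which hops occur in router mode versus aggregated mode, nothing more.

```python
from typing import Iterator, List, Tuple


def prefill(prompt: str) -> List[str]:
    """Hypothetical prefill worker: process the whole prompt once."""
    return prompt.split()


def decode(context: List[str], max_tokens: int) -> Iterator[str]:
    """Hypothetical decode worker: yield placeholder tokens one at a time."""
    for i in range(max_tokens):
        yield f"tok{i}"


def handle_request(
    prompt: str,
    router_enabled: bool,
    use_prefill: bool = True,
    max_tokens: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return (hops, tokens) for one request, following the diagram's two branches."""
    hops = ["Client->Frontend"]
    if router_enabled:
        # Router mode: the frontend hands the request to the KV router.
        hops.append("Frontend->Router")
        if use_prefill:
            # The prefill hop is optional in the diagram.
            hops.append("Router->Prefill")
            context = prefill(prompt)
        else:
            context = prompt.split()
        hops.append("Router->Decode")
    else:
        # Aggregated mode: the frontend calls the decode worker directly.
        hops.append("Frontend->Decode")
        context = prompt.split()
    tokens = list(decode(context, max_tokens))
    hops.append("Decode->Frontend")
    hops.append("Frontend->Client")
    return hops, tokens


hops, tokens = handle_request("hello world", router_enabled=True)
print(" | ".join(hops))
print(tokens)
```

Calling `handle_request(..., router_enabled=False)` follows the "Aggregated" branch and skips both the router and prefill hops.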

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump my paws upon the ground,
New charts and YAMLs all around.
Prefill hops, Decode springs,
Routers stash their key-value things.
Docs groomed neat, engines aligned—
A warren of workflows, perfectly designed. 🐇✨

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1945f59 and ddcc343.

📒 Files selected for processing (68)
  • .cursor/rules/docs-rules.mdc (1 hunks)
  • components/README.md (1 hunks)
  • components/backends/sglang/README.md (1 hunks)
  • components/backends/sglang/deploy/agg.yaml.bak (1 hunks)
  • components/backends/sglang/deploy/agg_router.yaml.bak (1 hunks)
  • components/backends/sglang/deploy/disagg.yaml.bak (1 hunks)
  • components/backends/sglang/deploy/disagg_planner.yaml.bak (1 hunks)
  • components/backends/sglang/docs/dsr1-wideep.md (1 hunks)
  • components/backends/sglang/docs/multinode-examples.md (0 hunks)
  • components/backends/sglang/docs/sgl-http-server.md (1 hunks)
  • components/backends/trtllm/README.md (1 hunks)
  • components/backends/trtllm/deploy/agg.yaml.bak (1 hunks)
  • components/backends/trtllm/deploy/agg_router.yaml.bak (1 hunks)
  • components/backends/trtllm/deploy/disagg.yaml.bak (1 hunks)
  • components/backends/trtllm/deploy/disagg_router.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/dep16_agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/eplb.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/gemma3/vswa_agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/gemma3/vswa_decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/gemma3/vswa_prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/gpt_oss/decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/gpt_oss/prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/multimodal/agg.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/multimodal/decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/multimodal/llama4/decode.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/multimodal/llama4/prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/multimodal/prefill.yaml.bak (1 hunks)
  • components/backends/trtllm/engine_configs/prefill.yaml.bak (1 hunks)
  • components/backends/vllm/README.md (1 hunks)
  • components/backends/vllm/deepseek-r1.md (1 hunks)
  • components/backends/vllm/deploy/agg.yaml.bak (1 hunks)
  • components/backends/vllm/deploy/agg_router.yaml.bak (1 hunks)
  • components/backends/vllm/deploy/disagg.yaml.bak (1 hunks)
  • components/backends/vllm/deploy/disagg_planner.yaml.bak (1 hunks)
  • components/backends/vllm/deploy/disagg_router.yaml.bak (1 hunks)
  • docs/architecture/architecture.md (1 hunks)
  • docs/architecture/sla_planner.md (2 hunks)
  • docs/components/backends/sglang/docs/multinode-examples.md (0 hunks)
  • docs/components/backends/trtllm/README.md (0 hunks)
  • docs/components/backends/vllm/README.md (0 hunks)
  • docs/components/router/README.md (1 hunks)
  • docs/guides/dynamo_deploy/README.md (1 hunks)
  • docs/guides/dynamo_deploy/dynamo_cloud.md (1 hunks)
  • docs/guides/dynamo_deploy/dynamo_operator.md (2 hunks)
  • docs/guides/dynamo_deploy/gke_setup.md (1 hunks)
  • docs/guides/dynamo_deploy/grove.md (1 hunks)
  • docs/guides/dynamo_deploy/k8s_metrics.md (1 hunks)
  • docs/guides/dynamo_deploy/minikube.md (3 hunks)
  • docs/guides/dynamo_deploy/model_caching_with_fluid.md (1 hunks)
  • docs/guides/dynamo_deploy/multinode-deployment.md (3 hunks)
  • docs/guides/dynamo_deploy/quickstart.md (5 hunks)
  • examples/multimodal (1 hunks)

@athreesh athreesh deleted the docs-consolidation-only branch August 27, 2025 04:37