@whoisj
Collaborator

@whoisj whoisj commented Jun 10, 2025

This change moves the examples/llm benchmarking code to benchmarks/llm.

Includes:

  • additions to the LLM benchmarking instructions covering additional models, model frameworks, etc.
  • instructions for using Prometheus + Grafana to visualize Dynamo worker metrics.
  • corrections and style changes to the README.

DIS-153
DIS-156
DIS-13

Summary by CodeRabbit

  • Documentation
    • Improved and reorganized the LLM benchmarking README for better clarity, formatting, and step-by-step instructions.
    • Added notes, tips, and clearer hardware requirements to aid users in setup and execution.
    • Introduced a new README file linking to the updated benchmarking documentation.

This change moves the examples/llm benchmarking code to benchmarks/llm.

Includes corrections and style changes to the README as well.
@copy-pr-bot

copy-pr-bot bot commented Jun 10, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi whoisj! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the external-contribution Pull request is from an external contributor label Jun 10, 2025
@whoisj whoisj changed the title DRAFT chore: Move Benchmarking to Top Level chore: Move Benchmarking to Top Level Jun 10, 2025
@github-actions github-actions bot added the chore label Jun 10, 2025
@whoisj whoisj removed the external-contribution Pull request is from an external contributor label Jun 10, 2025
@whoisj whoisj marked this pull request as draft June 10, 2025 20:27
@coderabbitai
Contributor

coderabbitai bot commented Jun 10, 2025

Walkthrough

A new README was added to the benchmarks/llm directory, containing a reference to another README. The referenced README at examples/llm/benchmarks/README.md was extensively reorganized and reformatted for clarity, with improved instructions, consistent formatting, added notes and tips, and expanded guidance, but no functional changes to commands or logic.

Changes

| File(s) | Change Summary |
| --- | --- |
| benchmarks/llm/README.md | Added new README with Apache 2.0 license header and a single markdown link referencing another README. |
| examples/llm/benchmarks/README.md | Reorganized, reformatted, and clarified documentation; added hardware notes, tips, numbered steps, and expanded guidance; corrected paths; improved formatting and consistency. |

Poem

In README burrows, neat and bright,
The docs got fluffed and set just right.
With tips and notes and steps anew,
The path for benchmarks now feels true.
A rabbit hops with pride today—
Clear docs make work a hop away!
🐇📚✨


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1c3ad3 and 4384519.

📒 Files selected for processing (1)
  • benchmarks/llm/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • benchmarks/llm/README.md
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
benchmarks/llm/README.md (1)

1-2: ⚠️ Potential issue

Add proper Markdown link and context: The README currently lists only a raw path. Replace it with a descriptive title and a Markdown link, for example:

- ../../examples/llm/benchmarks/README.md
+ # LLM Benchmarking Guide
+ See the [LLM Deployment Benchmarking Guide](../../examples/llm/benchmarks/README.md).
♻️ Duplicate comments (2)
examples/llm/benchmarks/README.md (2)

102-102: Duplicate: Standardize callout casing


157-157: Duplicate: Standardize callout casing

🧹 Nitpick comments (4)
examples/llm/benchmarks/README.md (4)

67-67: Nit: Standardize callout casing: The file mixes [!NOTE] and [!Important]. For consistency, consider using uppercase for all admonitions (e.g., [!IMPORTANT], [!NOTE], [!TIP]).


123-128: Typo in step title: Change “Config NATS and ETCD” to “Configure NATS and ETCD” for grammatical correctness.


153-153: Nit: Preposition consistency: Consider using “shown in the [Collecting Performance Numbers]” rather than “shown on” for a more natural phrasing.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~153-~153: The preposition “in” seems more likely in this position than the preposition “on”.
Context: ...ollect the performance numbers as shown on the [Collecting Performance Numbers](#c...

(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_ON_IN)


195-195: Nit: Missing article: Update “Use NGINX as load balancer” to “Use NGINX as a load balancer” for grammatical accuracy.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~195-~195: You might be missing the article “a” here.
Context: ...ve` instance per node. 3. Use NGINX as load balancer ```bash apt update &&...

(AI_EN_LECTOR_MISSING_DETERMINER_A)
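
The callout-casing nitpick above (mixed [!NOTE] and [!Important]) could be applied mechanically rather than by hand. A minimal sketch in Python, assuming the README uses GitHub-style alert markers of the form `> [!Word]` at the start of a blockquote line; the helper name `normalize_callouts` is hypothetical and not part of this PR:

```python
import re

# Matches a blockquote line that opens with an alert marker, e.g. "> [!Tip]".
# Only spaces/tabs are allowed around ">" so the match never spans lines.
_CALLOUT = re.compile(r"^([ \t]*>[ \t]*)\[!([A-Za-z]+)\]", re.MULTILINE)


def normalize_callouts(markdown: str) -> str:
    """Uppercase the keyword of every alert marker; leave all other text as-is."""
    return _CALLOUT.sub(
        lambda m: f"{m.group(1)}[!{m.group(2).upper()}]", markdown
    )


sample = (
    "> [!Important]\n"
    "> Two 8xH100-80GB nodes are required for the following instructions.\n"
    "\n"
    "> [!Tip]\n"
    "> See the GenAI-Perf tutorial.\n"
)
print(normalize_callouts(sample))
```

Markers that appear mid-sentence (not at the start of a blockquote line) are deliberately left alone, since they are not rendered as alerts.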

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8e728b and 9f16d04.

📒 Files selected for processing (2)
  • benchmarks/llm/README.md (1 hunks)
  • examples/llm/benchmarks/README.md (3 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md

[uncategorized] ~98-~98: The preposition “in” seems more likely in this position than the preposition “on”.
Context: ...ollect the performance numbers as shown on the [Collecting Performance Numbers](#c...

(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_ON_IN)


[uncategorized] ~153-~153: The preposition “in” seems more likely in this position than the preposition “on”.
Context: ...ollect the performance numbers as shown on the [Collecting Performance Numbers](#c...

(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_ON_IN)


[uncategorized] ~195-~195: You might be missing the article “a” here.
Context: ...ve` instance per node. 3. Use NGINX as load balancer ```bash apt update &&...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~206-~206: The preposition “in” seems more likely in this position than the preposition “on”.
Context: ...ollect the performance numbers as shown on the [Collecting Performance Numbers](#c...

(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_ON_IN)


[style] ~216-~216: Using many exclamation marks might seem excessive (in this case: 16 exclamation marks for a text that’s 5031 characters long)
Context: ...orkspace/benchmarks/llm/perf.sh ``` > [!Tip] > See [GenAI-Perf tutorial](https:/...

(EN_EXCESSIVE_EXCLAMATION)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (11)
examples/llm/benchmarks/README.md (11)

27-33: LGTM: The hardware callout and initial build step are clear and properly formatted.


37-39: LGTM: The model download step is well-documented with the correct Hugging Face command.


43-45: LGTM: NATS and ETCD startup instructions are concise and correctly formatted.


79-83: LGTM: The single-node container run step is clear with proper code block formatting.


114-118: LGTM: The multi-node container run command is accurate and well-structured.


133-138: LGTM: Node 0 worker startup is clearly documented with proper logging guidance.


143-148: LGTM: Node 1 worker startup is clearly documented with proper logging guidance.


161-165: LGTM: The vLLM aggregated baseline container startup step is clear and correctly formatted.


171-189: LGTM: The vLLM serve commands are consolidated in a single code block with clear GPU assignments and log redirection.


197-201: LGTM: The NGINX installation and config steps are clear and correctly reference the moved nginx.conf path.


212-214: LGTM: The benchmarking script invocation has been updated to the new path under /workspace/benchmarks/llm.

This change adds instructions for deploying additional models, model frameworks, etc. to the LLM benchmarking instructions.
@whoisj whoisj requested a review from kthui June 11, 2025 22:19
@whoisj whoisj added the documentation Improvements or additions to documentation label Jun 11, 2025
@whoisj whoisj added the docs label Jun 11, 2025
@whoisj whoisj marked this pull request as ready for review June 11, 2025 22:20
@github-actions github-actions bot removed the docs label Jun 11, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (7)
examples/llm/benchmarks/README.md (7)

101-101: Use ‘in’ instead of ‘on’ for section reference
Change to “…as shown in the [Collecting Performance Numbers] section below.”


106-108: Add missing preposition ‘for’ in requirement statement

- > Two 8xH100-80GB nodes are required the following instructions.
+ > Two 8xH100-80GB nodes are required for the following instructions.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~107-~107: Possible missing comma found.
Context: ...!Important] > Two 8xH100-80GB nodes are required the following instructions. In the fol...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~107-~107: Possible missing preposition found.
Context: ...t] > Two 8xH100-80GB nodes are required the following instructions. In the followi...

(AI_HYDRA_LEO_MISSING_IN)


158-158: Use ‘in’ instead of ‘on’ for section reference
Update to “…as shown in the Collecting Performance Numbers section above.”

🧰 Tools
🪛 LanguageTool

[uncategorized] ~158-~158: The preposition “in” seems more likely in this position than the preposition “on”.
Context: ...ollect the performance numbers as shown on the [Collecting Performance Numbers](#c...

(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_ON_IN)


164-165: Add missing preposition ‘for’ here as well

- > One (or two) 8xH100-80GB nodes are required the following instructions.
+ > One (or two) 8xH100-80GB nodes are required for the following instructions.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~164-~164: Possible missing comma found.
Context: ...t] > One (or two) 8xH100-80GB nodes are required the following instructions. With the D...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~164-~164: Possible missing preposition found.
Context: ...(or two) 8xH100-80GB nodes are required the following instructions. With the Dynam...

(AI_HYDRA_LEO_MISSING_IN)


203-203: Add article for grammatical correctness

- ## Use NGINX as load balancer
+ ## Use NGINX as a load balancer
🧰 Tools
🪛 LanguageTool

[uncategorized] ~203-~203: You might be missing the article “a” here.
Context: ...e` instance per node. 3. Use NGINX as load balancer ```bash apt update &&...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


242-246: Align list style with repository conventions
Convert dashes to asterisks for unordered lists:

- - [Dynamo Multinode Deployments](../../../docs/examples/multinode.md)
- - [Dynamo TensorRT LLM Deployments](../../../docs/examples/trtllm.md)
- - [Aggregated Deployment of Very Large Models](../../../docs/examples/multinode.md#aggregated-deployment)
- - [Dynamo vLLM Deployments](../../../docs/examples/llm_deployment.md)
+ * [Dynamo Multinode Deployments](../../../docs/examples/multinode.md)
+ * [Dynamo TensorRT LLM Deployments](../../../docs/examples/trtllm.md)
+ * [Aggregated Deployment of Very Large Models](../../../docs/examples/multinode.md#aggregated-deployment)
+ * [Dynamo vLLM Deployments](../../../docs/examples/llm_deployment.md)
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

242-242: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


243-243: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


244-244: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


244-244: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


245-245: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


238-238: Add comma for clarity

- so long as an accessible endpoint is available for it to interact with.
+ so long as an accessible endpoint is available, for it to interact with.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~238-~238: Possible missing comma found.
Context: ...f tool will report the same metrics and measurements so long as an accessible endpoint is av...

(AI_HYDRA_LEO_MISSING_COMMA)
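
The MD004 list-style nitpick above (dash to asterisk) can likewise be scripted. A rough sketch in Python, assuming list items start with `- ` after optional indentation and that fenced code blocks are delimited by triple backticks; the helper name `dashes_to_asterisks` is hypothetical, and the sketch does not handle edge cases such as `- - -` thematic breaks:

```python
import re

# A dash list item: optional indentation, a dash, then a space.
_DASH_ITEM = re.compile(r"^(\s*)- ")


def dashes_to_asterisks(markdown: str) -> str:
    """Rewrite '- item' as '* item' everywhere except inside fenced code blocks."""
    out, in_fence = [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on opening/closing fence
        elif not in_fence:
            line = _DASH_ITEM.sub(r"\1* ", line)
        out.append(line)
    return "\n".join(out)


sample = (
    "- [Dynamo Multinode Deployments](../../../docs/examples/multinode.md)\n"
    "- [Dynamo vLLM Deployments](../../../docs/examples/llm_deployment.md)"
)
print(dashes_to_asterisks(sample))
```

Skipping fenced blocks matters here because the README's shell snippets may contain lines that begin with a dash and must not be rewritten.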

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f16d04 and b1c3ad3.

📒 Files selected for processing (1)
  • examples/llm/benchmarks/README.md (3 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md

[style] ~225-~225: Using many exclamation marks might seem excessive (in this case: 16 exclamation marks for a text that’s 6328 characters long)
Context: ...orkspace/benchmarks/llm/perf.sh ``` > [!Tip] > See [GenAI-Perf tutorial](https:/...

(EN_EXCESSIVE_EXCLAMATION)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (7)
examples/llm/benchmarks/README.md (7)

25-43: Well-structured Prerequisites section
The heading, important callout, numbered steps, and fenced code blocks are clear and consistent.


49-65: Hardware configuration details are clear
The blockquote format and bold labels for GPUs, CPU, NVLink, and InfiniBand make requirements easy to scan.


67-101: Disaggregated Single Node instructions are clear
The important callout, step numbering, tips, and code examples enhance readability and usability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~72-~72: Possible missing comma found.
Context: ...llowing instructions. In the following setup we compare Dynamo disaggregated vLLM pe...

(AI_HYDRA_LEO_MISSING_COMMA)


[typographical] ~75-~75: Consider adding a comma here.
Context: ... (ms). For more details on your use case please see the [Performance Tuning Guide](/doc...

(PLEASE_COMMA)


161-199: Aggregated baseline benchmarking section is well-organized
The numbered steps, code blocks, tips, and notes are consistently formatted.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~164-~164: Possible missing comma found.
Context: ...t] > One (or two) 8xH100-80GB nodes are required the following instructions. With the D...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~164-~164: Possible missing preposition found.
Context: ...(or two) 8xH100-80GB nodes are required the following instructions. With the Dynam...

(AI_HYDRA_LEO_MISSING_IN)


217-225: Collecting Performance Numbers instructions are clear
Fenced script invocation and contextual tip are spot-on.

🧰 Tools
🪛 LanguageTool

[style] ~225-~225: Using many exclamation marks might seem excessive (in this case: 16 exclamation marks for a text that’s 6328 characters long)
Context: ...orkspace/benchmarks/llm/perf.sh ``` > [!Tip] > See [GenAI-Perf tutorial](https:/...

(EN_EXCESSIVE_EXCLAMATION)


231-239: Supporting Additional Models section is concise and helpful
The guidance on reuse and adaptation is well-phrased.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~238-~238: Possible missing comma found.
Context: ...f tool will report the same metrics and measurements so long as an accessible endpoint is av...

(AI_HYDRA_LEO_MISSING_COMMA)


249-252: Metrics and Visualization section is concise
The link to Prometheus/Grafana guide is clear and correctly referenced.

Contributor

@kthui kthui left a comment

You may also want @tanmayv25's review on the README.md, because I think he is working on verifying the instructions and adding a result interpretation section at the bottom.

Co-authored-by: Jacky <[email protected]>
Signed-off-by: Tanmay Verma <[email protected]>
@whoisj whoisj merged commit 2ae9ab9 into ai-dynamo:main Jun 12, 2025
8 checks passed
@whoisj whoisj deleted the whoisj/benchmarks/relocate branch June 12, 2025 00:14
nealvaidya pushed a commit that referenced this pull request Jun 26, 2025
Signed-off-by: Tanmay Verma <[email protected]>
Co-authored-by: Tanmay Verma <[email protected]>
Co-authored-by: Jacky <[email protected]>
