Skip to content

Conversation

@GuanLuo
Copy link
Contributor

@GuanLuo GuanLuo commented Jun 4, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Added a benchmark watcher tool with an HTTP readiness endpoint to monitor worker readiness during benchmarking.
    • Introduced a shell script to automate multi-node environment setup for benchmarking tasks.
  • Documentation

    • Updated the benchmarking guide with instructions on monitoring deployment readiness and using the new watcher service.
  • Chores

    • Updated configuration files to disable request logging for specific worker components.

Copy link
Contributor

@rmccorm4 rmccorm4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments but LGTM

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
examples/llm/benchmarks/benchmark_watcher.py (1)

130-133: Server thread should be daemonised (same feedback still unaddressed)

If the watcher is launched in a larger orchestration, a non-daemon thread will keep the process alive even after the main asyncio task finishes.

-        self._server_thread = threading.Thread(target=start_server, args=(self.server,))
+        self._server_thread = threading.Thread(
+            target=start_server,
+            args=(self.server,),
+            daemon=True,   # allow clean shutdown
+        )
🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (2)

177-186: Tighten wording & fix minor grammar issues in the new “Monitor Benchmark Startup Status” section

A few tiny fixes improve readability:

-When running dynamo deployment, you may have multiple instances of the same worker kind ...
-... which may result in inaccurate benchmark result at the beginning of the benchmark.
-... you may additionally deploy benchmark watcher to provide signal on whether the full deployment is ready.
-... the watcher will create a HTTP server on port 7001 by default
+When running a Dynamo deployment, you may have multiple instances of the same worker kind …
+… which may result in inaccurate benchmark results at the beginning of the run.
+… you can additionally deploy a benchmark watcher to signal when the full deployment is ready.
+… the watcher will create an HTTP server on port 7001 by default

This is purely cosmetic; feel free to skip if style guides differ.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)


202-226: Filename typo & markdown-lint violations

  1. The script name alternates between vllm_multinode_setup.sh (correct) and vllm_mutinode_setup.sh (typo).
  2. The bullet list (lines 205-213) uses -; elsewhere in the doc * is used – triggers MD004.
  3. The fenced block starting line 215 has no language spec – triggers MD040.
-  source vllm_mutinode_setup.sh
+  source vllm_multinode_setup.sh
@@
-  HEAD_NODE_IP=NODE_0_IP RAY_LEADER_NODE_IP=NODE_0_IP source vllm_mutinode_setup.sh
+  HEAD_NODE_IP=NODE_0_IP RAY_LEADER_NODE_IP=NODE_0_IP source vllm_multinode_setup.sh
@@
-  ```
+  ```bash

While optional, fixing these avoids broken copy-pastes and keeps CI markdown-lint green.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~202-~202: You might be missing the article “the” here.
Context: ...1" 200 - ``` ## Utility for Setting Up Environment ### vLLM - vllm_multinode_setup.sh i...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~211-~211: You might be missing the article “the” here.
Context: ... otherwise start the Ray cluster with current node as the head node. - print the co...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 markdownlint-cli2 (0.17.2)

205-205: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


208-208: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


210-210: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


212-212: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


215-215: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

examples/llm/benchmarks/benchmark_watcher.py (1)

1-1: Remove blanket # type: ignore directive

Disabling mypy for the whole file hides real type errors (e.g., misuse of await, wrong return types). Delete the line and add targeted # type: ignore[...] only where necessary.

-# type: ignore  # Ignore all mypy errors in this file
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 413a166 and 843da74.

📒 Files selected for processing (2)
  • examples/llm/benchmarks/README.md (1 hunks)
  • examples/llm/benchmarks/benchmark_watcher.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md

[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)


[uncategorized] ~202-~202: You might be missing the article “the” here.
Context: ...1" 200 - ``` ## Utility for Setting Up Environment ### vLLM - vllm_multinode_setup.sh i...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~211-~211: You might be missing the article “the” here.
Context: ... otherwise start the Ray cluster with current node as the head node. - print the co...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md

205-205: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


208-208: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


210-210: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


212-212: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


215-215: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🪛 Pylint (3.3.7)
examples/llm/benchmarks/benchmark_watcher.py

[refactor] 46-46: Useless return at end of function or method

(R1711)


[refactor] 98-98: Consider using 'sys.exit' instead

(R1722)


[refactor] 110-110: Too few public methods (1/2)

(R0903)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
examples/llm/benchmarks/benchmark_watcher.py (2)

1-1: Blanket # type: ignore still present

Previous review asked to remove the file-wide mypy suppression; it remains unchanged, hiding all type issues. Replace with targeted # type: ignore[...] only where inevitable.


130-133: ⚠️ Potential issue

Run HTTP server thread as daemon (or manage lifecycle)

The thread is non-daemon, so if the main asyncio task exits (e.g., timeout error) the process may hang. At minimum:

- self._server_thread = threading.Thread(target=start_server, args=(self.server,))
+ self._server_thread = threading.Thread(
+     target=start_server,
+     args=(self.server,),
+     daemon=True,
+ )

Alternatively expose a shutdown path and join the thread on service stop.

🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (2)

179-186: Tighten wording & fix article usage

Minor language issues reduce clarity:

  1. inaccurate benchmark result → should be plural “results”.
  2. a HTTP server → use “an HTTP server”.
-... may result in inaccurate benchmark result ...
-... the watcher will create a HTTP server ...
+... may result in inaccurate benchmark results ...
+... the watcher will create an HTTP server ...
🧰 Tools
🪛 LanguageTool

[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)


205-212: Align unordered-list style with Markdown guidelines

The rest of the document uses asterisks for bullet lists; this new section switches to dashes, triggering MD004. Consistency aids readability and keeps markdown-lint quiet.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

205-205: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


208-208: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


210-210: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


212-212: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

examples/llm/benchmarks/benchmark_watcher.py (1)

45-55: Silence default HTTP request logging

BaseHTTPRequestHandler logs every request to stderr which clutters benchmark output. Override log_message to no-op or hook into logger.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 46-46: Useless return at end of function or method

(R1711)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 843da74 and 1d01a3b.

📒 Files selected for processing (2)
  • examples/llm/benchmarks/README.md (1 hunks)
  • examples/llm/benchmarks/benchmark_watcher.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md

[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)

🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md

205-205: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


208-208: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


210-210: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


212-212: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

🪛 Pylint (3.3.7)
examples/llm/benchmarks/benchmark_watcher.py

[refactor] 46-46: Useless return at end of function or method

(R1711)


[refactor] 109-109: Too few public methods (1/2)

(R0903)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (3)

340-349: Fix grammar and article usage in the new section.

Proposed diff to address missing articles, pluralization, and the “HTTP” article:

- When running dynamo deployment, you may have multiple instances
+ When running a Dynamo deployment, you may have multiple instances

- ...as soon as dynamo is responsive to inference request...
+ ...as soon as Dynamo is responsive to an inference request...

- ...which may result in inaccurate benchmark result at the beginning...
+ ...which may result in inaccurate benchmark results at the beginning...

- ...you may additionally deploy benchmark watcher to provide signal...
+ ...you may additionally deploy the benchmark watcher to provide a signal...

- ...the watcher will create a HTTP server on port 7001...
+ ...the watcher will create an HTTP server on port 7001...
🧰 Tools
🪛 LanguageTool

[uncategorized] ~342-~342: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~344-~344: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~344-~344: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~345-~345: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~347-~347: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)


368-376: Unify markdown list style and correct pluralization.

Switch dash lists to asterisk style and update “environment variable” to plural:

- Depending on whether environment variable `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` are set
+ Depending on whether *environment variables* `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` are set

-   - start nats server and etcd on the current node if `HEAD_NODE_IP` is not set, otherwise
-   - set the environment variables as expected by dynamo.
-   - run Ray and connect to the Ray cluster started by `RAY_LEADER_NODE_IP`, otherwise start
-   - print the command with `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` set, which can be used in
+   * start NATS server and etcd on the current node if `HEAD_NODE_IP` is not set, otherwise
+   * set the environment variables as expected by Dynamo.
+   * run Ray and connect to the Ray cluster started by `RAY_LEADER_NODE_IP`, otherwise start
+   * print the command with `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` set, which can be used in
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

368-368: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


371-371: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


373-373: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


375-375: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


378-389: Clarify script invocation and prerequisites.

Recommend explicitly referencing the script’s relative path and ensuring it’s executable for reliable reuse:

> **Note:** Make sure you run from the examples directory or prefix the script:
> ```bash
> source ./vllm_multinode_setup.sh
> ```
> You can also mark it executable:
> ```bash
> chmod +x vllm_multinode_setup.sh
> ```
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d01a3b and e026ae2.

📒 Files selected for processing (1)
  • examples/llm/benchmarks/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md

[uncategorized] ~342-~342: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~344-~344: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~344-~344: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[uncategorized] ~345-~345: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...

(AI_HYDRA_LEO_MISSING_A)


[misspelling] ~347-~347: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...

(EN_A_VS_AN)

🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md

368-368: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


371-371: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


373-373: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


375-375: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Mirror Repository to GitLab
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
examples/llm/benchmarks/README.md (1)

350-363: Great addition: HTTP readiness snippet examples.

The curl examples for 400/200 HTTP responses directly address previous feedback on illustrating the readiness endpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants