-
Notifications
You must be signed in to change notification settings - Fork 688
chore: add utilities for benchmarking #1371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rmccorm4
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments but LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
examples/llm/benchmarks/benchmark_watcher.py (1)
130-133: Server thread should be daemonised (same feedback still unaddressed)If the watcher is launched in a larger orchestration, a non-daemon thread will keep the process alive even after the main asyncio task finishes.
- self._server_thread = threading.Thread(target=start_server, args=(self.server,)) + self._server_thread = threading.Thread( + target=start_server, + args=(self.server,), + daemon=True, # allow clean shutdown + )
🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (2)
177-186: Tighten wording & fix minor grammar issues in the new “Monitor Benchmark Startup Status” sectionA few tiny fixes improve readability:
-When running dynamo deployment, you may have multiple instances of the same worker kind ... -... which may result in inaccurate benchmark result at the beginning of the benchmark. -... you may additionally deploy benchmark watcher to provide signal on whether the full deployment is ready. -... the watcher will create a HTTP server on port 7001 by default +When running a Dynamo deployment, you may have multiple instances of the same worker kind … +… which may result in inaccurate benchmark results at the beginning of the run. +… you can additionally deploy a benchmark watcher to signal when the full deployment is ready. +… the watcher will create an HTTP server on port 7001 by defaultThis is purely cosmetic; feel free to skip if style guides differ.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...(EN_A_VS_AN)
202-226: Filename typo & markdown-lint violations
- The script name alternates between
vllm_multinode_setup.sh(correct) andvllm_mutinode_setup.sh(typo).- The bullet list (lines 205-213) uses
-; elsewhere in the doc*is used – triggers MD004.- The fenced block starting line 215 has no language spec – triggers MD040.
- source vllm_mutinode_setup.sh + source vllm_multinode_setup.sh @@ - HEAD_NODE_IP=NODE_0_IP RAY_LEADER_NODE_IP=NODE_0_IP source vllm_mutinode_setup.sh + HEAD_NODE_IP=NODE_0_IP RAY_LEADER_NODE_IP=NODE_0_IP source vllm_multinode_setup.sh @@ - ``` + ```bashWhile optional, fixing these avoids broken copy-pastes and keeps CI markdown-lint green.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~202-~202: You might be missing the article “the” here.
Context: ...1" 200 - ``` ## Utility for Setting Up Environment ### vLLM -vllm_multinode_setup.shi...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~211-~211: You might be missing the article “the” here.
Context: ... otherwise start the Ray cluster with current node as the head node. - print the co...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 markdownlint-cli2 (0.17.2)
205-205: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
208-208: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
210-210: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
212-212: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
215-215: Fenced code blocks should have a language specified
null(MD040, fenced-code-language)
examples/llm/benchmarks/benchmark_watcher.py (1)
1-1: Remove blanket# type: ignoredirectiveDisabling mypy for the whole file hides real type errors (e.g., misuse of
await, wrong return types). Delete the line and add targeted# type: ignore[...]only where necessary.-# type: ignore # Ignore all mypy errors in this file
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
examples/llm/benchmarks/README.md(1 hunks)examples/llm/benchmarks/benchmark_watcher.py(1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md
[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...
(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...
(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...
(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...
(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...
(EN_A_VS_AN)
[uncategorized] ~202-~202: You might be missing the article “the” here.
Context: ...1" 200 - ``` ## Utility for Setting Up Environment ### vLLM - vllm_multinode_setup.sh i...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~211-~211: You might be missing the article “the” here.
Context: ... otherwise start the Ray cluster with current node as the head node. - print the co...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md
205-205: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
208-208: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
210-210: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
212-212: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
215-215: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
🪛 Pylint (3.3.7)
examples/llm/benchmarks/benchmark_watcher.py
[refactor] 46-46: Useless return at end of function or method
(R1711)
[refactor] 98-98: Consider using 'sys.exit' instead
(R1722)
[refactor] 110-110: Too few public methods (1/2)
(R0903)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
examples/llm/benchmarks/benchmark_watcher.py (2)
1-1: Blanket# type: ignorestill presentPrevious review asked to remove the file-wide mypy suppression; it remains unchanged, hiding all type issues. Replace with targeted
# type: ignore[...]only where inevitable.
130-133:⚠️ Potential issueRun HTTP server thread as daemon (or manage lifecycle)
The thread is non-daemon, so if the main asyncio task exits (e.g., timeout error) the process may hang. At minimum:
- self._server_thread = threading.Thread(target=start_server, args=(self.server,)) + self._server_thread = threading.Thread( + target=start_server, + args=(self.server,), + daemon=True, + )Alternatively expose a shutdown path and join the thread on service stop.
🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (2)
179-186: Tighten wording & fix article usageMinor language issues reduce clarity:
inaccurate benchmark result→ should be plural “results”.a HTTP server→ use “an HTTP server”.-... may result in inaccurate benchmark result ... -... the watcher will create a HTTP server ... +... may result in inaccurate benchmark results ... +... the watcher will create an HTTP server ...🧰 Tools
🪛 LanguageTool
[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...(EN_A_VS_AN)
205-212: Align unordered-list style with Markdown guidelinesThe rest of the document uses asterisks for bullet lists; this new section switches to dashes, triggering MD004. Consistency aids readability and keeps markdown-lint quiet.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
205-205: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
208-208: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
210-210: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
212-212: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
examples/llm/benchmarks/benchmark_watcher.py (1)
45-55: Silence default HTTP request logging
BaseHTTPRequestHandlerlogs every request to stderr which clutters benchmark output. Overridelog_messageto no-op or hook intologger.🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 46-46: Useless return at end of function or method
(R1711)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
examples/llm/benchmarks/README.md(1 hunks)examples/llm/benchmarks/benchmark_watcher.py(1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md
[uncategorized] ~179-~179: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...
(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~181-~181: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...
(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~181-~181: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...
(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~182-~182: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...
(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~184-~184: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...
(EN_A_VS_AN)
🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md
205-205: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
208-208: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
210-210: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
212-212: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
🪛 Pylint (3.3.7)
examples/llm/benchmarks/benchmark_watcher.py
[refactor] 46-46: Useless return at end of function or method
(R1711)
[refactor] 109-109: Too few public methods (1/2)
(R0903)
Signed-off-by: GuanLuo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
examples/llm/benchmarks/README.md (3)
340-349: Fix grammar and article usage in the new section.Proposed diff to address missing articles, pluralization, and the “HTTP” article:
- When running dynamo deployment, you may have multiple instances + When running a Dynamo deployment, you may have multiple instances - ...as soon as dynamo is responsive to inference request... + ...as soon as Dynamo is responsive to an inference request... - ...which may result in inaccurate benchmark result at the beginning... + ...which may result in inaccurate benchmark results at the beginning... - ...you may additionally deploy benchmark watcher to provide signal... + ...you may additionally deploy the benchmark watcher to provide a signal... - ...the watcher will create a HTTP server on port 7001... + ...the watcher will create an HTTP server on port 7001...🧰 Tools
🪛 LanguageTool
[uncategorized] ~342-~342: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~344-~344: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~344-~344: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~345-~345: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~347-~347: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...(EN_A_VS_AN)
368-376: Unify markdown list style and correct pluralization.Switch dash lists to asterisk style and update “environment variable” to plural:
- Depending on whether environment variable `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` are set + Depending on whether *environment variables* `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` are set - - start nats server and etcd on the current node if `HEAD_NODE_IP` is not set, otherwise - - set the environment variables as expected by dynamo. - - run Ray and connect to the Ray cluster started by `RAY_LEADER_NODE_IP`, otherwise start - - print the command with `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` set, which can be used in + * start NATS server and etcd on the current node if `HEAD_NODE_IP` is not set, otherwise + * set the environment variables as expected by Dynamo. + * run Ray and connect to the Ray cluster started by `RAY_LEADER_NODE_IP`, otherwise start + * print the command with `HEAD_NODE_IP` and `RAY_LEADER_NODE_IP` set, which can be used in🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
368-368: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
371-371: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
373-373: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
375-375: Unordered list style
Expected: asterisk; Actual: dash(MD004, ul-style)
378-389: Clarify script invocation and prerequisites.Recommend explicitly referencing the script’s relative path and ensuring it’s executable for reliable reuse:
> **Note:** Make sure you run from the examples directory or prefix the script: > ```bash > source ./vllm_multinode_setup.sh > ``` > You can also mark it executable: > ```bash > chmod +x vllm_multinode_setup.sh > ```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/llm/benchmarks/README.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/llm/benchmarks/README.md
[uncategorized] ~342-~342: Possible missing article found.
Context: ... Benchmark Startup Status When running dynamo deployment, you may have multiple insta...
(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~344-~344: Possible missing article found.
Context: ... run as soon as dynamo is responsive to inference request, which may result in inaccurate...
(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~344-~344: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...hich may result in inaccurate benchmark result at the beginning of the benchmark. In s...
(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~345-~345: Possible missing article found.
Context: ...lly deploy benchmark watcher to provide signal on whether the full deployment is ready...
(AI_HYDRA_LEO_MISSING_A)
[misspelling] ~347-~347: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...dition to that, the watcher will create a HTTP server on port 7001 by default, wh...
(EN_A_VS_AN)
🪛 markdownlint-cli2 (0.17.2)
examples/llm/benchmarks/README.md
368-368: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
371-371: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
373-373: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
375-375: Unordered list style
Expected: asterisk; Actual: dash
(MD004, ul-style)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Mirror Repository to GitLab
- GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
examples/llm/benchmarks/README.md (1)
350-363: Great addition: HTTP readiness snippet examples.The curl examples for 400/200 HTTP responses directly address previous feedback on illustrating the readiness endpoint.
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Documentation
Chores