-
Notifications
You must be signed in to change notification settings - Fork 693
fix: move docker-compose.yml to deploy/, and update frontend port #2121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: move docker-compose.yml to deploy/, and update frontend port #2121
Conversation
WalkthroughThis change updates documentation and example commands across multiple files to standardize the HTTP port from 8000 to 8080 and to reference Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Docs
participant Docker
participant Frontend
User->>Docs: Reads example commands
User->>Docker: Runs docker compose -f deploy/docker-compose.yml up -d
Docker->>Frontend: Starts services (now on port 8080)
User->>Frontend: Sends HTTP request to localhost:8080
Frontend-->>User: Returns response
Estimated code review effort🎯 2 (Simple) | ⏱️ ~7 minutes Possibly related PRs
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🔭 Outside diff range comments (2)
deploy/metrics/README.md (1)
79-82:Required Fileslist is outdated after moving the compose file
docker-compose.ymlno longer lives in this directory, so the first bullet misleads users. Either drop the entry or point to../docker-compose.yml.-- [docker-compose.yml](./docker-compose.yml): Defines the Prometheus and Grafana services +- The main Docker Compose file was moved to `deploy/docker-compose.yml`; update links or remove this line.examples/multimodal/README.md (1)
288-292: Left-over8000breaks copy-paste testingThe video-serving example still points to the old port, undermining the global switch to 8080.
-curl -X 'POST' 'http://localhost:8000/v1/chat/completions' -H 'Content-Type: application/json' -d '{ +curl -X 'POST' 'http://localhost:8080/v1/chat/completions' -H 'Content-Type: application/json' -d '{Please scan the remainder of the file (and sibling READMEs) for similar remnants.
🧹 Nitpick comments (6)
lib/runtime/examples/system_metrics/README.md (1)
21-21: Port change: verify that 8081 is free and documented project-wide
SwitchingDYN_SYSTEM_PORTto 8081 is fine, but other docs still reference 8000/8080 for various services. Make sure the clash matrix is clear so users don’t accidentally bind multiple services to the same port.README.md (1)
125-125: Synchronise sample request port with startup command
The curl example now targets8080, but the earlier “Running an LLM API server” section still showspython -m dynamo.frontend [--http-port 8080]in brackets (optional). To avoid confusion, consider removing the brackets or stating explicitly that 8080 is the default.docs/guides/dynamo_deploy/create_deployment.md (1)
91-91: Format the launch command as a fenced code blockPresenting the CLI invocation inside a
bash-fenced block improves readability and keeps the style consistent with the rest of the guide.-The front end is launched with "python3 -m dynamo.frontend [--http-port 8080] [--router-mode kv]" +Launch the Frontend with: + +```bash +python3 -m dynamo.frontend --http-port 8080 --router-mode kv +```docs/runtime/README.md (1)
47-53: Add a language hint to the fenced code block
markdownlint(MD040) flags this block because the opening fence lacks a language specifier. Usingbashalso aligns with earlier examples in the same file.-``` +```bash # At the root of the repository: docker compose -f deploy/docker-compose.yml up -d</blockquote></details> <details> <summary>components/backends/vllm/multi-node.md (1)</summary><blockquote> `24-26`: **Explicitly state repository root before running Docker Compose** Readers may execute the guide from an arbitrary working directory on the head node. A one-liner avoids confusion: ```diff -# On head node (node-1) -docker compose -f deploy/docker-compose.yml up -d +# On head node (node-1) – run from the repository root +docker compose -f deploy/docker-compose.yml up -ddeploy/sdk/README.md (1)
100-115: Clarify execution directory & double-check port consistency
The compose file path (
deploy/docker-compose.yml) assumes the user runs the command from repo root, while all preceding examples are executed in place. Add an explicitcdor a note to avoid “file not found” confusion.You updated the example request to port 8080 (👍). Please grep the rest of this README for lingering
8000references to avoid mixed instructions.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
README.md(1 hunks)components/backends/sglang/README.md(1 hunks)components/backends/trtllm/README.md(1 hunks)components/backends/vllm/README.md(1 hunks)components/backends/vllm/multi-node.md(1 hunks)components/metrics/README.md(1 hunks)deploy/docker-compose.yml(2 hunks)deploy/metrics/README.md(2 hunks)deploy/sdk/README.md(2 hunks)docs/architecture/dynamo_flow.md(2 hunks)docs/examples/README.md(1 hunks)docs/guides/dynamo_deploy/create_deployment.md(1 hunks)docs/guides/planner_benchmark/README.md(2 hunks)docs/runtime/README.md(1 hunks)examples/multimodal/README.md(4 hunks)examples/router_standalone/README.md(1 hunks)examples/runtime/hello_world/README.md(1 hunks)lib/runtime/examples/system_metrics/README.md(3 hunks)
🧰 Additional context used
🧠 Learnings (6)
docs/runtime/README.md (1)
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
components/backends/vllm/multi-node.md (1)
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
docs/guides/dynamo_deploy/create_deployment.md (2)
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
Learnt from: nnshah1
PR: #1444
File: tests/fault_tolerance/utils/metrics.py:30-32
Timestamp: 2025-07-01T13:55:03.940Z
Learning: The @dynamo_worker() decorator in the dynamo codebase returns a wrapper that automatically injects the runtime parameter before calling the wrapped function. This means callers only need to provide the non-runtime parameters, while the decorator handles injecting the runtime argument automatically. For example, a function with signature async def get_metrics(runtime, log_dir) decorated with @dynamo_worker() can be called as get_metrics(log_dir) because the decorator wrapper injects the runtime parameter.
components/backends/sglang/README.md (2)
Learnt from: fsaady
PR: #1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:230-244
Timestamp: 2025-07-03T10:14:30.570Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, background processes (like nats-server, etcd) are intentionally left running even if later processes fail. This design choice allows users to manually connect to nodes and debug issues without having to restart the entire SLURM job from scratch, providing operational flexibility for troubleshooting in cluster environments.
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
components/backends/vllm/README.md (1)
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
deploy/sdk/README.md (2)
Learnt from: julienmancuso
PR: #2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.
Learnt from: GuanLuo
PR: #1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
🪛 markdownlint-cli2 (0.17.2)
docs/runtime/README.md
49-49: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
🔇 Additional comments (11)
deploy/docker-compose.yml (2)
126-127: Grafana provisioning paths updated—mirror the change in lib/runtime
The volume mounts now point to./metrics/*. Remember that the comment at line 16 says this file must stay “in sync with lib/runtime/docker-compose.yml”. Double-check that the same path update was applied there to prevent one compose stack from shipping stale dashboards.
95-95: Incorrect assumption about relative paths in Compose volumes
Docker Compose always resolves relative host paths against the directory containing the Compose file. In our case,./metrics/prometheus.ymlis interpreted as
<repo-root>/deploy/metrics/prometheus.yml
regardless of where you invokedocker compose upor how you pass-f deploy/docker-compose.yml. No changes are needed here.Likely an incorrect or invalid review comment.
lib/runtime/examples/system_metrics/README.md (2)
34-34: LGTM – curl updated correctly
The example curl command matches the new port.
69-69: Defaults Table Updated Successfully
Verified that there are no remainingDYN_SYSTEM_PORT=8000occurrences inlib/runtime/examples/system_metrics. The defaults table is now fully in sync with the environment-variable documentation.docs/examples/README.md (1)
70-70: Port-forward documentation updated—👍
The service now forwards 8080→8080 which aligns with the global port shift.examples/runtime/hello_world/README.md (1)
63-65: Path change looks goodThe command now matches the consolidated compose location; no further action required.
docs/guides/planner_benchmark/README.md (1)
49-50: Confirm endpoint actually runs on 8080Docs now list
--url http://localhost:8080.
Before merging, verify that the default frontend insideexamples/llmreally binds to 8080 in all quick-start scripts; otherwise users will hit connection errors.docs/architecture/dynamo_flow.md (1)
26-27: Mermaid node & narrative kept in sync – looks goodPort switch from 8000 → 8080 is consistently applied in both prose and diagram label.
No further action required.Also applies to: 87-88
components/backends/vllm/README.md (1)
128-130: Port-forward mismatchYou forward
8080:8000, implying the service inside cluster still
listens on 8000. After the global switch to 8080, this should probably be8080:8080.Please confirm and adjust either the manifest or the docs.
examples/multimodal/README.md (2)
76-80: Good update to 8080—but ensure whole doc is sweptPort change here aligns with the new default. Keep an eye on the later “video” sections; a few
8000URLs remain and will trip users.
227-228: Nice catch updating the port-forward commandKubernetes forwarding now matches the new 8080 frontend. Looks good.
Overview:
Move docker-compose.yml from deploy/metrics/... to deploy/... for clarity. Updated frontend port to use 8080.
Details:
The deploy/metrics/... path confused many people, so moved to deploy/... Also, after "dynamo serve" got moved and FE ports defaulted to 8080. Updated README.md.
Where should the reviewer start?
All README.md.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
DYN-759
Summary by CodeRabbit