@keivenchang keivenchang commented Sep 5, 2025

Overview:

This PR reverts Dockerfile.vllm and the build/run scripts to the build and run behavior that existed before the August 28 commit 82bae24, in order to preserve backward compatibility.

Details:

  • Split Dockerfile.vllm dev target into two distinct targets:
    • local-dev: For VS Code/Cursor Dev Container plugin use only
    • dev: For command-line development with run.sh script
  • Add comprehensive feature matrix comparing both development targets
  • Remove --uid/--gid options from build.sh (now handled by local-dev target)
  • Remove DEV_MODE logic from run.sh (simplified workspace mounting)
  • Consolidate ENV variables in both targets for better maintainability
  • Update build.sh to use local-dev target for UID/GID mapping
  • Maintain backward compatibility with existing workflows

Where should the reviewer start?

  • container/Dockerfile.vllm: Review the feature matrix and target separation
  • container/build.sh: Check removal of --uid/--gid options and UID/GID handling
  • container/run.sh: Verify simplified workspace mounting logic

Related Issues:

BUG-5501463

…havior

This reverts changes to maintain the same build and run behaviors as before
August 28 commit 82bae24.

Changes:
- Split Dockerfile.vllm dev target into local-dev (Dev Container) and dev (run.sh)
- Add comprehensive feature matrix comparing both development targets
- Remove --uid/--gid options from build.sh (now handled by local-dev target)
- Remove DEV_MODE logic from run.sh (simplified workspace mounting)
- Consolidate ENV variables in both targets for better maintainability
- Update build.sh to use local-dev target for UID/GID mapping

Signed-off-by: Keiven Chang <[email protected]>

coderabbitai bot commented Sep 5, 2025

Walkthrough

Introduces dual development targets in container/Dockerfile.vllm: local-dev (non-root, UID/GID-mapped) and dev (root). Updates container/build.sh to pass host UID/GID only for local-dev. Simplifies container/run.sh by removing --target and dev-mode, refactoring workspace-mount, HF cache, and privileged flag handling.

Changes

Cohort / File(s) | Summary
Dockerfile development targets (container/Dockerfile.vllm) | Renames prior dev stage to local-dev with non-root user, UID/GID args, workspace envs, copied Rust toolchains and venv, maturin, and extended PYTHONPATH; adds a new root-based dev stage with tooling and similar env; preserves runtime/entrypoint; documents target differences.
Build script target args (container/build.sh) | Removes --uid/--gid CLI flags and related help/validation; auto-injects USER_UID/USER_GID only when TARGET=local-dev; retains default flow for other targets.
Run script flow simplification (container/run.sh) | Drops --target and DEV_MODE; consolidates workspace-mount handling (adds /workspace, /tmp, /mnt, HF_TOKEN, interactive mode, defaults PRIVILEGED=TRUE); earlier HF_CACHE mount; clarifies PRIVILEGED_STRING logic; removes post-detection workspace block; updates help text.
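
The build.sh row above (USER_UID/USER_GID injected only for local-dev) can be illustrated with a minimal sketch. The function name uid_gid_build_args is hypothetical and is not taken from build.sh; only the USER_UID/USER_GID build args and the target names come from this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the actual build.sh code): emit UID/GID build args
# only when the requested target is local-dev.
uid_gid_build_args() {
    local target="$1"
    if [ "$target" = "local-dev" ]; then
        # The Dockerfile leaves USER_UID/USER_GID without defaults, so the
        # host values are passed explicitly.
        printf -- '--build-arg USER_UID=%s --build-arg USER_GID=%s\n' "$(id -u)" "$(id -g)"
    fi
    # Any other target: print nothing, so no extra build args are added.
}

# Show the command line each target would produce (echo only, nothing is built).
for target in local-dev dev; do
    echo "docker build --target ${target} $(uid_gid_build_args "${target}") -f container/Dockerfile.vllm ."
done
```

Keeping the injection behind a single target check means the dev target builds exactly as it did before the UID/GID mapping was introduced.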

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer
  participant BS as build.sh
  participant DK as Docker
  participant DF as Dockerfile.vllm
  participant IMG as Image

  Dev->>BS: ./container/build.sh --target local-dev
  BS->>BS: Append USER_UID/USER_GID
  BS->>DK: docker build --target local-dev --build-arg UID/GID
  DK->>DF: Build stage local-dev
  DF-->>IMG: Produce local-dev image (non-root)

  Dev->>BS: ./container/build.sh --target dev
  BS->>DK: docker build --target dev
  DK->>DF: Build stage dev
  DF-->>IMG: Produce dev image (root)
sequenceDiagram
  autonumber
  actor Dev as Developer
  participant RS as run.sh
  participant DK as Docker
  participant C as Container

  Dev->>RS: ./container/run.sh [--mount-workspace ...] [--hf-cache ...] [--privileged ...]
  RS->>RS: Parse options (no --target, no DEV_MODE)
  RS->>RS: If --mount-workspace: add volumes (/workspace,/tmp,/mnt), HF_TOKEN, force -it, default PRIVILEGED=TRUE
  RS->>RS: If HF_CACHE set: mount to /root/.cache/huggingface
  RS->>RS: PRIVILEGED_STRING = "" if FALSE else "--privileged"
  RS->>DK: docker run ... ${PRIVILEGED_STRING} -v ...
  DK-->>C: Start container
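
The run.sh flow in the diagram above can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: the function names are hypothetical, the host-side mount sources and the `-e HF_TOKEN` pass-through are guesses (the summary only names the container paths and the variable), and the privileged logic follows the diagram's literal wording.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the run.sh argument assembly described in the diagram.

privileged_string() {
    # Literal reading of the diagram: only an explicit FALSE drops the flag.
    if [ "$1" = "FALSE" ]; then
        echo ""
    else
        echo "--privileged"
    fi
}

workspace_args() {
    # Assumption: the given host directory maps to /workspace, and /tmp and
    # /mnt map to the same paths inside the container.
    local host_dir="$1"
    echo "-v ${host_dir}:/workspace -v /tmp:/tmp -v /mnt:/mnt -e HF_TOKEN -it"
}

hf_cache_args() {
    # Mount the cache only when a host cache directory was provided.
    local hf_cache="$1"
    if [ -n "$hf_cache" ]; then
        echo "-v ${hf_cache}:/root/.cache/huggingface"
    fi
}

# Print (not run) the resulting command; /data/hf and <image> are placeholders.
echo "docker run $(privileged_string TRUE) $(workspace_args "$PWD") $(hf_cache_args /data/hf) <image>"
```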

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

In a steel pot, two soups now brew—
local-dev hops light, with UID in view.
Rooty dev stirs deep, strong and bright,
run.sh trims paths, keeps mounts tight.
I thump approval, ears held high—
Docker dreams, we multiply! 🐇🚀


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
container/Dockerfile.vllm (1)

331-432: local-dev fails if 'ubuntu' user is absent and lacks arg preflight; harden user setup.

  • This stage assumes an existing “ubuntu” user; most CUDA base images don’t have it. groupmod/usermod will fail.
  • Add an explicit preflight for USER_UID/USER_GID and create the user/group if missing. Also, the COPY of /usr/local/bin from runtime is redundant (this stage is FROM runtime).

Apply:

@@
-FROM runtime AS local-dev
+FROM runtime AS local-dev
@@
-ENV USERNAME=ubuntu
-ARG USER_UID
-ARG USER_GID
+ENV USERNAME=ubuntu
+ARG USER_UID
+ARG USER_GID
 ARG WORKSPACE_DIR=/workspace
@@
-COPY --from=runtime /usr/local/bin /usr/local/bin
+## Inherits /usr/local/bin from runtime; no need to copy again
@@
-RUN apt-get update && apt-get install -y sudo gnupg2 gnupg1 \
-    && echo "$USERNAME ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/$USERNAME \
-    && chmod 0440 /etc/sudoers.d/$USERNAME \
-    && mkdir -p /home/$USERNAME \
-    && groupmod -g $USER_GID $USERNAME \
-    && usermod -u $USER_UID -g $USER_GID $USERNAME \
-    && chown -R $USERNAME:$USERNAME /home/$USERNAME \
-    && rm -rf /var/lib/apt/lists/* \
-    && chsh -s /bin/bash $USERNAME
+RUN apt-get update && apt-get install -y sudo gnupg2 gnupg1 \
+    && test -n "$USER_UID" -a -n "$USER_GID" || (echo "ERROR: USER_UID and USER_GID must be set for local-dev" >&2; exit 1) \
+    && if ! id -u "$USERNAME" >/dev/null 2>&1; then \
+         groupadd -g "$USER_GID" "$USERNAME" && \
+         useradd -m -u "$USER_UID" -g "$USER_GID" -s /bin/bash "$USERNAME"; \
+       else \
+         groupmod -o -g "$USER_GID" "$USERNAME" && \
+         usermod  -o -u "$USER_UID" -g "$USER_GID" "$USERNAME"; \
+       fi \
+    && echo "$USERNAME ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/$USERNAME \
+    && chmod 0440 /etc/sudoers.d/$USERNAME \
+    && chown -R $USERNAME:$USERNAME /home/$USERNAME \
+    && rm -rf /var/lib/apt/lists/* \
+    && chsh -s /bin/bash $USERNAME

This keeps the “no default USER_UID/GID” policy while preventing fragile builds. Also aligns with the team learning on explicit UID/GID mapping.

🧹 Nitpick comments (5)
container/Dockerfile.vllm (2)

283-321: Sync the feature matrix with actual behavior (WORKDIR, venv).

  • Matrix lists Working Directory as “/home/ubuntu/dynamo”, but local-dev sets WORKDIR to $HOME (not “.../dynamo”), and DYNAMO_HOME defaults to /workspace. Clarify either matrix or env to avoid confusion.
  • “Python Environment: User-owned venv” is accurate (venv is chowned) but still located at /opt/dynamo/venv. Consider wording “user-writable /opt/dynamo/venv”.

434-508: Dev (root-run) target looks good; mirror env with local-dev where possible.

  • Env and toolchain setup are consistent with runtime and local-dev. No blockers.

If desired, set CARGO_TARGET_DIR to ${WORKSPACE_DIR}/target (instead of hardcoding /workspace/target) for symmetry with local-dev.

container/build.sh (1)

469-471: Correct scoping of UID/GID to local-dev; nice. Add help text and guardrails.

  • Good: Passing host UID/GID only when TARGET=local-dev matches the intended model.
  • Gaps:
    • show_help omits --target; either remove parsing or document it to avoid confusion.
    • If a non-vLLM Dockerfile lacks a local-dev stage, building with --target local-dev will fail late. Consider an early check or clearer error.

Would you like a follow-up patch to add --target to show_help and emit a friendlier error when the chosen Dockerfile doesn’t define that target?
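
One possible shape for the early check suggested above is sketched here. Both function names are hypothetical and nothing like them is confirmed to exist in build.sh; the stage lookup is a plain grep for a `FROM ... AS <target>` line and assumes the target name contains no regular-expression metacharacters.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the Dockerfile declares a stage named "$2".
dockerfile_has_target() {
    local dockerfile="$1" target="$2"
    # Matches lines such as "FROM runtime AS local-dev" (case-insensitive).
    grep -Eiq "^[[:space:]]*FROM[[:space:]]+.*[[:space:]]+AS[[:space:]]+${target}[[:space:]]*$" "$dockerfile"
}

# Hypothetical wrapper: report a clear error before docker build is invoked.
require_target() {
    local dockerfile="$1" target="$2"
    if ! dockerfile_has_target "$dockerfile" "$target"; then
        echo "ERROR: ${dockerfile} does not define a '${target}' stage" >&2
        return 1
    fi
}
```

A build script could call `require_target container/Dockerfile.vllm "$TARGET" || exit 1` right after argument parsing, so a missing stage is reported before any image layers are built.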

container/run.sh (2)

231-247: Workspace mounting defaults make sense; ensure target usage is consistently removed or documented.

  • Mounts and defaults (HF_CACHE, PRIVILEGED=TRUE, -it) are sensible for local dev via dev target.
  • The script still parses/uses --target to shape IMAGE naming elsewhere. If runtime is now target-agnostic, remove that parsing—or re-document it.

253-257: HF cache mount path correct for root-run dev.

Mounts to /root/.cache/huggingface. If ENTRYPOINT changes to a non-root user later, consider honoring HF_HOME or mapping to $HOME/.cache/huggingface.
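
The fallback described above could look something like this sketch. The function name and its arguments are hypothetical; the only facts relied on are the /root/.cache/huggingface path from this review and Hugging Face's documented default of `~/.cache/huggingface` for HF_HOME.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick the in-container Hugging Face cache directory.
#   $1: home directory of the container user (defaults to /root)
#   $2: HF_HOME value inside the container, if one is set
hf_cache_target() {
    local container_home="${1:-/root}" hf_home="${2:-}"
    if [ -n "$hf_home" ]; then
        # An explicit HF_HOME takes priority.
        echo "$hf_home"
    else
        # Otherwise fall back to the default location under the user's home.
        echo "${container_home}/.cache/huggingface"
    fi
}

# Root-run dev container, as in this PR:
echo "-v \${HF_CACHE}:$(hf_cache_target)"
# Non-root user such as the local-dev "ubuntu" user:
echo "-v \${HF_CACHE}:$(hf_cache_target /home/ubuntu)"
```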

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4df2e2d and 509b77e.

📒 Files selected for processing (3)
  • container/Dockerfile.vllm (2 hunks)
  • container/build.sh (1 hunks)
  • container/run.sh (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2822
File: container/Dockerfile.vllm:343-352
Timestamp: 2025-09-03T01:10:12.599Z
Learning: In the dynamo project's local-dev Docker targets, USER_UID and USER_GID build args are intentionally left without default values to force explicit UID/GID mapping during build time, preventing file permission issues in local development environments where container users need to match host user permissions for mounted volumes.
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
📚 Learning: 2025-09-03T01:10:12.599Z

Applied to files:

  • container/build.sh
  • container/Dockerfile.vllm
📚 Learning: 2025-08-30T20:43:49.632Z

Applied to files:

  • container/Dockerfile.vllm
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • container/Dockerfile.vllm
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test - vllm
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
container/run.sh (1)

266-270: Privileged flag handling is clear and explicit.

LGTM. The normalized PRIVILEGED_STRING removes ambiguity.

- Remove unused USER_UID and USER_GID variables from build.sh
- Remove extra blank line in run.sh

Signed-off-by: Keiven Chang <[email protected]>
@nnshah1 nnshah1 left a comment

LGTM - we can explore a better / different solution in the future

@keivenchang keivenchang merged commit b6b3a76 into main Sep 5, 2025
12 of 13 checks passed
@keivenchang keivenchang deleted the keivenchang/BUG-5501463__revert-Dockerfile.vllm_build_run_scripts branch September 5, 2025 02:09
keivenchang added a commit that referenced this pull request Sep 5, 2025
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
…rs (#2892)

Signed-off-by: Keiven Chang <[email protected]>
Co-authored-by: Keiven Chang <[email protected]>
Signed-off-by: nnshah1 <[email protected]>