feat: add support from building images using vllm from private repos by terrykong · Pull Request #1605 · NVIDIA-NeMo/RL

terrykong · 2025-12-06T08:10:12Z

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Issues

List issues that this PR closes (syntax):

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

Summary by CodeRabbit

New Features
- Added SSH agent forwarding capability for Docker builds to access private repositories during build.
- Enhanced build scripts with SSH key handling and host key verification management.
Documentation
- Added comprehensive guide for custom vLLM setup including SSH prerequisites and step-by-step instructions.
- New guidance on runtime configuration and environment management for custom vLLM deployments.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-06T08:13:37Z

📝 Walkthrough

Walkthrough

This PR enables SSH agent forwarding in Docker builds to support cloning private vLLM repositories. Changes include adding the --mount=type=ssh flag to the Dockerfile's RUN instruction, updating the build script to disable SSH host key verification during clone operations, and adding comprehensive SSH-based workflow documentation.

Changes

Cohort / File(s)	Summary
Docker Build SSH Configuration `docker/Dockerfile`, `tools/build-custom-vllm.sh`	Enables SSH forwarding during Docker build by adding `--mount=type=ssh` flag to RUN instruction; updates build script to set `GIT_SSH_COMMAND` to disable host key verification for SSH clones
SSH Workflow Documentation `docs/guides/use-custom-vllm.md`	Adds comprehensive guide covering SSH prerequisites, key loading, agent forwarding setup, step-by-step build instructions, and runtime recommendations for custom vLLM containers using frozen environments

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Minimal code changes (two single-line edits in Dockerfile and shell script)
Documentation additions are additive and clearly structured
Changes follow a coherent SSH-enablement pattern across the build pipeline
No complex logic or control flow modifications to review

Possibly related PRs

fix: update the custom vllm instructions #1116 — Modifies the same Dockerfile and build script to support custom vLLM building with SSH forwarding capabilities

Suggested labels

documentation

Suggested reviewers

yfw

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Test Results For Major Changes	⚠️ Warning	PR introduces major Docker and SSH authentication changes but lacks documented test results, validation evidence, or regression checks in the description.	Add concrete testing evidence including successful private repository builds, confirmation that public builds still work, SSH key handling security review, and any edge cases discovered.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title references 'add support from building images' which contains a grammatical error ('from' should be 'for'). Despite this, it clearly relates to the main change of enabling SSH-based vLLM builds from private repositories.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch custom-vllm-from-gitlab

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0947683 and 8a330e3.

📒 Files selected for processing (3)

docker/Dockerfile (1 hunks)
docs/guides/use-custom-vllm.md (1 hunks)
tools/build-custom-vllm.sh (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Use uv run instead of python to execute scripts
Follow the Google Shell Style Guide for shell scripts

Files:

tools/build-custom-vllm.sh

!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

tools/build-custom-vllm.sh
docs/guides/use-custom-vllm.md
docker/Dockerfile

**/*.{py,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)

Files:

tools/build-custom-vllm.sh

docs/**/*.md

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Update docs/index.md when a new markdown doc is added under docs/**/*.md or a markdown file is renamed, ensuring the document appears in the most appropriate section

Files:

docs/guides/use-custom-vllm.md

🧠 Learnings (1)

📚 Learning: 2025-11-24T17:24:41.976Z

Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-11-24T17:24:41.976Z
Learning: Applies to Dockerfile.ngc_pytorch : Exception: Dockerfile.ngc_pytorch is exempt from the uv run rule

Applied to files:

docker/Dockerfile

🪛 markdownlint-cli2 (0.18.1)

docs/guides/use-custom-vllm.md

92-92: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

102-102: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

121-121: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: build-container / main
GitHub Check: sphinx-build / Build docs
GitHub Check: Lint check
GitHub Check: Lint check
GitHub Check: Post submodule check comment / Comment on PR
GitHub Check: Post automodel integration comment / Comment on PR

🔇 Additional comments (5)

tools/build-custom-vllm.sh (2)

44-46: SSH command configuration looks solid.

The use of GIT_SSH_COMMAND to disable host key verification is appropriate for transient Docker build environments with mounted SSH keys. The explanatory comments make the trade-off clear, and the scoping of the environment variable to just the clone operation is correct.

1-160: Excellent implementation of the build-custom-vllm.sh script.

The script properly handles SSH authentication, maintains strict error handling, uses uv run correctly, and safely manipulates the pyproject.toml via tomlkit. The output messages guide the user through next steps, and the environment export at the end (lines 157-160) provides clear integration points for Docker builds.

docker/Dockerfile (1)

88-106: SSH mount addition is correct but requires buildx documentation.

The --mount=type=ssh flag is valid Docker buildx syntax and integrates well with the existing RUN block. However, this feature requires a recent version of Docker buildx that supports SSH mounts. For users with older tooling or standard docker build, the build may fail without clear guidance.

Please verify or document:

Minimum Docker/buildx version required for SSH mount support

Whether there's a graceful fallback or error message for unsupported versions

Whether documentation or build logs clarify that --ssh default must be passed to docker buildx build (as shown in the markdown guide)

You may want to add an inline comment in the Dockerfile documenting the buildx requirement, or ensure the error message from Docker is sufficient.

docs/guides/use-custom-vllm.md (2)

135-156: Clear explanation of container deployment patterns.

The section on running applications with custom vLLM containers (lines 135-156) provides strong justification for why frozen environments (bare python) are preferred over uv run or NRL_FORCE_REBUILD_VENVS=true. The explanation of the chicken-and-egg problem with vLLM compilation is helpful and the cross-reference to the dependency-management documentation is appropriate.

1-156: No action needed—docs/index.md has already been updated with a reference to guides/use-custom-vllm.md.

Likely an incorrect or invalid review comment.

docs/guides/use-custom-vllm.md

Signed-off-by: Terry Kong <terryk@nvidia.com>

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

This reverts commit 9a91940. Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

…VIDIA-NeMo#1605) Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

…VIDIA-NeMo#1605) Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Parth Mannan <pmannan@nvidia.com>

…VIDIA-NeMo#1605) Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>

…1605) Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

terrykong requested a review from yfw December 6, 2025 08:10

terrykong requested review from a team as code owners December 6, 2025 08:10

terrykong requested a review from jgerh December 6, 2025 08:10

github-actions bot added the documentation Improvements or additions to documentation label Dec 6, 2025

terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Dec 6, 2025

terrykong temporarily deployed to nemo-ci December 6, 2025 08:10 — with GitHub Actions Inactive

coderabbitai bot reviewed Dec 6, 2025

View reviewed changes

docs/guides/use-custom-vllm.md Show resolved Hide resolved

terrykong temporarily deployed to nemo-ci December 6, 2025 08:36 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci December 6, 2025 11:17 — with GitHub Actions Inactive

terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 7, 2025

terrykong temporarily deployed to nemo-ci December 7, 2025 19:00 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci December 7, 2025 19:04 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci December 7, 2025 21:39 — with GitHub Actions Inactive

terrykong and others added 4 commits December 16, 2025 21:31

feat: add support from building images using vllm from private repos

60515df

Signed-off-by: Terry Kong <terryk@nvidia.com>

Update docs/guides/use-custom-vllm.md

28ddfe0

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

Revert "Update docs/guides/use-custom-vllm.md"

c34c5e6

This reverts commit 9a91940. Signed-off-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

fix md stuff

7ae4f50

Signed-off-by: Terry Kong <terryk@nvidia.com>

terrykong force-pushed the custom-vllm-from-gitlab branch from 0e219c4 to 7ae4f50 Compare December 16, 2025 21:31

terrykong enabled auto-merge (squash) December 16, 2025 21:31

terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 16, 2025

terrykong had a problem deploying to nemo-ci December 16, 2025 21:31 — with GitHub Actions Error

chtruong814 approved these changes Dec 16, 2025

View reviewed changes

Merge branch 'main' into custom-vllm-from-gitlab

2301245

terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 16, 2025

terrykong temporarily deployed to nemo-ci December 16, 2025 22:27 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci December 16, 2025 22:42 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci December 17, 2025 01:05 — with GitHub Actions Inactive

terrykong merged commit df01ca7 into main Dec 17, 2025
40 of 41 checks passed

terrykong deleted the custom-vllm-from-gitlab branch December 17, 2025 02:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add support from building images using vllm from private repos#1605

feat: add support from building images using vllm from private repos#1605
terrykong merged 5 commits intomainfrom
custom-vllm-from-gitlab

terrykong commented Dec 6, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 6, 2025

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

terrykong commented Dec 6, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do ?

Issues

Usage

Before your PR is "Ready for review"

Additional Information

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 6, 2025

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

terrykong commented Dec 6, 2025 •

edited by coderabbitai bot

Loading