[GAUDISW-246357] UBI images improvements #971

Merged
afierka-intel merged 10 commits into vllm-project:main from ghandoura:ubi_improvements
Feb 19, 2026

@ghandoura (Contributor) commented Feb 13, 2026

  • Use pip check to verify Python dependencies during build

  • Use --no-cache-dir to reduce the Docker image size

  • Allow using latest as the Synapse revision

  • TODO remove CRB

  • Test the changes

More changes to check and implement:

  • 1) Around line 46 there is RUN dnf install -y python3-dnf-plugin-versionlock. It needs to start with dnf -y update to pull in all updates before the CentOS/EPEL repos get installed.

  • 2) At line 62, it tries to install libjpeg-devel. The replacement on RHEL 9 is libjpeg-turbo-devel. It’s a drop-in replacement that runs even faster, and it is in the main UBI repo rather than EPEL.

  • 3) Around line 130 there is a dnf -y update. Delete this block, as it will pull in all kinds of CentOS contamination.

  • 4) Not strictly required, but at line 139, make it FROM gaudi-pytorch as vllm-openai to match other vLLM images.

  • 5) At line 153, there is another dnf -y update; remove it for the same reason as item 3.

  • retest
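
Items 1 and 2 of the checklist could be sketched as the shell steps a RUN instruction might execute. This is only an illustration under stated assumptions, not the actual Dockerfile content: the helper name install_base_packages and the DNF override variable are hypothetical, while the package names are taken from the checklist above.

```shell
# DNF can be overridden (for example DNF=echo) to preview the commands
# instead of running them. This variable is an assumption of the sketch.
DNF="${DNF:-dnf}"

# Hypothetical helper mirroring checklist items 1 and 2:
#   1) update from the already-enabled UBI repos before anything else,
#      so updates are pulled in before CentOS/EPEL repos are installed;
#   2) install libjpeg-turbo-devel instead of libjpeg-devel.
install_base_packages() {
    "$DNF" -y update &&
    "$DNF" install -y python3-dnf-plugin-versionlock &&
    "$DNF" install -y libjpeg-turbo-devel
}
```

Calling the helper with DNF=echo prints the three commands in order rather than executing them. Items 3 and 5 then amount to not repeating dnf -y update once the CentOS/EPEL repos have been enabled.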

@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements improvements to the RHEL UBI (Red Hat Universal Base Image) Dockerfile for vLLM on Habana Gaudi hardware. The changes enhance the build process by adding dependency verification, reducing image size, and introducing flexible version management for Synapse packages.

Changes:

  • Added pip check command after vLLM installation to verify Python dependency compatibility
  • Added --no-cache-dir flag to all pip install commands to reduce Docker image size
  • Implemented support for using "latest" as a value for SYNAPSE_REVISION to automatically detect and use the newest available revision
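
One way the "latest" handling could work is sketched below. How the Dockerfile actually discovers the available revisions is not shown in this conversation, so the sketch assumes candidate revision numbers arrive one per line on standard input. The function name resolve_synapse_revision is hypothetical.

```shell
# Hypothetical helper: resolve_synapse_revision REQUESTED
# When REQUESTED is "latest", candidate revisions are read from stdin
# (one per line) and the numerically highest one is printed.
# Any other value is treated as an exact revision and printed unchanged.
resolve_synapse_revision() {
    requested="$1"
    if [ "$requested" = "latest" ]; then
        # Keep purely numeric lines, sort numerically, take the highest.
        resolved="$(grep -E '^[0-9]+$' | sort -n | tail -n 1)"
        if [ -z "$resolved" ]; then
            echo "no numeric revisions found on stdin" >&2
            return 1
        fi
        printf '%s\n' "$resolved"
    else
        printf '%s\n' "$requested"
    fi
}
```

The numeric sort matters: with a plain lexical sort, a revision such as 1000 would be ordered before 695.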

Comment thread .cd/Dockerfile.rhel.ubi.vllm Outdated
Comment thread .cd/Dockerfile.rhel.ubi.vllm Outdated
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread .cd/Dockerfile.rhel.ubi.vllm
Use --no-cache-dir for pip installs to reduce image size.

Run pip check during build to validate Python dependencies.

Allow SYNAPSE_REVISION as exact value (e.g. 695) or latest with revision detection.

Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
17b17c068453e6dc6af79240bb94857ae175cc51

Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
@ghandoura (Contributor, Author) commented

CRB is back because of the new libfdt-devel dependency

Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
@PatrykWo PatrykWo self-assigned this Feb 19, 2026
@PatrykWo (Collaborator) left a comment

Build tested.

@afierka-intel afierka-intel merged commit a3855ac into vllm-project:main Feb 19, 2026
16 checks passed
SKRohit pushed a commit to SKRohit/vllm-gaudi that referenced this pull request Feb 20, 2026
- Use pip check to verify Python dependencies during build
- Use --no-cache-dir to reduce the Docker image size
- Allow using `latest` as the Synapse revision

- [x] TODO remove CRB
- [x] Test the changes

More changes to check and implement:

- [x] 1) Around line 46 there is RUN dnf install -y
python3-dnf-plugin-versionlock. It needs to start with dnf -y update
to pull in all updates before the CentOS/EPEL repos get installed.

- [x] 2) At line 62, it tries to install libjpeg-devel. The replacement
on RHEL 9 is libjpeg-turbo-devel. It’s a drop-in replacement that runs
even faster, and it is in the main UBI repo rather than EPEL.

- [x] 3) Around line 130 there is a dnf -y update. Delete this block, as
it will pull in all kinds of CentOS contamination.

- [x] 4) Not strictly required, but at line 139, make it FROM
gaudi-pytorch as vllm-openai to match other vLLM images.

- [x] 5) At line 153, there is another dnf -y update; remove it for the
same reason as item 3.

- [x] retest

---------

Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Co-authored-by: Michał Kuligowski <michal.kuligowski@intel.com>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
Co-authored-by: Patryk Wolsza <patryk.wolsza@intel.com>
Signed-off-by: Rohit kumar Singh <rksingh@habana.ai>
gyou2021 pushed a commit to gyou2021/vllm-gaudi that referenced this pull request Feb 21, 2026
PatrykWo added a commit that referenced this pull request Feb 26, 2026
(cherry picked from commit a3855ac)
PatrykWo added a commit that referenced this pull request Feb 26, 2026
(cherry picked from commit a3855ac)
Signed-off-by: PatrykWo <patryk.wolsza@intel.com>
wpyszka pushed a commit that referenced this pull request Feb 26, 2026
…0.15.1 (#1049)

## Summary

This PR cherry-picks all RHEL/UBI Dockerfile changes merged to `main`
after `releases/v0.15.1` into the v0.15.1 release branch.

## Cherry-picked PRs

| PR | Commit | Description |
|----|--------|-------------|
| [#923](#923) | `6d15fdc` | [GAUDISW-244821] Modify UBI docker to support both internal and external builds |
| [#811](#811) | `a0a0d36` | Fix reported version of vllm |
| [#713](#713) | `40a425f` | Create UBI based vLLM docker build instructions |
| [#974](#974) | `6db03ad` | [GADC-941] Add libfdt-devel (new habanalabs-thunk dependency) to UBI Dockerfile |
| [#971](#971) | `a3855ac` | [GAUDISW-246357] UBI images improvements |
| [#1008](#1008) | `b3b2fb3` | Fix Dockerfile for RHEL 9.6 build by updating package installation order |

## Key changes in `.cd/Dockerfile.rhel.ubi.vllm`

- Added new build args: `OS_VERSION`, `OS_STRING`,
`PT_MODULES_REPO_NAME`, `PT_PACKAGE_NAME_NON_DEFAULT_PYTHON_SUBSTRING`,
`PYPI_INDEX_URL`, `HABANA_RPM_REPO_PATH`
- Support `SYNAPSE_REVISION=latest` (auto-detects newest available
revision)
- Detected Synapse revision stored in `/etc/habanalabs/synapse_revision`
for use across stages
- `dnf install` uses `--allowerasing` throughout;
`openssl-fips-provider-so` removal has `|| true` to support RHEL 9.4
- Added packages: `libomp`, `libjpeg-turbo-devel` (replaces
`libjpeg-devel`), `libfdt-devel`
- `pip` calls use `--no-cache-dir`
- vLLM install: replaced `use_existing_torch.py` with
`pip install -r <(sed '/^torch/d' requirements/build.txt)`
- `pip check` added after installation
- Final stage renamed to `AS vllm-openai`
- `OS_STRING` is now parametric (supports both RHEL 9.4 and 9.6)
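
The three pip-related items in the list above (`--no-cache-dir`, the `sed` filter, and `pip check`) can be combined in a rough sketch like the one below. It is not the Dockerfile's actual code: the helper names `strip_torch_requirements` and `install_build_requirements` and the `PIP` override variable are hypothetical, and the sketch writes the filtered requirements to a temporary file rather than using the `<(...)` process substitution from the list, which is a bash feature.

```shell
# PIP can be overridden (for example PIP=echo) to preview the commands
# instead of running them. This variable is an assumption of the sketch.
PIP="${PIP:-pip}"

# Hypothetical helper: print the requirements file given as $1 with every
# line that starts with "torch" removed (same sed expression as above).
strip_torch_requirements() {
    sed '/^torch/d' "$1"
}

# Hypothetical helper: install the filtered requirements without keeping
# a pip download cache, then run pip check to verify that the installed
# packages have compatible dependencies.
install_build_requirements() {
    filtered="$(mktemp)" || return 1
    strip_torch_requirements "$1" > "$filtered" &&
    "$PIP" install --no-cache-dir -r "$filtered" &&
    "$PIP" check
    status=$?
    rm -f "$filtered"
    return "$status"
}
```

Note that the pattern `^torch` matches any line beginning with those characters, so an entry such as `torchvision` would be dropped along with `torch` itself. In a Dockerfile, a non-zero exit from `pip check` would fail the `RUN` step and therefore the build.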

---------

Signed-off-by: Michal Muszynski <mmuszynski@habana.ai>
Signed-off-by: PatrykWo <patryk.wolsza@intel.com>
Signed-off-by: Adam Ghandoura <adam.ghandoura@intel.com>
Signed-off-by: mhelf-intel <monika.helfer@intel.com>
Signed-off-by: Michal Muszynski <michal.muszynski@intel.com>
Co-authored-by: Michal Muszynski <141021743+mmuszynskihabana@users.noreply.github.com>
Co-authored-by: Adam Ghandoura <adam.ghandoura@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: aghandoura <adam.ghandoura@gmail.com>
Co-authored-by: mhelf-intel <monika.helfer@intel.com>
Co-authored-by: Michał Kuligowski <michal.kuligowski@intel.com>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
PatrykWo added a commit that referenced this pull request Feb 26, 2026
…0.15.1 (#1049)

adobrzyn added a commit that referenced this pull request Mar 31, 2026

7 participants