Merged
Changes from 3 commits
3 changes: 2 additions & 1 deletion docker/rocm.Dockerfile
@@ -258,7 +258,8 @@ RUN pip install IPython \
 && pip install orjson \
 && pip install python-multipart \
 && pip install torchao==0.9.0 \
-&& pip install pybind11
+&& pip install pybind11 \
+&& pip install cache-dit==1.3.0

This install is overwritten a few lines below. The next RUN block does:

rm -rf python/pyproject.toml && mv python/pyproject_other.toml python/pyproject.toml
...
python -m pip --no-cache-dir install -e "python[srt_hip,diffusion_hip]"

and the diffusion_hip extra in python/pyproject_other.toml still pins cache-dit==1.1.8, so pip downgrades cache-dit from 1.3.0 back to 1.1.8 during the editable sglang install. The built image therefore still ships cache-dit==1.1.8, which is why the upgrade in amd_ci_install_dependency.sh is required at job start.

For this line to take effect (or to make it unnecessary), bump cache-dit==1.1.8 to 1.3.0 in python/pyproject_other.toml (the diffusion_hip extra, and likely diffusion_musa as well) and in 3rdparty/amd/wheel/sglang/pyproject.toml. Otherwise, drop this line from the PR.
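
As a rough sketch of how to spot stale pins like this, the helper below greps a set of files for cache-dit pins and reports whether they all agree. The function name check_cache_dit_pins is hypothetical and not part of the repo; the paths in the usage comment are the ones named in this comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): print every cache-dit pin found
# in the given files and return non-zero if the pinned versions disagree.
check_cache_dit_pins() {
    local pins distinct
    # -H prefixes each match with its file name, -o prints only the matched pin.
    pins=$(grep -HoE 'cache-dit==[0-9]+(\.[0-9]+)*' "$@" 2>/dev/null)
    if [ -z "$pins" ]; then
        echo "no cache-dit pin found" >&2
        return 2
    fi
    echo "$pins"
    # Strip everything up to the version, then count the distinct versions.
    distinct=$(echo "$pins" | sed 's/.*cache-dit==//' | sort -u | wc -l | tr -d ' ')
    [ "$distinct" -eq 1 ]
}

# Usage, run from the repo root:
#   check_cache_dit_pins python/pyproject.toml python/pyproject_other.toml \
#       3rdparty/amd/wheel/sglang/pyproject.toml || echo "cache-dit pins disagree"
```

Printing the matches before the check makes it easy to see which file carries the older pin.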


RUN pip uninstall -y sgl_kernel sglang
RUN git clone ${SGL_REPO} \
2 changes: 1 addition & 1 deletion scripts/ci/amd/amd_ci_install_dependency.sh
@@ -169,7 +169,7 @@ EOF
 docker exec ci_sglang pip install --cache-dir=/sgl-data/pip-cache pytest

 # Install cache-dit for qwen_image_t2i_cache_dit_enabled test (added in PR 16204)
-docker exec ci_sglang pip install --cache-dir=/sgl-data/pip-cache cache-dit || echo "cache-dit installation failed"
+docker exec ci_sglang pip install --cache-dir=/sgl-data/pip-cache --upgrade 'cache-dit==1.3.0' || echo "cache-dit installation failed"

This change LGTM: pinning to 1.3.0 matches python/pyproject.toml, and --upgrade is needed because the AMD CI base image ships cache-dit==1.1.8 (python/pyproject_other.toml and 3rdparty/amd/wheel/sglang/pyproject.toml still pin 1.1.8 for diffusion_hip). Once those pins are bumped, the --upgrade step becomes a no-op, but it is harmless to keep as a safety net.
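
Because the install line falls back to an echo on failure, a failed upgrade would not stop the job. One optional way to make the resulting version visible is sketched below. It assumes the usual `Version:` field in `pip show` output; the helper name version_matches is hypothetical and not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the CI script): reads `pip show <pkg>`
# output on stdin and succeeds only if its Version field equals $1.
version_matches() {
    local expected=$1 actual
    # `pip show` prints "Key: value" lines; take the value of the Version line.
    actual=$(awk -F': ' '$1 == "Version" {print $2; exit}')
    [ "$actual" = "$expected" ]
}

# Usage inside the CI script, assuming the ci_sglang container used above:
#   docker exec ci_sglang pip show cache-dit | version_matches 1.3.0 \
#       || echo "cache-dit is not at 1.3.0"
```

Reading from stdin keeps the check independent of how pip is invoked, so the same function works with or without docker exec.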


# Install accelerate for distributed training and inference support
docker exec ci_sglang pip install --cache-dir=/sgl-data/pip-cache accelerate || echo "accelerate installation failed"