Add sccache to manylinux PyTorch build image#3369

Merged
subodh-dubey-amd merged 4 commits into main from users/subodh-dubey-amd/sccache-dockerfiles
Feb 12, 2026

@subodh-dubey-amd
Contributor

@subodh-dubey-amd subodh-dubey-amd commented Feb 11, 2026

Motivation

Add sccache to the manylinux x86_64 Docker image used to build portable Linux PyTorch wheels. Later PRs can then use sccache (e.g. with an S3 backend) for faster rebuilds. This PR does not change any workflows or build scripts.

Technical Details

  • New: dockerfiles/install_sccache.sh – installs sccache from mozilla/sccache GitHub releases (v0.14.0; x86_64 and aarch64).
  • Updated: dockerfiles/build_manylinux_x86_64.Dockerfile – adds an sccache install step after the existing ccache block (WORKDIR, COPY, RUN).
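
The lookup that the install script performs can be sketched as follows. This is written in Python for illustration only; the actual `install_sccache.sh` is a shell script. The function name `sccache_release_url` is hypothetical, and the download URL layout is an assumption about how mozilla/sccache release assets are named, so check it against the release page before relying on it.

```python
# Hypothetical helper for illustration; not part of this PR.
SCCACHE_VERSION = "0.14.0"

# Target triples as they appear in the script's case statement.
ARCH_TO_TARGET = {
    "x86_64": "x86_64-unknown-linux-musl",
    "aarch64": "aarch64-unknown-linux-musl",
}


def sccache_release_url(arch: str, version: str = SCCACHE_VERSION) -> str:
    """Return the assumed release tarball URL for a machine architecture."""
    try:
        target = ARCH_TO_TARGET[arch]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {arch}") from None
    name = f"sccache-v{version}-{target}"
    # Assumed asset layout: <repo>/releases/download/v<version>/<name>.tar.gz
    return (
        "https://github.com/mozilla/sccache/releases/download/"
        f"v{version}/{name}.tar.gz"
    )
```

Pinning the version in one constant keeps the image reproducible: a rebuild fetches the same release unless the constant is changed on purpose.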

Test Plan

  • Build the image: docker buildx build --file dockerfiles/build_manylinux_x86_64.Dockerfile dockerfiles/.
  • In the built image, run sccache --version and confirm 0.14.0.
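
The version check in the second step could be automated roughly as below. This is a sketch, not part of the PR: both function names are hypothetical, and it assumes `sccache --version` prints a line containing an `X.Y.Z` version string.

```python
import re
import subprocess


def parse_sccache_version(output: str) -> str:
    """Extract the first X.Y.Z version found in `sccache --version` output."""
    match = re.search(r"(\d+\.\d+\.\d+)", output)
    if match is None:
        raise ValueError(f"No version found in: {output!r}")
    return match.group(1)


def installed_sccache_version() -> str:
    """Run `sccache --version` (e.g. inside the built image) and parse it."""
    result = subprocess.run(
        ["sccache", "--version"], check=True, capture_output=True, text=True
    )
    return parse_sccache_version(result.stdout)
```

Inside the image, comparing `installed_sccache_version()` with `"0.14.0"` would confirm the expected release.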

Test Result

Submission Checklist

@subodh-dubey-amd subodh-dubey-amd marked this pull request as ready for review February 11, 2026 14:18
Comment thread dockerfiles/install_sccache.sh Outdated
Comment on lines +13 to +25
# Map architecture to sccache release naming convention
case "${ARCH}" in
  x86_64)
    SCCACHE_ARCH="x86_64-unknown-linux-musl"
    ;;
  aarch64)
    SCCACHE_ARCH="aarch64-unknown-linux-musl"
    ;;
  *)
    echo "Unsupported architecture: ${ARCH}"
    exit 1
    ;;
esac
Member

We don't support aarch64, so you can limit this to x86_64.

Contributor Author

Done.

Member

Why not just yum install sccache?

Contributor Author

I checked: sccache isn’t in EPEL 8/9 (manylinux_2_28 is RHEL 8–based). Only ccache is there. Fedora has rust-sccache but that’s not in EPEL, so we’re installing the official GitHub release to get a current sccache (e.g. 0.14.0). See https://repology.org/project/sccache/versions — there’s no EPEL 8/9 entry for sccache.

@subodh-dubey-amd subodh-dubey-amd merged commit f7303e4 into main Feb 12, 2026
8 checks passed
@subodh-dubey-amd subodh-dubey-amd deleted the users/subodh-dubey-amd/sccache-dockerfiles branch February 12, 2026 04:27
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Feb 12, 2026
lumachad added a commit that referenced this pull request Feb 12, 2026
…90baef3e8dde4805858

Incorporate changes made in #3369.
@lumachad
Contributor

Image bump in #3306 should cover this PR as well.

marbre pushed a commit that referenced this pull request Feb 12, 2026
 (#3306)

Move to
sha256:d6ae5712a9c7e8b88281d021e907b312cd8a26295b95690baef3e8dde4805858
from
sha256:db2b63f938941dde2abc80b734e64b45b9995a282896d513a0f3525d4591d6cb

See producing PRs #3240 and #3369.
subodh-dubey-amd added a commit that referenced this pull request Feb 20, 2026
## Motivation

Add sccache support to PyTorch wheel builds for S3-backed distributed
caching. Script placed in `build_tools/` per [reviewer
feedback](#3171 (comment)),
modeled after `build_tools/setup_ccache.py`.

Part of sccache PR sequence:
[#3369](#3369) →
[#3389](#3389) → **this** → workflow
wiring.

## Technical Details

- **New: `build_tools/setup_sccache_rocm.py`**: generic sccache ROCm helper (CLI + importable):
  - `find_sccache()`: locate binary; hard fail if missing
  - `setup_rocm_sccache()`: wrap clang/clang++ with sccache stubs (Linux only)
  - `restore_rocm_compilers()`: undo wrapping

- **Modified: `external-builds/pytorch/build_prod_wheels.py`**:
  - `--use-ccache` / `--use-sccache` mutually exclusive args
  - Both hard-fail with `RuntimeError` if the requested cache tool is not found ([per review](#3171 (comment))), with no silent fallback
  - Added explicit ccache availability check (previously would fail with an unclear subprocess error)
  - sccache: wrap compilers → set CMAKE launchers → `try`/`finally` around build for guaranteed compiler restore + stats
  - Moved ccache stats reporting into `finally` block for consistent reporting on both success and failure
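
The control flow described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `build_prod_wheels.py`: `parse_cache_args`, `require_tool`, and `build_with_cache` are hypothetical names, and the `build`, `wrap`, `restore`, and `report_stats` callbacks stand in for the real build and helper functions.

```python
import argparse
import shutil


def parse_cache_args(argv=None):
    """Parse the two cache flags; argparse rejects passing both."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--use-ccache", action="store_true")
    group.add_argument("--use-sccache", action="store_true")
    return parser.parse_args(argv)


def require_tool(name, which=shutil.which):
    """Return the tool's path, or hard-fail instead of silently falling back."""
    path = which(name)
    if path is None:
        raise RuntimeError(f"{name} was requested but was not found on PATH")
    return path


def build_with_cache(args, build, wrap, restore, report_stats,
                     which=shutil.which):
    """Run build() with the requested cache tool (callbacks are stand-ins)."""
    if args.use_sccache:
        require_tool("sccache", which)
    elif args.use_ccache:
        require_tool("ccache", which)
    try:
        if args.use_sccache:
            wrap()  # e.g. replace clang/clang++ with sccache stubs
        build()
    finally:
        # Runs on success and on failure, so compilers are always restored.
        if args.use_sccache:
            restore()
        if args.use_sccache or args.use_ccache:
            report_stats()
```

Placing the wrap step inside the `try` means a partially applied wrap is still undone if it raises, at the cost of calling the restore callback on compilers that may never have been wrapped.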

## Test Result

No workflow changes — sccache wired but not yet invoked by CI (next PR
adds `cache_type` input + AWS config).

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
subodh-dubey-amd added a commit that referenced this pull request Mar 5, 2026
## Summary

Adds `sccache` with S3 remote storage to all four PyTorch wheel build
workflows, significantly reducing build times through distributed
compiler caching.

**PR sequence:** #3369 → #3306 → #3389 → #3482 → **this** → #3189
([based on Reviewer's
Feedback](#3171 (comment)))

## How It Works

| | Linux | Windows |
|---|---|---|
| **Host C/C++** | CMake compiler launchers | CMake compiler launchers |
| **HIP device code** | Wraps ROCm `clang`/`clang++` with sccache | Not supported |
| **Cleanup** | Restores original compilers via try/finally | N/A |

Cache is stored in the `therock-<workflow>-pytorch-sccache` S3 bucket,
keyed by `<os>/<arch>/` prefix.
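
A rough sketch of the two configuration pieces involved is shown below. The workflow wiring itself is not shown in this text, so treat this as an illustration: `sccache_s3_env` and `cmake_launcher_args` are hypothetical helper names and the region is a caller-supplied placeholder. The environment variable names are sccache's S3 settings, and the `-D` options are CMake's compiler launcher variables.

```python
def sccache_s3_env(bucket: str, os_name: str, arch: str, region: str) -> dict:
    """Environment for an S3-backed sccache keyed by an <os>/<arch>/ prefix."""
    return {
        "SCCACHE_BUCKET": bucket,
        "SCCACHE_REGION": region,  # placeholder; not stated in the source
        "SCCACHE_S3_KEY_PREFIX": f"{os_name}/{arch}/",
    }


def cmake_launcher_args(launcher: str = "sccache") -> list:
    """CMake options that route host C/C++ compiles through the launcher."""
    return [
        f"-DCMAKE_C_COMPILER_LAUNCHER={launcher}",
        f"-DCMAKE_CXX_COMPILER_LAUNCHER={launcher}",
    ]
```

The launcher variables cover only compilers that CMake invokes directly, which is why HIP device code needs the separate compiler wrapping on Linux.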

## S3 Cache Configuration

Each workflow uses a dedicated S3 bucket and IAM role, keyed by
`<os>/<arch>/` prefix:

| Workflow | S3 Bucket | IAM Role |
|----------|-----------|----------|
| Linux CI | `therock-ci-pytorch-sccache` | `therock-ci` |
| Windows CI | `therock-ci-pytorch-sccache` | `therock-ci` |
| Linux Release | `therock-{release_type}-pytorch-sccache` | `therock-{release_type}` |
| Windows Release | `therock-{release_type}-pytorch-sccache` | `therock-{release_type}` |

Where `release_type` is one of: `dev`, `nightly`, `prerelease`.
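
The naming scheme in the table can be expressed as a small function. This is a sketch: the function name `cache_bucket_and_role` and the `kind` values `"ci"` and `"release"` are hypothetical labels, while the bucket and role patterns are taken from the table above.

```python
from typing import Optional, Tuple

RELEASE_TYPES = ("dev", "nightly", "prerelease")


def cache_bucket_and_role(kind: str,
                          release_type: Optional[str] = None) -> Tuple[str, str]:
    """Return (S3 bucket, IAM role) for a CI or release workflow."""
    if kind == "ci":
        role = "therock-ci"
    elif kind == "release":
        if release_type not in RELEASE_TYPES:
            raise ValueError(f"Unknown release_type: {release_type!r}")
        role = f"therock-{release_type}"
    else:
        raise ValueError(f"Unknown workflow kind: {kind!r}")
    # The bucket name is the role name plus a fixed suffix.
    return f"{role}-pytorch-sccache", role
```

Linux and Windows workflows of the same kind resolve to the same bucket; the `<os>/<arch>/` key prefix keeps their cache entries apart.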

## Impact

| Platform | Cold → Warm | Improvement |
|----------|------------|-------------|
| Linux | ~70m → ~37m | **~47%** |
| Windows | ~42m → ~26m | **~38%** |

The Windows improvement is smaller because sccache cannot wrap HIP device compilation on Windows; it only caches host C/C++ compiles via CMake compiler launchers.
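
The improvement column is the relative reduction in build time from a cold cache to a warm one. A minimal sketch of that calculation, with a hypothetical function name, is given below. Feeding it the rounded minute values from the table gives approximate figures, and percentages computed from exact job durations may differ by a point or two.

```python
def improvement_percent(cold_minutes: float, warm_minutes: float) -> float:
    """Relative build-time reduction from a cold cache to a warm cache."""
    if cold_minutes <= 0:
        raise ValueError("cold_minutes must be positive")
    return (cold_minutes - warm_minutes) / cold_minutes * 100.0


# Rounded durations from the table above.
linux = improvement_percent(70, 37)
windows = improvement_percent(42, 26)
```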

## Tests

### Linux:
- [Linux (Cache Population)](https://github.com/ROCm/TheRock/actions/runs/22226347964/job/64293924748) - 70 mins
- [Linux (Cache Hit)](https://github.com/ROCm/TheRock/actions/runs/22231743387/job/64312966557) - 37 mins

### Windows:
- [Windows (Cache Population)](https://github.com/ROCm/TheRock/actions/runs/22219252671/job/64280583887) - 42 mins
- [Windows (Cache Hit)](https://github.com/ROCm/TheRock/actions/runs/22223608689/job/64284721704) - 26 mins

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.



> Forks: S3 caching is only active for ROCm-owned runs. Fork users can
set cache_type to ccache or none, or leave the default — sccache will
work locally without S3 access.
