deprecate torch 2.8.0 support by winglian · Pull Request #3550 · axolotl-ai-cloud/axolotl

winglian · 2026-03-25T17:45:53Z

Summary by CodeRabbit

Chores
- Removed PyTorch 2.8.0 from build and test pipelines; updated to support PyTorch 2.9.1+ only.
- Updated dependency installation methods for improved compatibility across architectures.
- Streamlined CI/CD test matrices to focus on currently supported configurations.

Documentation
- Updated minimum PyTorch requirement from 2.8.0 to 2.9.1 in project requirements.

coderabbitai · 2026-03-25T17:46:14Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b1e1aa02-c704-4ee7-a940-7ea81f37c165

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR removes PyTorch 2.8.0 + CUDA 12.8.1 matrix entries from multiple CI workflow files, updates the minimum supported PyTorch version to 2.9.1 in the README, and modifies Docker base image dependency installation for causal-conv1d and mamba-ssm packages along with flash-attn wheel handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI Workflow Matrix Cleanup: `.github/workflows/base.yml`, `main.yml`, `multi-gpu-e2e.yml`, `nightlies.yml`, `tests-nightly.yml`, `tests.yml` | Removed PyTorch 2.8.0 + CUDA 12.8.1 matrix entries from build and test jobs, reducing the number of Docker image variants and test configurations. |
| Docker Base Image Updates: `docker/Dockerfile-uv-base` | Modified conditional amd64-only dependency installation to skip CUDA builds for causal-conv1d and mamba-ssm, and updated flash-attn wheel filename construction with versioned ARCH_TAG values (`2_24_x86_64`, `2_34_aarch64`) and added LINUX_TAG component. |
| Documentation: `README.md` | Updated minimum required PyTorch version from ≥2.8.0 to ≥2.9.1 in quick start requirements. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

#3224: Modifies flash-attn installation and wheel handling logic in docker/Dockerfile-uv-base
#3450: Updates CI test matrices to change supported PyTorch and Python versions
#3034: Directly related to PyTorch 2.8.0 + CUDA 12.8.1 matrix configuration changes in the same workflow files

Suggested labels

ready to merge

Suggested reviewers

SalmanMohammadi
djsaunde
NanoCode012

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'deprecate torch 2.8.0 support' accurately summarizes the main change: removing PyTorch 2.8.0 configurations from CI/CD workflows and updating the minimum supported version to 2.9.1 in documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

github-actions · 2026-03-25T17:53:13Z

📖 Documentation Preview: https://69c443c7836c0e796916f9c5--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit b95161e

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker/Dockerfile-uv-base`:
- Line 46: The shell assignment for LINUX_TAG is invalid because it uses spaces
around '='; update the assignment for the LINUX_TAG variable (LINUX_TAG) to a
POSIX-compliant form with no spaces (e.g., LINUX_TAG="manylinux_") so the RUN
step treats it as a variable assignment like the other variables in the file.
- Around line 38-40: The RUN step in docker/Dockerfile-uv-base installs
mamba_ssm and causal_conv1d without pins; update that RUN line to install
reproducible refs (either explicit version pins like mamba_ssm==X.Y.Z and
causal_conv1d==A.B.C or git+https://...@<commit-or-tag> refs) while retaining
the MAMBA_SKIP_CUDA_BUILD and CAUSAL_CONV1D_SKIP_CUDA_BUILD env vars and the uv
pip install invocation; ensure the chosen pins/refs match the versions used in
docker/Dockerfile-base or your pyproject/requirements and include both package
identifiers (mamba_ssm, causal_conv1d) so builds are deterministic.
- Around line 49-54: The WHL_FILE construction in docker/Dockerfile-uv-base uses
manylinux_2_24/2_34 via ARCH_TAG/LINUX_TAG, causing a mismatch with
docker/Dockerfile-base, which uses linux_x86_64/linux_aarch64; update
docker/Dockerfile-uv-base so its wheel filename construction (WHL_FILE) uses the
same platform tag format as docker/Dockerfile-base (i.e. use linux_x86_64 or
linux_aarch64) by changing the ARCH_TAG/LINUX_TAG assignment logic tied to
TARGETARCH or by altering WHL_FILE to reference the same LINUX_TAG values used
in docker/Dockerfile-base; ensure WHL_VERSION remains v0.7.16 and preserve the
rest of the wheel name format.
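
The first inline comment above rests on a POSIX shell rule: a variable assignment must have no whitespace around `=`. The following is a minimal sketch of that behaviour, assuming a plain `sh`-compatible shell. Only the value `manylinux_` comes from the review comment; the `bad_status` variable is illustrative and not part of the Dockerfile.

```shell
#!/bin/sh
# With spaces, the shell parses `LINUX_TAG` as a command name and passes
# `=` and "manylinux_" as its arguments. No such command exists, so the
# line fails (exit status 127, "command not found") and nothing is assigned.
unset LINUX_TAG
bad_status=0
( LINUX_TAG = "manylinux_" ) 2>/dev/null || bad_status=$?

# Without spaces, the same text is a variable assignment.
LINUX_TAG="manylinux_"

echo "exit status with spaces: $bad_status"
echo "LINUX_TAG=$LINUX_TAG"
```

Inside a `RUN` step chained with `&&`, the failing form would stop the chain at that point, which is why the lint fix matters even though the typo is small.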

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e0e331fd-7595-43f5-8a5b-e204d42b958c

📥 Commits

Reviewing files that changed from the base of the PR and between 74b959e and 2eafa58.

📒 Files selected for processing (8)

.github/workflows/base.yml
.github/workflows/main.yml
.github/workflows/multi-gpu-e2e.yml
.github/workflows/nightlies.yml
.github/workflows/tests-nightly.yml
.github/workflows/tests.yml
README.md
docker/Dockerfile-uv-base

💤 Files with no reviewable changes (4)

.github/workflows/main.yml
.github/workflows/multi-gpu-e2e.yml
.github/workflows/nightlies.yml
.github/workflows/base.yml

coderabbitai · 2026-03-25T17:53:43Z

 RUN if [ "$TARGETARCH" = "amd64" ]; then \
-        uv pip install --no-build-isolation "causal_conv1d @ git+https://github.com/Dao-AILab/causal-conv1d.git@main"; \
-        uv pip install "mamba_ssm @ git+https://github.com/state-spaces/mamba.git@main"; \
+        MAMBA_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE uv pip install --no-build-isolation mamba_ssm causal_conv1d; \
    fi


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the Dockerfiles
fd -t f "Dockerfile*" docker/

Repository: axolotl-ai-cloud/axolotl

Length of output: 321

🏁 Script executed:

# Read the relevant section of docker/Dockerfile-uv-base
head -60 docker/Dockerfile-uv-base | tail -40

Repository: axolotl-ai-cloud/axolotl

Length of output: 1699

🏁 Script executed:

# Check for any constraint files or requirements files in the repo
fd -t f -e "txt" -e "toml" | grep -E "(constraint|requirement|pyproject)" | head -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 143

🏁 Script executed:

# Look for any pip constraint files
find . -type f \( -name "*constraint*" -o -name "*requirements*" \) 2>/dev/null | head -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 161

🏁 Script executed:

# Check if mamba_ssm and causal_conv1d are in the main requirements files
grep -n "mamba_ssm\|causal_conv1d" requirements.txt requirements-dev.txt pyproject.toml

Repository: axolotl-ai-cloud/axolotl

Length of output: 50

🏁 Script executed:

# Check the full context around the problematic lines in docker/Dockerfile-uv-base
sed -n '1,60p' docker/Dockerfile-uv-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 2151

🏁 Script executed:

# Compare with docker/Dockerfile-base to see if there's a pattern
sed -n '38,60p' docker/Dockerfile-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 1313

🏁 Script executed:

# Check if there's any version pinning strategy documented
cat pyproject.toml | head -50

Repository: axolotl-ai-cloud/axolotl

Length of output: 1332

Pin mamba_ssm and causal_conv1d versions in this stage.

Line 39 installs both packages without version constraints, and neither appears in requirements.txt, requirements-dev.txt, or pyproject.toml. The base image therefore resolves whatever versions are current at build time, which is risky for binary-extension packages that depend on CUDA and Torch. The sibling docker/Dockerfile-base uses git refs to achieve reproducibility; apply a similar approach here by either pinning explicit versions or using git references.
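
To make the distinction concrete, here is a rough sketch of a check that separates pinned from unpinned requirement specs. The `is_pinned` helper is hypothetical and exists only for illustration; it is not part of the repository or of `uv`. `X.Y.Z` is the same placeholder the review comment uses, not a real version. The sketch also treats a ref of `main` or `master` as unpinned, since a branch name moves over time.

```shell
#!/bin/sh
# is_pinned SPEC (hypothetical helper): succeed if SPEC carries an exact
# version (`name==x.y.z`) or a git URL with a trailing `@ref` that is not
# a moving branch such as main/master.
is_pinned() {
    case "$1" in
        *@main|*@master) return 1 ;;   # branch ref: resolves differently over time
        *==*)            return 0 ;;   # exact version pin
        *"@ git+"*@*)    return 0 ;;   # git URL followed by @<commit-or-tag>
        *)               return 1 ;;   # bare package name or loose range
    esac
}

for spec in \
    "mamba_ssm" \
    "mamba_ssm==X.Y.Z" \
    "causal_conv1d @ git+https://github.com/Dao-AILab/causal-conv1d.git@main"
do
    if is_pinned "$spec"; then
        echo "pinned:   $spec"
    else
        echo "unpinned: $spec"
    fi
done
```

A check like this could run as a lint step before the image build, but whether that is worth adding is a project decision the thread does not address.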

coderabbitai · 2026-03-25T17:53:43Z

+        amd64) ARCH_TAG="2_24_x86_64" ;; \
+        arm64) ARCH_TAG="2_34_aarch64" ;; \
        *) echo "Unsupported architecture: $TARGETARCH"; exit 1 ;; \
    esac && \
    WHL_VERSION="v0.7.16" && \
-    WHL_FILE="flash_attn-2.8.3+cu${CUDA}${TORCH_TAG}-${PYTHON_CP}-${PYTHON_CP}-linux_${ARCH_TAG}.whl" && \
+    WHL_FILE="flash_attn-2.8.3+cu${CUDA}${TORCH_TAG}-${PYTHON_CP}-${PYTHON_CP}-${LINUX_TAG}${ARCH_TAG}.whl" && \


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -i dockerfile

Repository: axolotl-ai-cloud/axolotl

Length of output: 387

🏁 Script executed:

head -80 docker/Dockerfile-uv-base | tail -40

Repository: axolotl-ai-cloud/axolotl

Length of output: 1699

🏁 Script executed:

head -80 docker/Dockerfile-base | tail -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 1042

🏁 Script executed:

git log --oneline -20 -- docker/Dockerfile-uv-base docker/Dockerfile-base | head -30

Repository: axolotl-ai-cloud/axolotl

Length of output: 105

🏁 Script executed:

git log -p --all -- docker/Dockerfile-uv-base | grep -A 20 -B 5 "manylinux_2_24" | head -50

Repository: axolotl-ai-cloud/axolotl

Length of output: 50

🏁 Script executed:

grep -n "manylinux\|linux_x86_64\|linux_aarch64" docker/Dockerfile-uv-base docker/Dockerfile-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 130

🏁 Script executed:

# Check if there are any other references to the flash-attn wheel names
rg "flash_attn|flash-attn" docker/ -A 2 -B 2

Repository: axolotl-ai-cloud/axolotl

Length of output: 2324

🏁 Script executed:

# Check for any comments or documentation about the difference
rg -i "uv-base|dockerfile" . --type md -A 5 -B 5 | head -100

Repository: axolotl-ai-cloud/axolotl

Length of output: 50

Align flash-attn wheel naming between docker/Dockerfile-uv-base and docker/Dockerfile-base.

Both files download from the same v0.7.16 release, but construct different wheel filenames for the same architecture. docker/Dockerfile-uv-base:49-54 uses manylinux_2_24_x86_64 / manylinux_2_34_aarch64, while docker/Dockerfile-base:68-76 uses linux_x86_64 / linux_aarch64. This inconsistency will cause the base images to attempt downloading different wheels, risking divergent installations or failures if one naming pattern is not available in the release.
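
The two naming schemes can be compared side by side with a small sketch. The `wheel_name` function and its `scheme` argument are hypothetical and used only for illustration. The platform tags and the `flash_attn-2.8.3+cu…` filename pattern are taken from the diff and the comment above. The fallback values for `CUDA`, `TORCH_TAG`, and `PYTHON_CP` are visible placeholders, not real build arguments.

```shell
#!/bin/sh
# wheel_name TARGETARCH SCHEME (hypothetical helper): print the flash-attn
# wheel filename for one architecture under one naming scheme.
#   SCHEME=manylinux -> tags described for docker/Dockerfile-uv-base
#   SCHEME=linux     -> tags described for docker/Dockerfile-base
# CUDA, TORCH_TAG and PYTHON_CP are read from the environment; the
# fallbacks below are placeholders so the output is obviously not real.
wheel_name() {
    arch=$1
    scheme=$2
    case "$scheme:$arch" in
        manylinux:amd64) platform="manylinux_2_24_x86_64" ;;
        manylinux:arm64) platform="manylinux_2_34_aarch64" ;;
        linux:amd64)     platform="linux_x86_64" ;;
        linux:arm64)     platform="linux_aarch64" ;;
        *) echo "Unsupported scheme/architecture: $scheme/$arch" >&2; return 1 ;;
    esac
    cp_tag=${PYTHON_CP:-cpXY}
    echo "flash_attn-2.8.3+cu${CUDA:-NNN}${TORCH_TAG:-TORCHTAG}-${cp_tag}-${cp_tag}-${platform}.whl"
}

# Print both candidate filenames for each supported architecture.
for arch in amd64 arm64; do
    wheel_name "$arch" manylinux
    wheel_name "$arch" linux
done
```

Printing both candidates makes it straightforward to compare them against the asset list of the release before deciding which tag format each Dockerfile should use.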

codecov · 2026-03-25T17:59:34Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

deprecate torch 2.8.0 support

2eafa58

coderabbitai Bot reviewed Mar 25, 2026

winglian added 2 commits March 25, 2026 14:47

shell lint

793aa6f

odd naming of manylinux wheels for x86

b95161e

winglian merged commit 99bde01 into main Mar 25, 2026
33 of 37 checks passed

winglian deleted the deprecate-torch280 branch March 25, 2026 22:22

This was referenced Apr 24, 2026

fix: docker build failing #3622

Merged

fix: pin torchvision per matrix entry to prevent ABI drift #3631

Open
