deprecate torch 2.8.0 support #3550

Merged: winglian merged 3 commits into main from deprecate-torch280 on Mar 25, 2026

@winglian (Collaborator) commented Mar 25, 2026

Summary by CodeRabbit

  • Chores

    • Removed PyTorch 2.8.0 from build and test pipelines; updated to support PyTorch 2.9.1+ only.
    • Updated dependency installation methods for improved compatibility across architectures.
    • Streamlined CI/CD test matrices to focus on currently supported configurations.
  • Documentation

    • Updated minimum PyTorch requirement from 2.8.0 to 2.9.1 in project requirements.

@coderabbitai (Contributor, bot) commented Mar 25, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b1e1aa02-c704-4ee7-a940-7ea81f37c165

Walkthrough

This PR removes PyTorch 2.8.0 + CUDA 12.8.1 matrix entries from multiple CI workflow files, updates the minimum supported PyTorch version to 2.9.1 in the README, and modifies Docker base image dependency installation for causal-conv1d and mamba-ssm packages along with flash-attn wheel handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI Workflow Matrix Cleanup: .github/workflows/base.yml, main.yml, multi-gpu-e2e.yml, nightlies.yml, tests-nightly.yml, tests.yml | Removed PyTorch 2.8.0 + CUDA 12.8.1 matrix entries from build and test jobs, reducing the number of Docker image variants and test configurations. |
| Docker Base Image Updates: docker/Dockerfile-uv-base | Modified conditional amd64-only dependency installation to skip CUDA builds for causal-conv1d and mamba-ssm, and updated flash-attn wheel filename construction with versioned ARCH_TAG values (2_24_x86_64, 2_34_aarch64) and added LINUX_TAG component. |
| Documentation: README.md | Updated minimum required PyTorch version from ≥2.8.0 to ≥2.9.1 in quick start requirements. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

  • #3224: Modifies flash-attn installation and wheel handling logic in docker/Dockerfile-uv-base
  • #3450: Updates CI test matrices to change supported PyTorch and Python versions
  • #3034: Directly related to PyTorch 2.8.0 + CUDA 12.8.1 matrix configuration changes in the same workflow files

Suggested labels

ready to merge

Suggested reviewers

  • SalmanMohammadi
  • djsaunde
  • NanoCode012

Pre-merge checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'deprecate torch 2.8.0 support' accurately summarizes the main change: removing PyTorch 2.8.0 configurations from CI/CD workflows and updating the minimum supported version to 2.9.1 in documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@github-actions (Contributor, bot) commented Mar 25, 2026

📖 Documentation Preview: https://69c443c7836c0e796916f9c5--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit b95161e

@coderabbitai (Contributor, bot) left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker/Dockerfile-uv-base`:
- Line 46: The shell assignment for LINUX_TAG is invalid because it uses spaces
around '='; update the assignment for the LINUX_TAG variable (LINUX_TAG) to a
POSIX-compliant form with no spaces (e.g., LINUX_TAG="manylinux_") so the RUN
step treats it as a variable assignment like the other variables in the file.
- Around line 38-40: The RUN step in docker/Dockerfile-uv-base installs
mamba_ssm and causal_conv1d without pins; update that RUN line to install
reproducible refs (either explicit version pins like mamba_ssm==X.Y.Z and
causal_conv1d==A.B.C or git+https://...@<commit-or-tag> refs) while retaining
the MAMBA_SKIP_CUDA_BUILD and CAUSAL_CONV1D_SKIP_CUDA_BUILD env vars and the uv
pip install invocation; ensure the chosen pins/refs match the versions used in
docker/Dockerfile-base or your pyproject/requirements and include both package
identifiers (mamba_ssm, causal_conv1d) so builds are deterministic.
- Around line 49-54: The WHL_FILE construction in docker/Dockerfile-uv-base uses
manylinux_2_24/2_34 via ARCH_TAG/ LINUX_TAG causing a mismatch with
docker/Dockerfile-base which uses linux_x86_64/linux_aarch64; update
docker/Dockerfile-uv-base so its wheel filename construction (WHL_FILE) uses the
same platform tag format as docker/Dockerfile-base (i.e. use linux_x86_64 or
linux_aarch64) by changing the ARCH_TAG/ LINUX_TAG assignment logic tied to
TARGETARCH or by altering WHL_FILE to reference the same LINUX_TAG values used
in docker/Dockerfile-base; ensure WHL_VERSION remains v0.7.16 and preserve the
rest of the wheel name format.
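
The first finding above (spaces around `=` in the `LINUX_TAG` assignment) can be reproduced outside the Dockerfile. The following is a minimal sketch, assuming a POSIX shell and that no command named `LINUX_TAG` exists on the `PATH`; the `spaced_status` variable is introduced here only for illustration.

```shell
#!/bin/sh
# Correct: no spaces around '=', so the shell performs a variable assignment.
LINUX_TAG="manylinux_"
echo "assigned: ${LINUX_TAG}"

# Incorrect: with spaces, the shell looks for a command named LINUX_TAG and
# passes '=' and 'manylinux_' to it as arguments. The subshell keeps the
# failure from aborting this script; stderr is silenced for readability.
if (LINUX_TAG = "manylinux_") 2>/dev/null; then
    spaced_status="ok"
else
    spaced_status="failed"
fi
echo "spaced form: ${spaced_status}"
```

Inside a `RUN` step chained with `&&`, a failing "command" like this would typically stop the chain, which is why the review asks for the unspaced form.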

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e0e331fd-7595-43f5-8a5b-e204d42b958c

📥 Commits

Reviewing files that changed from the base of the PR and between 74b959e and 2eafa58.

📒 Files selected for processing (8)
  • .github/workflows/base.yml
  • .github/workflows/main.yml
  • .github/workflows/multi-gpu-e2e.yml
  • .github/workflows/nightlies.yml
  • .github/workflows/tests-nightly.yml
  • .github/workflows/tests.yml
  • README.md
  • docker/Dockerfile-uv-base
💤 Files with no reviewable changes (4)
  • .github/workflows/main.yml
  • .github/workflows/multi-gpu-e2e.yml
  • .github/workflows/nightlies.yml
  • .github/workflows/base.yml

Comment thread docker/Dockerfile-uv-base
Comment on lines 38 to 40
  RUN if [ "$TARGETARCH" = "amd64" ]; then \
-         uv pip install --no-build-isolation "causal_conv1d @ git+https://github.com/Dao-AILab/causal-conv1d.git@main"; \
-         uv pip install "mamba_ssm @ git+https://github.com/state-spaces/mamba.git@main"; \
+         MAMBA_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE uv pip install --no-build-isolation mamba_ssm causal_conv1d; \
      fi

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the Dockerfiles
fd -t f "Dockerfile*" docker/

Repository: axolotl-ai-cloud/axolotl

Length of output: 321


🏁 Script executed:

# Read the relevant section of docker/Dockerfile-uv-base
head -60 docker/Dockerfile-uv-base | tail -40

Repository: axolotl-ai-cloud/axolotl

Length of output: 1699


🏁 Script executed:

# Check for any constraint files or requirements files in the repo
fd -t f -e "txt" -e "toml" | grep -E "(constraint|requirement|pyproject)" | head -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 143


🏁 Script executed:

# Look for any pip constraint files
find . -type f \( -name "*constraint*" -o -name "*requirements*" \) 2>/dev/null | head -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 161


🏁 Script executed:

# Check if mamba_ssm and causal_conv1d are in the main requirements files
grep -n "mamba_ssm\|causal_conv1d" requirements.txt requirements-dev.txt pyproject.toml

Repository: axolotl-ai-cloud/axolotl

Length of output: 50


🏁 Script executed:

# Check the full context around the problematic lines in docker/Dockerfile-uv-base
sed -n '1,60p' docker/Dockerfile-uv-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 2151


🏁 Script executed:

# Compare with docker/Dockerfile-base to see if there's a pattern
sed -n '38,60p' docker/Dockerfile-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 1313


🏁 Script executed:

# Check if there's any version pinning strategy documented
cat pyproject.toml | head -50

Repository: axolotl-ai-cloud/axolotl

Length of output: 1332


Pin mamba_ssm and causal_conv1d versions in this stage.

Line 39 installs both packages without version constraints—they are not present in requirements.txt, requirements-dev.txt, or pyproject.toml. This makes the base image resolve whatever versions are current at build time, which is risky for CUDA/Torch binary-extension dependencies. The sibling docker/Dockerfile-base uses git refs to achieve reproducibility; apply a similar approach here by either pinning explicit versions or using git references.
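
One way to express the suggested pinning is sketched below, assuming the git-reference route. The two repository URLs are the ones from the removed lines of this diff; the `*_REF` values are hypothetical placeholders (this thread does not state which commits or tags to use), and `pinned_req` is an illustrative helper, not something in the repository. The script only prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with commit SHAs or tags you have verified.
CAUSAL_CONV1D_REF="<commit-or-tag>"
MAMBA_SSM_REF="<commit-or-tag>"

# Build a PEP 508 direct reference of the form "name @ git+URL@ref".
pinned_req() {
    printf '%s @ git+%s@%s' "$1" "$2" "$3"
}

CAUSAL_REQ="$(pinned_req causal_conv1d https://github.com/Dao-AILab/causal-conv1d.git "$CAUSAL_CONV1D_REF")"
MAMBA_REQ="$(pinned_req mamba_ssm https://github.com/state-spaces/mamba.git "$MAMBA_SSM_REF")"

# Print the install command rather than executing it, since the refs are placeholders.
echo "MAMBA_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE uv pip install --no-build-isolation \"$MAMBA_REQ\" \"$CAUSAL_REQ\""
```

Pinning to a commit SHA is generally more reproducible than pinning to a branch or a movable tag, at the cost of needing a manual bump when upstream fixes are wanted.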

Comment thread docker/Dockerfile-uv-base Outdated
Comment thread docker/Dockerfile-uv-base Outdated
Comment on lines +49 to +54
+         amd64) ARCH_TAG="2_24_x86_64" ;; \
+         arm64) ARCH_TAG="2_34_aarch64" ;; \
          *) echo "Unsupported architecture: $TARGETARCH"; exit 1 ;; \
      esac && \
      WHL_VERSION="v0.7.16" && \
-     WHL_FILE="flash_attn-2.8.3+cu${CUDA}${TORCH_TAG}-${PYTHON_CP}-${PYTHON_CP}-linux_${ARCH_TAG}.whl" && \
+     WHL_FILE="flash_attn-2.8.3+cu${CUDA}${TORCH_TAG}-${PYTHON_CP}-${PYTHON_CP}-${LINUX_TAG}${ARCH_TAG}.whl" && \

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -i dockerfile

Repository: axolotl-ai-cloud/axolotl

Length of output: 387


🏁 Script executed:

head -80 docker/Dockerfile-uv-base | tail -40

Repository: axolotl-ai-cloud/axolotl

Length of output: 1699


🏁 Script executed:

head -80 docker/Dockerfile-base | tail -20

Repository: axolotl-ai-cloud/axolotl

Length of output: 1042


🏁 Script executed:

git log --oneline -20 -- docker/Dockerfile-uv-base docker/Dockerfile-base | head -30

Repository: axolotl-ai-cloud/axolotl

Length of output: 105


🏁 Script executed:

git log -p --all -- docker/Dockerfile-uv-base | grep -A 20 -B 5 "manylinux_2_24" | head -50

Repository: axolotl-ai-cloud/axolotl

Length of output: 50


🏁 Script executed:

grep -n "manylinux\|linux_x86_64\|linux_aarch64" docker/Dockerfile-uv-base docker/Dockerfile-base

Repository: axolotl-ai-cloud/axolotl

Length of output: 130


🏁 Script executed:

# Check if there are any other references to the flash-attn wheel names
rg "flash_attn|flash-attn" docker/ -A 2 -B 2

Repository: axolotl-ai-cloud/axolotl

Length of output: 2324


🏁 Script executed:

# Check for any comments or documentation about the difference
rg -i "uv-base|dockerfile" . --type md -A 5 -B 5 | head -100

Repository: axolotl-ai-cloud/axolotl

Length of output: 50


Align flash-attn wheel naming between docker/Dockerfile-uv-base and docker/Dockerfile-base.

Both files download from the same v0.7.16 release, but construct different wheel filenames for the same architecture. docker/Dockerfile-uv-base:49-54 uses manylinux_2_24_x86_64 / manylinux_2_34_aarch64, while docker/Dockerfile-base:68-76 uses linux_x86_64 / linux_aarch64. This inconsistency will cause the base images to attempt downloading different wheels, risking divergent installations or failures if one naming pattern is not available in the release.
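
To make the mismatch concrete, the sketch below builds the wheel filename under both platform-tag conventions for one architecture. The filename template and the `2_24` / `2_34` values come from the diff above; the values of `CUDA`, `TORCH_TAG`, and `PYTHON_CP` are hypothetical examples (in the Dockerfile they are derived from build arguments not shown in this thread), and `MACHINE`, `GLIBC`, and `PREFIX` are names introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical example values; the real ones are set earlier in the Dockerfile.
CUDA="128"
TORCH_TAG="torch2.9"
PYTHON_CP="cp311"
TARGETARCH="amd64"

# Map Docker's TARGETARCH to the machine name and glibc level used in the tags.
case "$TARGETARCH" in
    amd64) MACHINE="x86_64";  GLIBC="2_24" ;;
    arm64) MACHINE="aarch64"; GLIBC="2_34" ;;
    *) echo "Unsupported architecture: $TARGETARCH" >&2; exit 1 ;;
esac

PREFIX="flash_attn-2.8.3+cu${CUDA}${TORCH_TAG}-${PYTHON_CP}-${PYTHON_CP}"

# Plain "linux_<machine>" tag, which the review reports docker/Dockerfile-base uses.
WHL_LINUX="${PREFIX}-linux_${MACHINE}.whl"
# "manylinux_<glibc>_<machine>" tag, as in the updated docker/Dockerfile-uv-base.
WHL_MANYLINUX="${PREFIX}-manylinux_${GLIBC}_${MACHINE}.whl"

echo "$WHL_LINUX"
echo "$WHL_MANYLINUX"
```

The two names differ only in the final platform tag, so whether either download succeeds depends on which filenames the v0.7.16 release actually publishes; that is worth checking against the release assets before changing either Dockerfile.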

@codecov (bot) commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@winglian merged commit 99bde01 into main on Mar 25, 2026
33 of 37 checks passed
@winglian deleted the deprecate-torch280 branch on March 25, 2026 22:22