Skip to content

Add support for PyTorch 2.8.0#3106

Merged
winglian merged 6 commits into
mainfrom
ci280
Aug 28, 2025
Merged

Add support for PyTorch 2.8.0#3106
winglian merged 6 commits into
mainfrom
ci280

Conversation

@winglian
Copy link
Copy Markdown
Collaborator

@winglian winglian commented Aug 26, 2025

Description

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

  • Tests
    • Expanded test matrices to include PyTorch 2.8.0 and added GPU end-to-end variants for CUDA 12.8.1 with Python 3.11.
  • Chores
    • Added CUDA 12.8 (128) build variants across CI build jobs (additive); final variant marked as latest where applicable.
    • Relaxed triton version constraint in requirements to allow newer 3.4+ releases.
    • Adjusted CI setup logic to recognize Torch 2.8+ (no behavioral changes).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 26, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbit review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Adds CUDA 12.8.1 / PyTorch 2.8.0 / Python 3.11 matrix variants to CI build and test workflows; relaxes the triton upper bound in requirements.txt; introduces a no-op Torch >=2.8 branch in setup.py preceding existing Torch-version logic.

Changes

Cohort / File(s) Summary of changes
Build matrix updates
.github/workflows/main.yml
Appended new matrix.include entries for CUDA 128 (cuda_version: 12.8.1, python: 3.11, pytorch: 2.8.0) to jobs build-axolotl, build-axolotl-cloud, build-axolotl-cloud-no-tmux. New entries set axolotl_extras:; the no-tmux variant also includes is_latest:. No existing entries removed.
Test matrix updates
.github/workflows/tests.yml
Added 2.8.0 to matrix.pytorch_version for pytest and pytest-sdist. Added a docker-e2e-tests matrix.include entry (and mirrored docker-e2e-tests-1st) for cuda: 128, cuda_version: 12.8.1, python_version: "3.11", pytorch: 2.8.0, num_gpus: 1, axolotl_extras:.
Dependency loosening
requirements.txt
Relaxed triton constraint from triton>=3.0.0,<3.4.0 to triton>=3.0.0, removing the upper bound.
Torch-version branch logic
setup.py
Added a no-op conditional branch for Torch >= 2.8 (if (major, minor) >= (2, 8): pass) placed before the existing 2.7+ branch; existing 2.7+ and earlier version logic retained.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • djsaunde
  • NanoCode012
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ci280

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
.github/workflows/main.yml (3)

118-122: Cloud image: add 2.8.0 entry — consistent with other jobs. Consider latest-tag decision.

Parity with the build job looks good. As above, decide whether 2.8.0 should carry is_latest: true for the cloud image to avoid ambiguity around which CUDA 128 tag is considered latest.

Optional tweak:

           pytorch: 2.8.0
           axolotl_extras:
+          is_latest: true

182-187: Avoid bare is_latest: with no value (can be ambiguous in GH expressions).

is_latest: without a value renders as empty string/null. While it will evaluate falsy in the ${{ ... }} expression, being explicit improves readability and prevents surprises if the expression ever changes.

Use false or omit the key:

           pytorch: 2.8.0
           axolotl_extras:
-          is_latest:
+          is_latest: false

39-43: Add 2.8.0 CUDA 12.8.1 variant – consider marking as latest

  • The new matrix entry is correct and will build the image for Python 3.11 + CUDA 128 + PyTorch 2.8.0.

  • To have this build published under the -latest tag, add is_latest: true to the variant:

              pytorch: 2.8.0
              axolotl_extras:
    +         is_latest: true
  • Before merging, ensure that the corresponding base image tag has been published. GitHub Actions will override
    ARG CUDA and ARG PYTORCH_VERSION via:

    build-args:
      BASE_TAG=${{ github.ref_type == 'tag' && 'main' || github.ref_name }}-base-py${{ matrix.python_version }}-cu${{ matrix.cuda }}-${{ matrix.pytorch }}
      CUDA=${{ matrix.cuda }}
      PYTORCH_VERSION=${{ matrix.pytorch }}

    Verify the base image exists, for example:

    # Replace <branch> with your branch or tag name (e.g. main)
    curl -f -I https://hub.docker.com/v2/repositories/axolotlai/axolotl-base/tags/<branch>-base-py3.11-cu128-2.8.0
    # or via Docker CLI
    docker pull axolotlai/axolotl-base:<branch>-base-py3.11-cu128-2.8.0

    This confirms that axolotlai/axolotl-base:…-cu128-2.8.0 is available before the build runs.

.github/workflows/tests.yml (2)

133-133: SDist matrix: mirrored 2.8.0 — consistent with wheels path.

Looks good and mirrors the wheel-based job. Same note about torchvision compatibility applies.

If you observe resolver downgrades in this job, keep the single-transaction install but add a constraint file with a torch↔torchvision pin for 2.8.0 only. I can draft one if needed.


301-306: E2E matrix: Adds CUDA 128 + Torch 2.8.0 — consider promoting a 2.8.0 case into the “gate” job.

Great to have full E2E for 2.8.0. To fail fast on core breakages for the new stack, consider adding one 2.8.0 variant to docker-e2e-tests-1st (your gate job) alongside existing entries. This can save GPU minutes when 2.8.0-specific regressions occur.

Proposed YAML to add under docker-e2e-tests-1st.strategy.matrix.include (outside the changed hunk):

- cuda: 128
  cuda_version: 12.8.1
  python_version: "3.11"
  pytorch: 2.8.0
  num_gpus: 1
  axolotl_extras:

Please also confirm Modal base images for CUDA 12.8.1 are available on your runners and that your Dockerfiles reference a cu128-compatible base.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e1131e9 and 1c0f515.

📒 Files selected for processing (2)
  • .github/workflows/main.yml (3 hunks)
  • .github/workflows/tests.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.6.0)
🔇 Additional comments (1)
.github/workflows/tests.yml (1)

58-58: Verify torchvision compatibility for Torch 2.8.0

  • File: .github/workflows/tests.yml (line 58)
    Ensure the pip install torch==${{ matrix.pytorch_version }} torchvision step picks a matching torchvision for PyTorch 2.8.0 on the ubuntu-latest runner.

To confirm the resolver behavior and exact torchvision version selected, run one of the following locally:

# List available torchvision releases and ensure there’s a 2.8.0-compatible build
pip index versions torchvision
# Install torch 2.8.0 alongside torchvision and check which torchvision was chosen
pip install torch==2.8.0 torchvision && pip freeze | grep torchvision

Verify that the resolved torchvision version aligns with PyTorch’s official compatibility matrix for 2.8.0.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1c0f515 and 648031e.

📒 Files selected for processing (1)
  • requirements.txt (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, vllm, 2, true)

Comment thread requirements.txt
Comment on lines 5 to +6
# triton 3.4.0 is not compatible with CCE
triton>=3.0.0,<3.4.0
triton>=3.0.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Removing Triton upper bound contradicts the CCE warning and may break CCE users.

Line 5 states “triton 3.4.0 is not compatible with CCE”, but Line 6 removes the <3.4.0 cap. This allows pip to resolve to 3.4.x+ globally, which could reintroduce the CCE incompatibility. If PyTorch 2.8.0 truly requires Triton ≥3.4, we should scope that to the PyTorch 2.8/CUDA 12.8 CI variants via constraints rather than relaxing it for all users.

Two safe options; pick one:

  • Option A (conservative): keep the cap by default and override in CI for PT 2.8 only via constraints.
  • Option B (if CCE is now known-good with ≥3.4): keep the relaxed spec but fix the comment and add a mitigation note.

Proposed diffs:

Option A – restore cap and restrict mac installs with markers (safer default):

-# triton 3.4.0 is not compatible with CCE
-triton>=3.0.0
+# triton 3.4.0 is not compatible with CCE; keep <3.4.0 by default and override via constraints for PT 2.8 builds
+triton>=3.0.0,<3.4.0; sys_platform != "darwin" and platform_machine == "x86_64"
+# For PT 2.8 / CUDA 12.8 jobs, allow newer Triton via a constraints file in CI.

Option B – keep relaxed bound but correct the comment and document the risk:

-# triton 3.4.0 is not compatible with CCE
-triton>=3.0.0
+# NOTE: Triton ≥3.4 previously had CCE issues. If you are on CCE, pin triton<3.4.0 via a constraints file until verified.
+triton>=3.0.0

Follow-ups:

  • If you choose Option A, add a constraints file (e.g., constraints-pt28.txt) used only in the PT 2.8 / CUDA 12.8 CI jobs that loosens Triton to the vetted range.
  • If you choose Option B, please link evidence that CCE incompat is resolved for ≥3.4 so we can safely remove the warning.

Run this repo-scoped check to see where Triton is referenced and whether CI already uses constraints/overrides:


🏁 Script executed:

#!/bin/bash
# Find Triton refs and any constraints usage
rg -n -C2 -i '\btriton\b|constraints.*\.txt|pip install .* -c ' .github scripts requirements* setup* docker* || true

# Surface PT 2.8 entries in workflows to target constraints there
rg -n -C2 -i 'pytorch[^0-9]*2\.8\.0|cuda[^0-9]*12\.8|128\b' .github/workflows || true

# Search for any CCE references to validate the warning context
rg -n -C2 -i '\bCCE\b' || true

Length of output: 8488


Require Triton <3.4.0 by default and scope any bump to CI-only

No existing constraints file (e.g. constraints-pt28.txt) or pip install -c … override is configured in the PyTorch 2.8 / CUDA 12.8 workflows, so the removal of the <3.4.0 cap contradicts the CCE warning and risks breaking CCE users. Please apply Option A:

• In requirements.txt, restore the Triton upper bound and include an OS marker:

- # triton 3.4.0 is not compatible with CCE
- triton>=3.0.0
+ # triton 3.4.0 is not compatible with CCE; keep <3.4.0 by default and override via constraints for PT 2.8 CI
+ triton>=3.0.0,<3.4.0; sys_platform != "darwin" and platform_machine == "x86_64"

• Add a new constraints-pt28.txt at the repo root to allow Triton ≥3.4.0 in CI:

triton>=3.0.0

• In each PyTorch 2.8 / CUDA 12.8 job in .github/workflows/*.yml, install with that constraints file:

run: pip install -r requirements.txt -c constraints-pt28.txt

This preserves CCE compatibility for end users while permitting Triton ≥3.4.0 in controlled CI builds.

🤖 Prompt for AI Agents
In requirements.txt around lines 5-6, the Triton upper bound was removed which
breaks CCE users; restore the Triton constraint to require triton>=3.0.0,<3.4.0
and add the appropriate OS marker (e.g. ; platform_system != "Windows" or the
project-specific marker) so end users keep CCE-compatible versions. Add a new
file constraints-pt28.txt at the repo root containing a relaxed constraint
(triton>=3.0.0) to allow CI to test newer Triton, and update each PyTorch 2.8 /
CUDA 12.8 job in .github/workflows/*.yml to install with pip install -r
requirements.txt -c constraints-pt28.txt so CI can opt into Triton ≥3.4.0 while
keeping the default user install capped.

@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
setup.py (6)

67-69: Torch>=2.8 branch is a no-op; explicitly handle xformers/vLLM to avoid stale/incompatible pins

Leaving this branch as pass preserves whatever xformers and vllm pins come from requirements/extras. That can easily drift out of compatibility with Torch 2.8 and CUDA 12.8. At minimum, mirror the 2.7 logic for vLLM (relax pin) and proactively avoid stale xformers pins for 2.8.

Apply this diff to make the 2.8 branch explicit and safer:

-            if (major, minor) >= (2, 8):
-                pass
+            if (major, minor) >= (2, 8):
+                # Torch 2.8+: avoid stale xformers pins; relax vLLM
+                try:
+                    _install_requires.pop(_install_requires.index(xformers_version))
+                except ValueError:
+                    pass  # xformers not pinned in requirements.txt
+                # Keep vLLM opt-in but avoid over-restricting to a specific patch
+                extras_require_map["vllm"] = ["vllm>=0.10.0"]

Would you like me to open a follow-up PR to run the new 2.8 matrix against: (a) install from sdist and (b) extras "vllm" + "flash-attn" to ensure solver stability and import sanity?


45-47: Remove debug print from setup to avoid noisy installs

Printing the entire install_requires during setup is noisy and leaks environment details in user installs/CI logs.

Apply:

-            print(
-                _install_requires, [req in skip_packages for req in _install_requires]
-            )

54-56: Hard default to Torch 2.6.0 can surprise in fresh envs; allow an override

When Torch isn’t preinstalled, you force torch==2.6.0. On a PR explicitly adding 2.8 support, this default may be counterintuitive. Make it overrideable via env var so CI can set a newer default without code changes.

Apply:

-                torch_version = "2.6.0"  # default to torch 2.6
+                # default Torch fallback (override in CI with AXOLOTL_TORCH_FALLBACK)
+                torch_version = os.getenv("AXOLOTL_TORCH_FALLBACK", "2.6.0")

Ensure your CI matrices set AXOLOTL_TORCH_FALLBACK=2.8.0 in the 2.8 jobs if you rely on this path.


28-39: Guard against missing packages when parsing requirements

[...][0] will raise IndexError if xformers/autoawq are removed from requirements in the future. Given the 2.8 branch may drop xformers, make the detection resilient.

Apply:

-        xformers_version = [req for req in _install_requires if "xformers" in req][0]
-        autoawq_version = [req for req in _install_requires if "autoawq" in req][0]
+        def find_req(reqs, name):
+            for req in reqs:
+                if re.split(r"[>=<]", req)[0].strip() == name:
+                    return req
+            return None
+        xformers_version = find_req(_install_requires, "xformers")
+        autoawq_version = find_req(_install_requires, "autoawq")

Then guard subsequent pops:

-                _install_requires.pop(_install_requires.index(xformers_version))
+                if xformers_version:
+                    _install_requires.pop(_install_requires.index(xformers_version))
...
-                _install_requires.pop(_install_requires.index(autoawq_version))
+                if autoawq_version:
+                    _install_requires.pop(_install_requires.index(autoawq_version))

57-66: Prefer packaging.version for version parsing/compare

Manual regex + tuple comparison works but is brittle with local version segments (e.g., 2.8.0+cu128). packaging.version.Version handles this robustly and self-documents the intent.

Example:

-            version_match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", torch_version)
-            if version_match:
-                major, minor, patch = version_match.groups()
-                major, minor = int(major), int(minor)
-                patch = (
-                    int(patch) if patch is not None else 0
-                )  # Default patch to 0 if not present
-            else:
-                raise ValueError("Invalid version format")
+            from packaging.version import Version
+            v = Version(torch_version)
+            major, minor, patch = v.major, v.minor, v.micro

122-129: Confirm wheel availability for flash-attn 2.8.3 and DeepSpeed 0.17.2 under PyTorch 2.8/CUDA 12.8

Our inspection of PyPI metadata shows that neither flash-attn 2.8.3 nor DeepSpeed 0.17.2 publish prebuilt wheels for Python 3.11 or CUDA 12.x. In a PT 2.8/CUDA 12.8 environment, pip install will fall back to source builds (or fail), leading to longer installs or potential errors.

• File: setup.py, lines 122–129 (flash-attn extras)
• File: setup.py, lines 130–136 (DeepSpeed extras)

Recommendations:

  • Gate these extras by environment markers to skip them on unsupported platforms (e.g. ; platform_system=="Linux" and cuda_version<12.8)
  • Widen version pins to pull in future releases that include CUDA 12.8 wheels (e.g. "flash-attn>=2.8.3,<2.9") once available
  • Document that unsupported envs require manual source compilation or a matching CUDA toolkit
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 648031e and 5594e7d.

📒 Files selected for processing (1)
  • setup.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, vllm, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.6.0)
🔇 Additional comments (2)
setup.py (2)

159-161: Relax vLLM version pin to >=0.10.0

The default extras currently lock vLLM to an exact version (0.10.0), which can block users upgrading to PyTorch 2.8.0 with CUDA 12.8. Since the 2.7.x branch already loosened this pin, let’s align the default extras here as well.

• File: setup.py
• Lines: 159–161
• Apply:

-    "vllm": [
-        "vllm==0.10.0",
-    ],
+    "vllm": [
+        "vllm>=0.10.0",
+    ],

Please verify against the official vLLM documentation or release notes that all vLLM versions ≥0.10.0 support PyTorch 2.8.0 and CUDA 12.8 before merging.


152-155: Ignore typo suggestion: torch-optimi is a valid package
The package torch-optimi (PyPI project “optimi”) exists with version 0.2.1 and provides the intended low-precision PyTorch optimizers (pypi.org). No change to setup.py is necessary.

fail-fast: false
matrix:
python_version: ["3.11"]
pytorch_version: ["2.6.0", "2.7.0", "2.7.1"]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to drop 2.7.0 in favor of 2.7.1?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we had originally kept 2.7.0 due to vllm. let me double check and you're most likely spot on that we can drop 2.7.0

Comment thread requirements.txt
bitsandbytes==0.47.0
# triton 3.4.0 is not compatible with CCE
triton>=3.0.0,<3.4.0
triton>=3.0.0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't torch 2.8.0 require triton 3.4.0, and this ends up with an incompatibility? I think we need to fix our CCE, right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this gets addressed with axolotl-ai-cloud/ml-cross-entropy#19

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Aug 27, 2025

📖 Documentation Preview: https://68af42022daf05688d3f48f5--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit 22c3eb4

Comment thread requirements.txt Outdated
Co-authored-by: NanoCode012 <nano@axolotl.ai>
@winglian winglian merged commit 6afba38 into main Aug 28, 2025
1 check passed
@winglian winglian deleted the ci280 branch August 28, 2025 13:10
@coderabbitai coderabbitai Bot mentioned this pull request Oct 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants