Feat: add qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid, and upgraded all cce patches#3178

Merged: NanoCode012 merged 2 commits into main from feat/up-cce-4-56 on Sep 26, 2025

@NanoCode012 NanoCode012 commented Sep 24, 2025

Collaborator

Description

  • Added qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid patches
  • Upgraded all our patches to transformers v4.56.2
  • Linted with Ruff

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

  • Documentation

    • Updated installation instructions to pin ml-cross-entropy to a newer commit.
    • Expanded Supported Models list with additions such as apertus, glm4v, glm4v_moe, granitemoeshared, granitemoehybrid, qwen3_vl, qwen3_vl_moe, and broader families (deepseek v3, gemma variants, llama4/llama4_text, mistral variants, phi, qwen2_vl, etc.).
    • Aligned installation guidance text across components.
  • Chores

    • Synchronized the pinned commit hash used in scripts and messages for consistent installation output.

@coderabbitai

coderabbitai Bot commented Sep 24, 2025

Contributor
Walkthrough

Updates the pinned git commit for ml-cut-cross-entropy installation from c5aa3ef to 147ea28 across notebook, script, and init message. The README also updates the install hash and expands the Supported Models list with additional entries.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Install pin update: examples/colab-notebooks/colab-axolotl-example.ipynb, scripts/cutcrossentropy_install.py, src/axolotl/integrations/cut_cross_entropy/__init__.py | Replace ml-cut-cross-entropy git commit hash c5aa3ef with 147ea28 in install commands/messages. |
| CCE README updates: src/axolotl/integrations/cut_cross_entropy/README.md | Update install command to hash 147ea28 and append multiple models to Supported Models list. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

ready to merge

Suggested reviewers

  • winglian
  • SalmanMohammadi

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
| Title Check | ✅ Passed | The title clearly identifies the primary changes (adding qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid and upgrading CCE patches) and matches the PR objectives. The diffs show README entries for those models and install/commit updates consistent with an upgrade, so the title is specific and relevant to the changeset. |

@github-actions

Contributor

📖 Documentation Preview: https://68d37e3c230072294bb4a8d9--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit 0b0f37c

@codecov

codecov Bot commented Sep 24, 2025


Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (5)
src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

36-39: Avoid future drift: centralize the pinned CCE revision.

The SHA appears in multiple places (here, install script, README). Consider centralizing it (e.g., a small internal constant module or config) and referencing it to prevent future mismatches.
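
A minimal sketch of what such centralization could look like, assuming a small helper module. The module location and the names `CCE_REPO`, `CCE_COMMIT`, `cce_requirement`, and `cce_install_command` are hypothetical and not part of this PR; only the repository URL, package name, and commit hash are taken from the thread.

```python
# Hypothetical helper module (name and location are illustrative only),
# e.g. a private module next to the cut_cross_entropy integration.

CCE_REPO = "https://github.com/axolotl-ai-cloud/ml-cross-entropy.git"
CCE_COMMIT = "147ea28"  # pinned revision introduced by this PR


def cce_requirement(commit: str = CCE_COMMIT) -> str:
    """Return the direct-reference requirement string for the pinned fork."""
    return f"cut-cross-entropy[transformers] @ git+{CCE_REPO}@{commit}"


def cce_install_command(installer: str = "pip") -> str:
    """Return a shell command that installs the pinned fork."""
    return f'{installer} install "{cce_requirement()}"'
```

With something like this in place, the install script and the `__init__.py` message could both build their text from `cce_install_command()`, so a future pin bump would touch one constant. The README and notebook would still need a manual update or a consistency check.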

scripts/cutcrossentropy_install.py (2)

32-32: Pin bump acknowledged; consider env-agnostic invocation.

String looks correct. Optionally prefer python -m pip (and uv pip) consistently to avoid PATH/env issues.

Apply within this line if desired:

-+ f'{UV_PREFIX}pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"'
++ f'{UV_PREFIX}python -m pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"'

Note: When USE_UV is true, this becomes: uv python -m pip install ... which is also supported by uv.


23-33: Uninstall condition may miss non-fork installs with transformers extra.

If cut_cross_entropy is installed with transformers extra but is not Axolotl’s fork, we won’t uninstall and will re-install on top. Consider always uninstalling before reinstalling to ensure the fork is used.

-if cce_spec:
-    if not importlib.util.find_spec("cut_cross_entropy.transformers"):
-        UNINSTALL_PREFIX = "pip uninstall -y cut-cross-entropy && "
+if cce_spec:
+    # Always uninstall to avoid mixed installations across forks/extras
+    UNINSTALL_PREFIX = "pip uninstall -y cut-cross-entropy && "
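
The behavior this suggestion aims for can be sketched as a standalone function. This is an illustration under stated assumptions, not the actual contents of scripts/cutcrossentropy_install.py: the function names are hypothetical, and the value `"uv "` for the uv prefix is assumed. The command strings themselves are copied from the diffs in this review.

```python
import importlib.util

CCE_REQUIREMENT = (
    "cut-cross-entropy[transformers] @ "
    "git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"
)


def is_cce_installed() -> bool:
    """Return True if any cut_cross_entropy package is importable."""
    return importlib.util.find_spec("cut_cross_entropy") is not None


def build_install_command(installed: bool, use_uv: bool = False) -> str:
    """Build the install command, always uninstalling an existing copy first."""
    uv_prefix = "uv " if use_uv else ""  # assumed value of UV_PREFIX
    # Uninstall whenever a copy exists, regardless of which fork or extras it has.
    uninstall_prefix = "pip uninstall -y cut-cross-entropy && " if installed else ""
    return f'{uninstall_prefix}{uv_prefix}pip install "{CCE_REQUIREMENT}"'
```

Calling `build_install_command(is_cce_installed())` would then yield a command that replaces any existing installation rather than installing on top of it.
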
examples/colab-notebooks/colab-axolotl-example.ipynb (2)

40-44: Align Colab env with Transformers 4.56.2 to match upgraded patches.

To avoid resolver drift on Colab and ensure compatibility with the upgraded patches, pin Transformers in the install cell.

Apply this diff:

 %%capture
 # This step can take ~5-10 minutes to install dependencies
 !pip install --no-build-isolation axolotl[flash-attn]>=0.9.1
+!pip install "transformers==4.56.2"
 !pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"

40-44: Prefer fully pinned versions in notebooks for reproducibility.

Colab environments change frequently. Consider pinning axolotl to an exact version used by this PR to prevent breakages from future releases.

Example:

-!pip install --no-build-isolation axolotl[flash-attn]>=0.9.1
+!pip install --no-build-isolation "axolotl[flash-attn]==0.9.1"
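
As a complement to pinning, a notebook cell could check that the resolved environment actually matches the intended version. The sketch below uses only the standard library; the helper name `matches_pin` is hypothetical, and the 4.56.2 value in the usage note comes from the PR description.

```python
from importlib import metadata


def matches_pin(package: str, expected: str):
    """Return True/False if the package is installed, or None if it is missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return installed == expected
```

For example, `matches_pin("transformers", "4.56.2")` could be asserted right after the install cell so that resolver drift fails early instead of surfacing as a patch mismatch during training.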
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55d1be2 and 0b0f37c.

📒 Files selected for processing (4)
  • examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (4 hunks)
  • src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: docker-e2e-tests-1st (126, 12.6.3, 3.11, 2.7.1, 1)
  • GitHub Check: docker-e2e-tests-1st (126, 12.6.3, 3.11, 2.7.1, 1, Dockerfile-uv.jinja)
🔇 Additional comments (4)
src/axolotl/integrations/cut_cross_entropy/README.md (2)

22-22: Install line matches code/scripts; verify repo-wide consistency.

Looks good. Please ensure no references to the old SHA remain elsewhere.

Use the script shared in the __init__.py comment to confirm both the commit’s existence and that all occurrences are updated.


34-75: Verify PATCH_FNS registration for newly documented models.
Couldn't fetch cut_cross_entropy/transformers/patch.py at axolotl-ai-cloud/ml-cross-entropy@147ea28 (gh returned 404); confirm PATCH_FNS contains keys for: apertus, glm4v, glm4v_moe, granitemoeshared, granitemoehybrid, qwen3_vl, qwen3_vl_moe — or confirm the generic patch path reliably covers them.
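
A check along these lines could be run locally once the pinned fork is installed. This is a sketch only: the helper `missing_patch_keys` is hypothetical, and the import path in the trailing comment assumes `PATCH_FNS` is exposed by `cut_cross_entropy.transformers.patch`, as this review states.

```python
from collections.abc import Iterable, Mapping

# Model types named in this review comment.
NEW_MODEL_TYPES = (
    "apertus",
    "glm4v",
    "glm4v_moe",
    "granitemoeshared",
    "granitemoehybrid",
    "qwen3_vl",
    "qwen3_vl_moe",
)


def missing_patch_keys(
    patch_fns: Mapping[str, object],
    required: Iterable[str] = NEW_MODEL_TYPES,
) -> list[str]:
    """Return the required model types that have no entry in patch_fns."""
    return sorted(key for key in required if key not in patch_fns)


# Usage, assuming the fork is installed and exposes the mapping at this path:
#   from cut_cross_entropy.transformers.patch import PATCH_FNS
#   print(missing_patch_keys(PATCH_FNS))
```

An empty list would mean every newly documented model type has a registered patch function.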

src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

38-38: Pin bump OK — commit 147ea28 found; verify patch API

cut_cross_entropy/transformers/patch.py@147ea28 defines AXOLOTL_CCE_FORK but does not define register_patch/apply_patch/patch (it exposes a PATCH_FNS mapping); confirm the integration uses the mapping or update the code to match upstream.

examples/colab-notebooks/colab-axolotl-example.ipynb (1)

43-43: CCE pin bump to 147ea28 — no stale c5aa3ef references found (LGTM)

rg returned no matches for old SHA c5aa3ef and found expected occurrences of 147ea28 in: scripts/cutcrossentropy_install.py, src/axolotl/integrations/cut_cross_entropy/__init__.py, src/axolotl/integrations/cut_cross_entropy/README.md, and examples/colab-notebooks/colab-axolotl-example.ipynb.
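
The same kind of search can be approximated without ripgrep. The following is a minimal sketch, not the script used in this review: the function name `find_sha_references` and the default suffix list are assumptions chosen to cover the file types touched by this PR.

```python
from pathlib import Path


def find_sha_references(root, sha, suffixes=(".py", ".md", ".ipynb")):
    """Return (path, line_number) pairs for every line under root containing sha."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or non-text files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if sha in line:
                hits.append((path, lineno))
    return hits
```

Running `find_sha_references(".", "c5aa3ef")` from the repository root should return an empty list if no stale references remain, while the same call with `"147ea28"` should list the four files named above.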

@NanoCode012 NanoCode012 changed the title from "Feat: add qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid, and upgraded all patches" to "Feat: add qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid, and upgraded all cce patches" on Sep 24, 2025
@NanoCode012 NanoCode012 merged commit 7fa8ac4 into main Sep 26, 2025
19 checks passed
@NanoCode012 NanoCode012 deleted the feat/up-cce-4-56 branch September 26, 2025 05:11
@coderabbitai coderabbitai Bot mentioned this pull request Dec 24, 2025
3 tasks