[LoRA][I] Add MOE LoRA JIT alignment kernel and tests #19710

Merged
Fridge003 merged 11 commits into sgl-project:main from yushengsu-thu:moe-lora-jit-kernel on Mar 12, 2026

Conversation

@yushengsu-thu
Collaborator

@yushengsu-thu yushengsu-thu commented Mar 2, 2026

This is Part I of a three-part split of PR #14105.

Add JIT-compiled CUDA kernels for MOE LoRA block size alignment:

  • moe_lora_align.py: JIT wrapper for moe_lora_align_block_size
  • moe_lora_align_kernel.cu: CUDA kernels for token alignment, sorting, and expert counting
  • test_moe_lora_align_block_size.py: Unit tests for the alignment kernel
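
For readers unfamiliar with "block size alignment", the core arithmetic can be sketched in a few lines of Python. This is an illustration only, not code from this PR: the helper names `round_up_to_block` and `padded_expert_counts` are hypothetical, and it assumes that each expert's token count is rounded up to the next multiple of `block_size`.

```python
from typing import List


def round_up_to_block(count: int, block_size: int) -> int:
    """Return the smallest multiple of block_size that is >= count."""
    return (count + block_size - 1) // block_size * block_size


def padded_expert_counts(
    topk_ids: List[int], num_experts: int, block_size: int
) -> List[int]:
    """Count routed slots per expert, then pad each count to a block multiple.

    topk_ids is a flattened list of expert ids, one per (token, top-k slot).
    """
    counts = [0] * num_experts
    for expert in topk_ids:
        counts[expert] += 1
    return [round_up_to_block(c, block_size) for c in counts]


# Expert 0 receives 2 slots, expert 1 receives 1, expert 2 receives 5.
print(padded_expert_counts([0, 2, 2, 2, 0, 1, 2, 2], num_experts=3, block_size=4))
# -> [4, 4, 8]
```

Padding to a block multiple means a downstream grouped GEMM can treat every block as belonging to exactly one expert.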

Made-with: Cursor

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

Copilot AI review requested due to automatic review settings March 2, 2026 18:50
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yushengsu-thu
Collaborator Author

Co-authored-by: Jonah Bernard <jb2528@cornell.edu>
Co-authored-by: cursor[bot] <noreply@cursor.sh>

@yushengsu-thu yushengsu-thu changed the title from "Add MOE LoRA JIT alignment kernel and tests" to "[Lora] Add MOE LoRA JIT alignment kernel and tests" Mar 2, 2026
@yushengsu-thu yushengsu-thu changed the title from "[Lora] Add MOE LoRA JIT alignment kernel and tests" to "[Lora][I] Add MOE LoRA JIT alignment kernel and tests" Mar 2, 2026
@yushengsu-thu yushengsu-thu changed the title from "[Lora][I] Add MOE LoRA JIT alignment kernel and tests" to "[LoRA][I] Add MOE LoRA JIT alignment kernel and tests" Mar 2, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a JIT-compiled CUDA implementation for MoE+LoRA token alignment (block-size padding + per-expert sorting), along with a Python wrapper and a CUDA CI unit test, as part of the larger MoE LoRA enablement work split out from #14105.

Changes:

  • Introduce moe_lora_align_block_size Python wrapper that JIT-loads a new CUDA kernel.
  • Add CUDA kernels to build a per-LoRA token mask, align counts to block_size, and sort tokens by expert.
  • Add a CUDA-registered pytest validating expert assignment and LoRA ownership of sorted blocks.
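
The three steps listed above (mask by LoRA, pad per-expert counts, sort by expert) can be sketched as a CPU reference in plain Python. This is a hedged illustration, not the kernel's actual interface: the function name `align_for_one_lora`, the `slot_lora` input, the output names, and the use of `len(topk_ids)` as a padding sentinel are all assumptions made for the example.

```python
from typing import List, Tuple


def align_for_one_lora(
    topk_ids: List[int],   # expert id for each flattened (token, top-k) slot
    slot_lora: List[int],  # hypothetical: LoRA adapter id for each slot
    lora_id: int,
    num_experts: int,
    block_size: int,
) -> Tuple[List[int], List[int], int]:
    """Return (sorted_ids, expert_ids, num_tokens_post_pad) for one adapter."""
    pad = len(topk_ids)  # assumed sentinel: an out-of-range slot index

    # Step 1: mask. Keep only the slots owned by this adapter, bucketed by expert.
    buckets: List[List[int]] = [[] for _ in range(num_experts)]
    for idx, (expert, lora) in enumerate(zip(topk_ids, slot_lora)):
        if lora == lora_id:
            buckets[expert].append(idx)

    sorted_ids: List[int] = []
    expert_ids: List[int] = []
    for expert, bucket in enumerate(buckets):
        # Step 2: align. Round this expert's slot count up to whole blocks.
        n_blocks = (len(bucket) + block_size - 1) // block_size
        n_pad = n_blocks * block_size - len(bucket)
        # Step 3: sort. Emit the expert's slots contiguously, then the padding.
        sorted_ids.extend(bucket + [pad] * n_pad)
        expert_ids.extend([expert] * n_blocks)

    return sorted_ids, expert_ids, len(sorted_ids)


sorted_ids, expert_ids, total = align_for_one_lora(
    topk_ids=[0, 1, 1, 0, 1, 1],
    slot_lora=[0, 0, 1, 1, 0, 0],
    lora_id=0,
    num_experts=2,
    block_size=2,
)
print(sorted_ids, expert_ids, total)
# -> [0, 6, 1, 4, 5, 6] [0, 1, 1] 6
```

A GPU implementation would typically replace the Python loops with per-expert counters and a prefix sum, but the resulting layout should satisfy the same properties: each block maps to one expert, and padding fills the tail of each expert's last block.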

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| python/sglang/jit_kernel/moe_lora_align.py | JIT loader + Python entrypoint for the new MOE LoRA alignment kernel. |
| python/sglang/jit_kernel/csrc/lora/moe_lora_align_kernel.cu | CUDA implementation for token masking, expert counting/padding, and sorting for MoE LoRA alignment. |
| python/sglang/jit_kernel/tests/test_moe_lora_align_block_size.py | CUDA CI test validating the alignment/sorting results. |
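
As a rough picture of what "validating the alignment/sorting results" can mean, the sketch below checks two invariants on an aligned layout: every non-padding slot in a block was routed to that block's expert, and every such slot belongs to the adapter being processed. The function `check_alignment`, its argument names, and the padding sentinel `len(topk_ids)` are hypothetical and are not taken from the test file in this PR.

```python
from typing import List


def check_alignment(
    sorted_ids: List[int],
    expert_ids: List[int],
    topk_ids: List[int],
    slot_lora: List[int],  # hypothetical: LoRA adapter id for each slot
    lora_id: int,
    block_size: int,
) -> bool:
    """Return True if every block is consistent with its expert and adapter."""
    pad = len(topk_ids)  # assumed sentinel marking padding entries
    if len(sorted_ids) != len(expert_ids) * block_size:
        return False
    for block, expert in enumerate(expert_ids):
        start = block * block_size
        for idx in sorted_ids[start:start + block_size]:
            if idx == pad:
                continue  # padding carries no token, nothing to verify
            if topk_ids[idx] != expert or slot_lora[idx] != lora_id:
                return False
    return True


topk_ids = [0, 1, 1, 0, 1, 1]
slot_lora = [0, 0, 1, 1, 0, 0]
print(check_alignment([0, 6, 1, 4, 5, 6], [0, 1, 1], topk_ids, slot_lora, 0, 2))
# -> True
print(check_alignment([0, 6, 1, 4, 5, 6], [1, 1, 1], topk_ids, slot_lora, 0, 2))
# -> False (slot 0 was routed to expert 0, not expert 1)
```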

@yushengsu-thu
Collaborator Author

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Mar 2, 2026
@yushengsu-thu yushengsu-thu requested a review from yuan-luo as a code owner March 6, 2026 00:08
@yushengsu-thu
Collaborator Author

/tag-and-rerun-ci

@Fridge003 Fridge003 merged commit af2807e into sgl-project:main Mar 12, 2026
241 of 267 checks passed
liubiyongge pushed a commit to liubiyongge/sglang that referenced this pull request Mar 13, 2026
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Jonah Bernard <96398205+Jonahcb@users.noreply.github.com>
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Mar 15, 2026
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
0-693 pushed a commit to 0-693/sglang that referenced this pull request Mar 25, 2026

5 participants