
throw error if got adapter with added_tokens #18046

Merged
Fridge003 merged 1 commit into sgl-project:main from glenliu21:load_lora_added_tokens on Feb 5, 2026

Conversation

@glenliu21
Contributor

Motivation

Until loading adapters with added tokens is supported (tracked in #17794), we should disable that functionality.

Modifications

  • Throw an error if an adapter with added tokens is loaded
  • Refactor and simplify the adapter load logic
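
A minimal sketch of the guard described above. The `validate_adapter_dir` helper, the `LoRAAdapterError` exception, and the assumption that added tokens ship as an `added_tokens.json` file next to the adapter weights are all illustrative, not sglang's actual API:

```python
import json
import os


class LoRAAdapterError(ValueError):
    """Raised when a LoRA adapter cannot be loaded (hypothetical name)."""


def validate_adapter_dir(adapter_dir: str) -> None:
    """Fail fast if the adapter extends the vocabulary via added_tokens.json."""
    added_tokens_path = os.path.join(adapter_dir, "added_tokens.json")
    if not os.path.exists(added_tokens_path):
        return  # no vocabulary changes: adapter is fine
    with open(added_tokens_path) as f:
        added_tokens = json.load(f)
    if added_tokens:
        raise LoRAAdapterError(
            f"Adapter at {adapter_dir} defines {len(added_tokens)} added "
            "token(s); adapters with added_tokens are not supported yet."
        )
```

In the PR itself the check sits inside the adapter load path; a standalone helper like this just illustrates the fail-fast behavior.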

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Summary of Changes

Hello @glenliu21, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a temporary safeguard to prevent the loading of LoRA adapters that modify the vocabulary, which is not yet fully supported. Concurrently, it significantly refactors and simplifies the internal logic for loading LoRA adapters, making the process more unified and robust for future enhancements.

Highlights

  • Temporary Restriction on LoRA Adapters with Added Tokens: Introduced a validation check that explicitly throws an error if an attempt is made to load a LoRA adapter that adds new tokens to the vocabulary. This is a temporary measure until full support is implemented.
  • Refactored LoRA Adapter Loading Logic: The load_lora_adapter method has been refactored to be more flexible, now capable of loading adapters either from a specified path or directly from provided tensor data and configuration. This consolidates previous separate loading paths.
  • New LoRATensorPayload Dataclass: A new LoRATensorPayload dataclass was added to encapsulate the tensors, configuration dictionary, and optional added tokens configuration, streamlining the passing of adapter data.
  • Consolidated Weight Loading: The load_lora_weights method now accepts an optional tensors argument, allowing it to initialize weights from pre-supplied tensors, thereby removing the need for a separate load_lora_weights_from_tensors method.
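
The `LoRATensorPayload` name comes from the PR description, but the fields and the `load_lora_adapter` signature below are assumptions — a sketch of how the consolidated path might fit together, not the actual sglang implementation:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LoRATensorPayload:
    """Bundles in-memory adapter data so a single load path can handle it."""

    tensors: Dict[str, Any]  # tensor name -> weight (torch.Tensor in practice)
    config: Dict[str, Any]
    added_tokens_config: Optional[Dict[str, Any]] = None


def load_lora_adapter(
    path: Optional[str] = None,
    payload: Optional[LoRATensorPayload] = None,
) -> str:
    """Load an adapter either from disk or from a pre-built tensor payload."""
    if (path is None) == (payload is None):
        raise ValueError("Provide exactly one of `path` or `payload`.")
    if payload is not None and payload.added_tokens_config:
        raise ValueError("Adapters with added_tokens are not supported yet.")
    # ... the real implementation would read weights and register the adapter ...
    return "path" if path is not None else "tensors"
```

Making `path` and `payload` mutually exclusive keeps the former separate loading entry points behind one function, which is the consolidation the summary describes.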


@glenliu21
Contributor Author

cc @yushengsu-thu

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a check to prevent loading LoRA adapters with added tokens, which is a sensible guard for a feature not yet fully supported. The accompanying refactoring of the adapter loading logic is well-executed. By consolidating the loading from a path and from tensors into a single load_lora_adapter method, the code is now cleaner, more maintainable, and has less duplication. The introduction of the LoRATensorPayload dataclass is a good way to structure the data passed for loading from tensors. Overall, the changes are solid and improve the codebase.

@glenliu21 glenliu21 mentioned this pull request Feb 1, 2026
5 tasks
@glenliu21 glenliu21 changed the title [Fix/RFC] throw error if got adapter with added_tokens and refactor adapter load logic [Fix/Refactor] throw error if got adapter with added_tokens and refactor adapter load logic Feb 2, 2026
@Fridge003
Collaborator

@glenliu21 Can you please split the refactoring logic into another PR?

@glenliu21
Contributor Author

@glenliu21 Can you please split the refactoring logic into another PR?

Opened #18288 for this.

@glenliu21 glenliu21 force-pushed the load_lora_added_tokens branch from e5ce4e0 to b9a323a on February 5, 2026 04:38
@glenliu21 glenliu21 changed the title [Fix/Refactor] throw error if got adapter with added_tokens and refactor adapter load logic throw error if got adapter with added_tokens Feb 5, 2026
@Fridge003
Collaborator

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Feb 5, 2026
@Fridge003 Fridge003 merged commit 3f32a58 into sgl-project:main Feb 5, 2026
146 of 160 checks passed
@glenliu21 glenliu21 deleted the load_lora_added_tokens branch February 7, 2026 14:42
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 9, 2026
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
alisonshao added a commit that referenced this pull request Feb 16, 2026
PR #18046 added validation rejecting LoRA adapters with
added_tokens.json, but the test adapter y9760210/Qwen3-4B-lora_model
has added_tokens.json, causing a server crash (exit code -9).

Replace with TanXS/Qwen3-4B-LoRA-ZH-WebNovelty-v0.0 which is a valid
Qwen3-4B LoRA adapter without added_tokens.json.

Failure: https://github.com/sgl-project/sglang/actions/runs/22026980777/job/63645203194
magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026