[BugFix] fix loading new adapter with added_tokens #17794

Closed
glenliu21 wants to merge 7 commits into sgl-project:main from glenliu21:lora_added_tokens_fix
Conversation

@glenliu21
Contributor

Motivation

Fixes #17096.

Modifications

Update LoRAMemoryPool's lora_added_tokens_size after loading in a new adapter that adds tokens to the vocabulary.
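A minimal sketch of the fix described above (not the actual sglang implementation; class, method, and field names here are assumptions based on the PR description): the manager tracks the largest added-token count seen across adapters, and when a newly loaded adapter grows the vocabulary it reinitializes the memory pool so the embedding buffers are sized correctly.

```python
# Hypothetical sketch of the PR's approach; names are illustrative, not sglang's API.
class LoRAManager:
    def __init__(self):
        self.lora_added_tokens_size = 0  # assumed field name from the PR description
        self.adapters = {}

    def init_memory_pool(self):
        # Stand-in for buffer allocation sized by lora_added_tokens_size.
        self.memory_pool = {"embedding_rows": self.lora_added_tokens_size}

    def load_adapter(self, name, added_tokens_size):
        self.adapters[name] = added_tokens_size
        if added_tokens_size > self.lora_added_tokens_size:
            self.lora_added_tokens_size = added_tokens_size
            # Some LoRA adapters are loaded before the memory pool is initialized
            if hasattr(self, "memory_pool"):
                self.init_memory_pool()


mgr = LoRAManager()
mgr.load_adapter("base", 0)      # loaded before the pool exists; no reinit needed
mgr.init_memory_pool()
mgr.load_adapter("extended", 4)  # adds tokens, so the pool is reinitialized
print(mgr.memory_pool["embedding_rows"])  # 4
```

The `hasattr` guard mirrors the snippet discussed later in the review: adapters loaded at server start arrive before the pool exists, so the reinit must be skipped in that case.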

Accuracy Tests

Added a test case to test_lora_update.py.

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions bot added the lora label Jan 27, 2026
@yushengsu-thu yushengsu-thu self-assigned this Jan 27, 2026
@glenliu21
Contributor Author

/gemini review

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@glenliu21
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug where lora_added_tokens_size was not updated when a new LoRA adapter with added tokens was loaded after server initialization. The changes correctly introduce a mechanism to update this size and re-initialize the memory pool accordingly. The addition of a new test case to verify this fix is also a good improvement.

I've identified a critical issue in the implementation that could lead to a server crash during startup if a LoRA adapter with added tokens is provided initially. I've also pointed out a potential logic issue in how different lora_added_tokens_size values from multiple adapters are handled, which could lead to silent failures. Please see my detailed comments.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@yushengsu-thu
Collaborator

/tag-and-rerun-ci


# Some LoRA adapters are loaded before the memory pool is initialized
if hasattr(self, "memory_pool"):
    self.init_memory_pool()
Collaborator


Will this operation delete the earlier adaptors in GPU memory? If so I feel it's risky

Contributor Author


I think theoretically that should be fine because adapters will just be reloaded on the next forward pass.

Collaborator


Extra tokens only affect the embedding memory pool. Can we reinitialize only the embedding part?
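The reviewer's suggestion could be sketched as follows (a hypothetical illustration, not sglang's actual pool layout; all names are assumptions): keep the per-layer LoRA weight buffers intact and rebuild only the embedding buffers when the added-token count grows, so adapters already resident on GPU need not be evicted and reloaded.

```python
# Illustrative sketch of an embedding-only reinit; not sglang's actual API.
class SplitMemoryPool:
    def __init__(self, added_tokens_size):
        self.weight_buffers = object()  # stand-in for per-layer LoRA weight buffers
        self.init_embedding_buffers(added_tokens_size)

    def init_embedding_buffers(self, added_tokens_size):
        # Only the embedding buffers depend on the added-token count.
        self.embedding_rows = added_tokens_size

    def grow_added_tokens(self, new_size):
        # Reinitialize only the embedding part; weight buffers are untouched.
        if new_size > self.embedding_rows:
            self.init_embedding_buffers(new_size)


pool = SplitMemoryPool(added_tokens_size=0)
weights_before = pool.weight_buffers
pool.grow_added_tokens(4)
assert pool.weight_buffers is weights_before  # weight buffers preserved
print(pool.embedding_rows)  # 4
```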

@glenliu21
Contributor Author

The test I added where adapters are loaded on server start fails even on our existing implementation.

Unless we intend to forbid users from loading adapters with different added_tokens sizes (which seems unlikely), the existing implementation is broken and needs to be fixed.

@glenliu21 glenliu21 marked this pull request as draft February 1, 2026 02:02
@vedantjh2
Contributor

Do we have an ETA for this fix? #18046 was supposed to be temporary iirc.

@Fridge003
Collaborator

Fixed by #17905

@Fridge003 Fridge003 closed this Apr 1, 2026
@glenliu21 glenliu21 deleted the lora_added_tokens_fix branch April 4, 2026 17:24

Successfully merging this pull request may close these issues.

[Bug] Dynamic LoRA load error

4 participants