[BugFix] fix loading new adapter with added_tokens #17794
glenliu21 wants to merge 7 commits into sgl-project:main
Conversation
Code Review
This pull request addresses a bug where lora_added_tokens_size was not updated when a new LoRA adapter with added tokens was loaded after server initialization. The changes correctly introduce a mechanism to update this size and re-initialize the memory pool accordingly. The addition of a new test case to verify this fix is also a good improvement.
I've identified a critical issue in the implementation that could lead to a server crash during startup if a LoRA adapter with added tokens is provided initially. I've also pointed out a potential logic issue in how different lora_added_tokens_size values from multiple adapters are handled, which could lead to silent failures. Please see my detailed comments.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
```python
# Some LoRA adapters are loaded before the memory pool is initialized
if hasattr(self, "memory_pool"):
    self.init_memory_pool()
```
Will this operation delete the earlier adapters in GPU memory? If so, I feel it's risky.
I think that should theoretically be fine, because adapters will just be reloaded on the next forward pass.
Extra tokens only affect the embedding memory pool. Can we reinitialize only the embedding part?
The test I added, where adapters are loaded on server start, fails even on our existing implementation. Unless it is the case that we don't want users to load adapters with different
Do we have an ETA for this fix? #18046 was supposed to be temporary, iirc.
Fixed by #17905 |
Motivation

Fixes #17096.

Modifications

Update `LoRAMemoryPool`'s `lora_added_tokens_size` after loading in a new adapter that adds tokens to the vocabulary.

Accuracy Tests

Added to `test_lora_update.py`.

Checklist

Review Process

/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci