Test Prompt Embeds/LoRA compatibility and Enable LoRA Support for OPT Models #25717
jeejeelee merged 11 commits into vllm-project:main
Conversation
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
@DarkLight1337 I'm not sure who else would need to look at this. I also wonder whether using this model could speed up some of the other entrypoints LoRA tests that currently use zephyr-7b, just as you sped up these tests in #25663.
Code Review
This pull request adds LoRA support for OPT models and includes corresponding tests. The changes to enable LoRA in the OPT model implementation are mostly correct, following patterns from other models in the repository. However, I found a critical issue in the initialization of the LogitsProcessor which would lead to incorrect behavior when using LoRA adapters with extra vocabulary tokens. My review provides a code suggestion to fix this.
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
…list Signed-off-by: Andrew Sansom <andrew@protopia.ai>
DarkLight1337
left a comment
LGTM if the tests pass, cc @jeejeelee if you want to double check the model
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
Head branch was pushed to by a user without write access
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
@DarkLight1337 This looks like it's ready for re-review. Thanks @jeejeelee for your help.
… Models (vllm-project#25717) Signed-off-by: Andrew Sansom <andrew@protopia.ai>
… Models (#25717) Signed-off-by: Andrew Sansom <andrew@protopia.ai> Signed-off-by: yewentao256 <zhyanwentao@126.com>
… Models (vllm-project#25717) Signed-off-by: Andrew Sansom <andrew@protopia.ai> Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
Purpose
It was previously unknown whether #24278 was compatible with LoRA adapters. This PR adds tests explicitly covering that combination. Since #25663 swapped out Zephyr for OPT-125m when testing prompt embeds, this PR also adds LoRA support for OPT-125m.
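As a rough sketch of what enabling LoRA for a model entails, vLLM model classes typically declare a mapping from fused linear layers to the per-projection LoRA module names they pack together; OPT's attention uses a fused QKV projection. The names below follow the common pattern in the repo, but this is illustrative, not the merged diff:

```python
# Fused linear layers and the individual projections a LoRA adapter
# targets (pattern borrowed from other LoRA-enabled vLLM models).
packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
}

def lora_target_modules(mapping: dict[str, list[str]]) -> list[str]:
    """Flatten the packed-module mapping into individual LoRA target names."""
    return [name for packed in mapping.values() for name in packed]

print(lora_target_modules(packed_modules_mapping))  # ['q_proj', 'k_proj', 'v_proj']
```

This mapping tells the LoRA layer machinery how to slice a single adapter applied to the fused projection back into its q/k/v components.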
Test Plan
Updated test cases. I've also tested locally with a meta-llama/Llama-3.1-8B-Instruct LoRA, and everything works as expected there.
Test Result
New tests are working locally. Pending CI.
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.