[BugFix] Fix sglang and vllm engine args#1634

Merged
vermouth1992 merged 3 commits into verl-project:main from ETOgaosion:fix_sglang_engine_args
May 22, 2025

Conversation

@ETOgaosion
Collaborator

@ETOgaosion ETOgaosion commented May 22, 2025

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

#1616 caused vLLM engine-argument initialization to fail; it is unclear why that PR's CI did not detect this. Some errors have already shown up.

[image: error screenshot]

It would be better to separate the engine args for the different inference systems.

High-Level Design

Demonstrate the high-level design if this PR is complex.

Specific Changes

List the specific changes.

API

    engine_kwargs: # inference engine parameters
      vllm:
        swap_space: null # null means "use the engine default value" (usually 4 GB), setting it to, e.g., 32 means 32 GB
      sglang:
        attention_backend: null # null means use the engine default value, available options: flashinfer, triton, flashmla
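The `null` semantics above ("use the engine default value") can be implemented by dropping unset entries before passing the kwargs to the engine constructor. A minimal sketch of that idea; the function name and call shape are illustrative assumptions, not verl's actual API:

```python
# Hypothetical helper (not verl's actual API): pick one backend's kwargs
# and drop None entries so the engine falls back to its own defaults.
def select_engine_kwargs(engine_kwargs: dict, backend: str) -> dict:
    backend_kwargs = engine_kwargs.get(backend) or {}
    return {k: v for k, v in backend_kwargs.items() if v is not None}

engine_kwargs = {
    "vllm": {"swap_space": None},                 # null -> use vLLM default
    "sglang": {"attention_backend": "triton"},    # explicit override
}

print(select_engine_kwargs(engine_kwargs, "vllm"))    # {}
print(select_engine_kwargs(engine_kwargs, "sglang"))  # {'attention_backend': 'triton'}
```

With this filtering, only explicitly set values (e.g. `swap_space: 32`) would ever be forwarded to the backend.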

Usage Example

Provide usage example(s) for easier usage.

# Add code snippet or script demonstrating how to use this 
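A minimal config-fragment sketch of overriding these values from the command line, assuming verl's Hydra/OmegaConf-style dotted overrides; the entry script and exact key path (`actor_rollout_ref.rollout.engine_kwargs`) are assumptions based on the YAML above:

```shell
# Hypothetical invocation; script name and key paths are assumptions.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.rollout.name=sglang \
    actor_rollout_ref.rollout.engine_kwargs.sglang.attention_backend=triton
```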

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

Additional Info.

  • Issue Number: Fixes issue # or discussion # if any.
  • Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none]
  • Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none]

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@vermouth1992
Collaborator

Shall we separate vllm and sglang config args from a higher level?

@ETOgaosion
Collaborator Author

Shall we separate vllm and sglang config args from a higher level?

It seems the only backend-specific configuration between these two systems is engine_kwargs, so the current abstraction is good.

@vermouth1992 vermouth1992 merged commit c803b1f into verl-project:main May 22, 2025
35 checks passed
@hebiao064
Collaborator

Sorry for that, thanks for fixing it

wwwjn pushed a commit to wwwjn/verl that referenced this pull request Jun 10, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026