[BugFix] Fix sglang and vllm engine args#1634

Merged
vermouth1992 merged 3 commits into verl-project:main from ETOgaosion:fix_sglang_engine_args
May 22, 2025

Conversation

@ETOgaosion
Collaborator

@ETOgaosion ETOgaosion commented May 22, 2025

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

#1616 caused vLLM engine-argument initialization to fail; it is unclear why that PR's CI did not detect this. Some errors have already shown up.

[image: error screenshot]

It would be better to separate the engine args for the different inference systems.

High-Level Design

Demonstrate the high-level design if this PR is complex.

Specific Changes

List the specific changes.

API

    engine_kwargs: # inference engine parameters
      vllm:
        swap_space: null # null means "use the engine default value" (usually 4 GB), setting it to, e.g., 32 means 32 GB
      sglang:
        attention_backend: null # null means use the engine default value, available options: flashinfer, triton, flashmla
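The `null` semantics above ("use the engine default value") can be implemented by dropping unset entries before passing the kwargs to the engine constructor. A minimal sketch of that idea; the function name and call shape are illustrative assumptions, not verl's actual API:

```python
# Hypothetical helper (not verl's actual API): pick one backend's kwargs
# and drop None entries so the engine falls back to its own defaults.
def select_engine_kwargs(engine_kwargs: dict, backend: str) -> dict:
    backend_kwargs = engine_kwargs.get(backend) or {}
    return {k: v for k, v in backend_kwargs.items() if v is not None}

engine_kwargs = {
    "vllm": {"swap_space": None},                 # null -> use vLLM default
    "sglang": {"attention_backend": "triton"},    # explicit override
}

print(select_engine_kwargs(engine_kwargs, "vllm"))    # {}
print(select_engine_kwargs(engine_kwargs, "sglang"))  # {'attention_backend': 'triton'}
```

With this filtering, only explicitly set values (e.g. `swap_space: 32`) would ever be forwarded to the backend.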

Usage Example

Provide usage example(s) for easier usage.

# Add code snippet or script demonstrating how to use this 
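A minimal config-fragment sketch of overriding these values from the command line, assuming verl's Hydra/OmegaConf-style dotted overrides; the entry script and exact key path (`actor_rollout_ref.rollout.engine_kwargs`) are assumptions based on the YAML above:

```shell
# Hypothetical invocation; script name and key paths are assumptions.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.rollout.name=sglang \
    actor_rollout_ref.rollout.engine_kwargs.sglang.attention_backend=triton
```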

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

Additional Info.

  • Issue Number: Fixes issue # or discussion # if any.
  • Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none]
  • Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none]

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@vermouth1992
Collaborator

Shall we separate vllm and sglang config args from a higher level?

@ETOgaosion
Collaborator Author

Shall we separate vllm and sglang config args from a higher level?

It seems the only backend-specific configuration between these two systems is engine_kwargs, so the current abstraction is good.

@vermouth1992 vermouth1992 merged commit c803b1f into verl-project:main May 22, 2025
35 checks passed
@hebiao064
Collaborator

Sorry for that, thanks for fixing it

wwwjn pushed a commit to wwwjn/verl that referenced this pull request Jun 10, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026