
[EPLB] EPLB Config Renaming #5533

Merged
wangxiyuan merged 1 commit into vllm-project:main from shenchuxiaofugui:rename
Jan 15, 2026

Conversation

@shenchuxiaofugui
Collaborator
@shenchuxiaofugui shenchuxiaofugui commented Dec 30, 2025

What this PR does / why we need it?

  1. Rename `num_iterations_eplb_update` to `expert_heat_collection_interval`.
  2. Rename `num_wait_worker_iterations` to `algorithm_execution_interval`.
  3. Rename `init_redundancy_expert` to `num_redundant_experts`, matching the name vLLM uses for the same setting.
  4. Delete `gate_eplb`, since this feature is no longer needed.
  5. Move the EPLB config into a dict inside the additional config.
  6. Depends on pr5817.

Does this PR introduce any user-facing change?

before this pr:
`--additional-config '{"dynamic_eplb": true, "num_iterations_eplb_update": 4000, "num_wait_worker_iterations": 150, "init_redundancy_expert": 16, "expert_map_path": "xxx.json"}'`

after this pr:
`--additional-config '{"eplb_config": {"dynamic_eplb": true, "expert_heat_collection_interval": 4000, "algorithm_execution_interval": 150, "num_redundant_experts": 16, "expert_map_path": "xxx.json"}}'`
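For users migrating existing launch scripts, the key moves above can be expressed as a small conversion helper. This is a hypothetical sketch (not part of vllm-ascend; the function name `migrate_additional_config` and the `RENAMES` table are mine) that maps an old flat additional-config dict to the new nested `eplb_config` layout, dropping the removed `gate_eplb` key:

```python
import json

# Old flat key -> new key inside "eplb_config" (identity entries keep their name).
RENAMES = {
    "num_iterations_eplb_update": "expert_heat_collection_interval",
    "num_wait_worker_iterations": "algorithm_execution_interval",
    "init_redundancy_expert": "num_redundant_experts",
    "dynamic_eplb": "dynamic_eplb",
    "expert_map_path": "expert_map_path",
}

def migrate_additional_config(old: dict) -> dict:
    """Move EPLB-related keys into a nested "eplb_config" dict, renaming as needed."""
    # Keep non-EPLB keys as-is; drop the removed "gate_eplb" flag entirely.
    new = {k: v for k, v in old.items() if k not in RENAMES and k != "gate_eplb"}
    eplb = {RENAMES[k]: v for k, v in old.items() if k in RENAMES}
    if eplb:
        new["eplb_config"] = eplb
    return new

old = {
    "dynamic_eplb": True,
    "num_iterations_eplb_update": 4000,
    "num_wait_worker_iterations": 150,
    "init_redundancy_expert": 16,
    "expert_map_path": "xxx.json",
}
print(json.dumps(migrate_additional_config(old), sort_keys=True))
```

The resulting dict can be serialized with `json.dumps` and passed to `--additional-config` directly.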

How was this patch tested?

test qwen3-235b eplb with `num_redundant_experts=16`

without pr5817

| dataset | version | metric | mode | vllm-api-general-chat |
| ----- | ----- | ----- | ----- | ----- |
| aime2024 | 604a78 | accuracy | gen | 83.33 |

with pr5817

| dataset | version | metric | mode | vllm-api-general-chat |
| ----- | ----- | ----- | ----- | ----- |
| aime2024 | 604a78 | accuracy | gen | 86.67 |

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors several configuration parameters for the Expert Parallelism Load Balancer (EPLB) to improve clarity. The changes are applied consistently across documentation, tests, and source code. My review focuses on ensuring the documentation is clear and accurate after these changes. I found some issues in the additional_config.md file, including duplicate entries and formatting errors, which could confuse users.

Comment on lines +37 to +46
| `lmhead_tensor_parallel_size` | int | `None` | The custom tensor parallel size of lmhead. Restriction: Can only be used when tensor_parallel=1 |
| `oproj_tensor_parallel_size` | int | `None` | The custom tensor parallel size of oproj. |
| `multistream_overlap_shared_expert` | bool | `False` | Whether to enable multistream shared expert. This option only takes effect on MoE models with shared experts. |
| `dynamic_eplb` | bool | `False` | Whether to enable dynamic EPLB. |
| `expert_heat_collection_interval` | int | `400` | Forward iterations when EPLB begins. |
| `algorithm_execution_interval` | int | `30` | The forward iterations when the EPLB worker will finish CPU tasks. In our test default value 30 can cover most cases. |
| `expert_map_record_path` | str | `None` | Save the expert load calculation results to a new expert table in the specified directory. |
| `num_redundant_experts` | int | `0` | Specify redundant experts during initialization. |
| `dump_config` | str | `None` | Configuration file path for msprobe dump(eager mode). |
| `enable_async_exponential` | int | `0` | Whether to enable async exponential overlap. To enable async exponential, set this config to 1. |
Contributor


Severity: high

The configuration table in this section has several issues that could be confusing for users:

  • Duplicate Configurations:
    • enable_async_exponential is listed on line 35 (as a bool) and again on line 46 (as an int).
    • dump_config (line 45) appears to be a duplicate of dump_config_path (line 34), as they have identical descriptions. The correct parameter used in the code is dump_config_path.
    • lmhead_tensor_parallel_size (line 37) and oproj_tensor_parallel_size (line 38) are already documented under the finegrained_tp_config section below.
  • Formatting and Typos:
    • The table formatting is misaligned from line 41 onwards, making it difficult to read.
    • There is a typo in the description for algorithm_execution_interval on line 42 ("The forward iterations...").

To improve clarity and correctness, I suggest replacing lines 37-46 to only include the relevant, correctly formatted, and de-duplicated configurations related to this PR's scope.

Suggested change

Current (lines 37-46):

| `lmhead_tensor_parallel_size` | int | `None` | The custom tensor parallel size of lmhead. Restriction: Can only be used when tensor_parallel=1 |
| `oproj_tensor_parallel_size` | int | `None` | The custom tensor parallel size of oproj. |
| `multistream_overlap_shared_expert` | bool | `False` | Whether to enable multistream shared expert. This option only takes effect on MoE models with shared experts. |
| `dynamic_eplb` | bool | `False` | Whether to enable dynamic EPLB. |
| `expert_heat_collection_interval` | int | `400` | Forward iterations when EPLB begins. |
| `algorithm_execution_interval` | int | `30` | The forward iterations when the EPLB worker will finish CPU tasks. In our test default value 30 can cover most cases. |
| `expert_map_record_path` | str | `None` | Save the expert load calculation results to a new expert table in the specified directory. |
| `num_redundant_experts` | int | `0` | Specify redundant experts during initialization. |
| `dump_config` | str | `None` | Configuration file path for msprobe dump(eager mode). |
| `enable_async_exponential` | int | `0` | Whether to enable async exponential overlap. To enable async exponential, set this config to 1. |

Proposed replacement:

| `multistream_overlap_shared_expert` | bool | `False` | Whether to enable multistream shared expert. This option only takes effect on MoE models with shared experts. |
| `dynamic_eplb` | bool | `False` | Whether to enable dynamic EPLB. |
| `expert_heat_collection_interval` | int | `400` | Forward iterations when EPLB begins. |
| `algorithm_execution_interval` | int | `30` | The forward iterations when the EPLB worker will finish CPU tasks. In our test default value 30 can cover most cases. |
| `expert_map_record_path` | str | `None` | Save the expert load calculation results to a new expert table in the specified directory. |
| `num_redundant_experts` | int | `0` | Specify redundant experts during initialization. |

@github-actions github-actions bot added the documentation (Improvements or additions to documentation), module:tests, and module:core labels on Dec 30, 2025
@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@shenchuxiaofugui shenchuxiaofugui force-pushed the rename branch 3 times, most recently from e297113 to 000b338 Compare January 13, 2026 09:49
@wangxiyuan
Collaborator

CI passed here: https://github.com/vllm-project/vllm-ascend/actions/runs/20952178776?pr=5533
Please rebase and fix the merge conflict; I'll merge it without running CI again.

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
@shenchuxiaofugui
Collaborator Author

> ci passed here: https://github.com/vllm-project/vllm-ascend/actions/runs/20952178776?pr=5533 please rebase and fix the merge conflict. I'll merge it without CI test again.

Conflict was resolved.

@wangxiyuan wangxiyuan merged commit da958ee into vllm-project:main Jan 15, 2026
14 checks passed
aipaes pushed a commit to aipaes/vllm-ascend that referenced this pull request Jan 15, 2026
### What this PR does / why we need it?
1. Rename num_iterations_eplb_update to expert_heat_collection_interval.
2. Rename num_wait_worker_iterations to algorithm_execution_interval.
3. Rename init_redundancy_expert to num_redundant_experts because the
variable with the same meaning in vLLM is named this way.
4. Delete gate_eplb because we don't need this feature.
5. Move eplb config into a dict in additional config.
6. Depend on pr5817

### Does this PR introduce _any_ user-facing change?

before this pr:
`--additional-config '{"dynamic_eplb":true,
"num_iterations_eplb_update": 4000, "num_wait_worker_iterations": 150,
"init_redundancy_expert": 16, "expert_map_path": "xxx.json"}'`

after this pr: 
`--additional-config
'{"eplb_config":{"dynamic_eplb":true,"expert_heat_collection_interval":4000,
"algorithm_execution_interval":150,"num_redundant_experts": 16,
"expert_map_path": "xxx.json"}}'`

### How was this patch tested?

#### test qwen3-235b eplb num_redundant_experts=16

without pr5817
| dataset | version | metric | mode | vllm-api-general-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 83.33 |

with pr5817
| dataset | version | metric | mode | vllm-api-general-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 86.67 |

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@45c1ca1

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
wangxiyuan pushed a commit that referenced this pull request Jan 22, 2026
### What this PR does / why we need it?
#5533 
Add a wrapper for the eplb startup configuration; this is a
forward-compatible update.

### Does this PR introduce _any_ user-facing change?
before this pr:
--additional-config '{"dynamic_eplb":true, "num_iterations_eplb_update":
4000, "num_wait_worker_iterations": 150, "init_redundancy_expert": 16,
"expert_map_path": "xxx.json"}'

after this pr:
--additional-config
'{"eplb_config":{"dynamic_eplb":true,"expert_heat_collection_interval":4000,
"algorithm_execution_interval":150,"num_redundant_experts": 16,
"expert_map_path": "xxx.json"}}'

### How was this patch tested?
qwen3-30b dialogue:

> Okay, the user is asking, "What is deep learning?" I need to explain this in a clear and concise way. Let me start by recalling what I know about deep learning. It's a subset of machine learning, right? So first, I should mention that it's part of machine learning, which is a branch of AI. Then, the key point is that deep learning uses neural networks with multiple layers. The term "deep" refers to the number of layers in the network.
>
> I should explain what neural networks are. Maybe start with the basics: they're inspired by the human brain, with layers of nodes (neurons). Each layer processes data and passes it to the next. The more layers, the deeper the network. But I need to make sure not to get too technical here.
>
> Examples would help. Maybe mention applications like image recognition, speech recognition, natural language processing. For instance, when you use a smartphone's facial recognition, that's deep learning. Or when you ask a virtual assistant like Siri or Alexa, that's also deep learning in action.
>
> I should also touch on how deep learning works. It requires a lot of data and computational power. The process involves training the network with labeled data, adjusting the weights of the connections between neurons through backpropagation. The more data and layers, the better the model can learn complex patterns.
>
> Wait, but the user might not know what backpropagation is. Maybe I should avoid that term unless necessary.

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026

Labels

  • documentation — Improvements or additions to documentation
  • merge-conflicts
  • module:core
  • module:tests
  • ready — ready for review
  • ready-for-test — start test by label for PR


2 participants