@CSWYF3634076 commented Oct 13, 2025

Purpose

Fix the following error, introduced by PR #22100:

(EngineCore_DP0 pid=539437)   File "/root/paddlejob/wangyafeng/myGithub/vllm/vllm/model_executor/models/ernie45_moe.py", line 442, in __init__
(EngineCore_DP0 pid=539437)     self.start_layer, self.end_layer, self.layers = make_layers(
(EngineCore_DP0 pid=539437)                                                     ^^^^^^^^^^^^
(EngineCore_DP0 pid=539437)   File "/root/paddlejob/wangyafeng/myGithub/vllm/vllm/model_executor/models/utils.py", line 640, in make_layers
(EngineCore_DP0 pid=539437)     maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(EngineCore_DP0 pid=539437)                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=539437)   File "/root/paddlejob/wangyafeng/myGithub/vllm/vllm/model_executor/models/ernie45_moe.py", line 444, in <lambda>
(EngineCore_DP0 pid=539437)     lambda prefix: Ernie4_5_MoeDecoderLayer(
(EngineCore_DP0 pid=539437)                    ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=539437)   File "/root/paddlejob/wangyafeng/myGithub/vllm/vllm/model_executor/models/ernie45_moe.py", line 369, in __init__
(EngineCore_DP0 pid=539437)     self.mlp = Ernie4_5_MoeMoE(
(EngineCore_DP0 pid=539437)                ^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=539437)   File "/root/paddlejob/wangyafeng/myGithub/vllm/vllm/model_executor/models/ernie45_moe.py", line 147, in __init__
(EngineCore_DP0 pid=539437)     self.n_physical_experts = self.n_logical_experts + self.n_redundant_experts
(EngineCore_DP0 pid=539437)                               ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
(EngineCore_DP0 pid=539437) TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
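
The last frame adds an int to a value that is None when EPLB is disabled. Below is a minimal sketch of that failure mode, and of reading the redundant-expert count from a config object whose field defaults to an integer. The class and function names are hypothetical stand-ins for illustration, not vLLM's actual classes, and the expert count of 64 is an arbitrary example value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EPLBConfigSketch:
    # Hypothetical stand-in: the count defaults to 0 instead of None.
    num_redundant_experts: int = 0


@dataclass
class ParallelConfigSketch:
    # Hypothetical stand-in: this field stays None when EPLB is disabled.
    num_redundant_experts: Optional[int] = None
    eplb_config: EPLBConfigSketch = field(default_factory=EPLBConfigSketch)


def physical_experts_unsafe(n_logical: int, cfg: ParallelConfigSketch) -> int:
    # Raises TypeError when cfg.num_redundant_experts is None.
    return n_logical + cfg.num_redundant_experts


def physical_experts_safe(n_logical: int, cfg: ParallelConfigSketch) -> int:
    # Always adds an int, because the nested config has an integer default.
    return n_logical + cfg.eplb_config.num_redundant_experts


if __name__ == "__main__":
    cfg = ParallelConfigSketch()
    print(physical_experts_safe(64, cfg))
    try:
        physical_experts_unsafe(64, cfg)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```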

Test Plan

Run the test script below (test_eplb.py) in both modes:
python test_eplb.py --mode eplb
python test_eplb.py --mode normal

import json
import os
import argparse
from vllm import LLM, SamplingParams

prompt = "Explain the theory of relativity in simple terms."

RESULT_FILE = "eplb_test_output.json"

sampling_params = SamplingParams(
    temperature=0.0,
    top_p=1.0,
    top_k=1,
    max_tokens=4096
)

def run_inference(model_path: str, enable_eplb: bool, num_redundant_experts: int = 0) -> str:
    """Generate a completion for the fixed prompt, with or without EPLB."""
    print(f"Running inference with EPLB={enable_eplb}, redundant experts={num_redundant_experts}")

    llm = LLM(
        model=model_path,
        tensor_parallel_size=2,
        enable_expert_parallel=True,
        enable_eplb=enable_eplb,
        num_redundant_experts=num_redundant_experts if enable_eplb else 0,
        eplb_window_size=1000,
        eplb_step_interval=100,
        enforce_eager=True,
        trust_remote_code=True
    )
    
    result = llm.generate([prompt], sampling_params)
    output_text = result[0].outputs[0].text.strip()
    
    print("Output:")
    print(output_text)
    print("-" * 50)

    return output_text

def save_result(key: str, value: str):
    """Store one run's output text under `key` in RESULT_FILE, keeping earlier entries."""
    if os.path.exists(RESULT_FILE):
        with open(RESULT_FILE, "r") as f:
            results = json.load(f)
    else:
        results = {}

    results[key] = value

    with open(RESULT_FILE, "w") as f:
        json.dump(results, f, indent=2)

    print(f"Output saved to {RESULT_FILE}")

def load_results():
    if os.path.exists(RESULT_FILE):
        with open(RESULT_FILE, "r") as f:
            return json.load(f)
    return {}

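if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", type=str, choices=["eplb", "normal", "compare"], required=True)
    args = parser.parse_args()

    # Hugging Face model ID; overridden on the next line by a local checkpoint path.
    MODEL_PATH = "baidu/ERNIE-4.5-21B-A3B-PT"
    MODEL_PATH = "/root/paddlejob/wangyafeng/models/ERNIE-4.5-21B-A3B-PT"

    if args.mode == "eplb":
        outputs = run_inference(MODEL_PATH, enable_eplb=True, num_redundant_experts=32)
        save_result("eplb", outputs)
    elif args.mode == "normal":
        outputs = run_inference(MODEL_PATH, enable_eplb=False)
        save_result("normal", outputs)
    elif args.mode == "compare":
        # Report whether the two saved runs produced identical text.
        results = load_results()
        missing = [k for k in ("eplb", "normal") if k not in results]
        if missing:
            raise SystemExit(f"Missing results for: {', '.join(missing)}; run those modes first.")
        print("Outputs match:", results["eplb"] == results["normal"])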

Test Result

{
  "normal": "The **theory of relativity**, developed by Albert Einstein in the early 20th century, is a revolutionary framework for understanding space, time, and gravity. It has two main parts: **special relativity** and **general relativity**, each addressing different aspects of the universe.\n\n### **1. Special Relativity (1905)**\n**Focus:** How space and time behave at high speeds (close to the speed of light).  \n**Key Ideas:**\n- **Time Dilation:** Time slows down for objects moving at very high speeds (or in strong gravitational fields). For example, a clock on a fast-moving spaceship would tick slower than one on Earth.\n- **Length Contraction:** Objects appear shorter along the direction of motion when moving at high speeds.\n- **Mass-Energy Equivalence:** Energy (*E*) and mass (*m*) are interchangeable, described by *E = mc\u00b2*. This explains why nuclear reactions release so much energy.\n- **Relativity of Simultaneity:** Events that seem simultaneous to one observer may not be to another moving at a different speed.\n\n**Why It Matters:** It overturned classical Newtonian physics, showing that space and time are interconnected (now called **spacetime**) and that nothing can travel faster than light.\n\n### **2. General Relativity (1915)**\n**Focus:** How gravity works and how massive objects warp spacetime.  \n**Key Ideas:**\n- **Gravity as Curvature:** Instead of gravity being a force (like magnetism), it\u2019s the result of massive objects bending spacetime. For example, Earth orbits the Sun because the Sun curves spacetime around it.\n- **Equivalence Principle:** A person in a closed elevator can\u2019t tell if they\u2019re being pulled by gravity or accelerating upward\u2014gravity and acceleration are locally indistinguishable.\n- **Black Holes & Gravitational Waves:** Extreme curvature of spacetime can create black holes (where gravity is so strong nothing escapes) and ripples in spacetime called gravitational waves.\n\n**Why It Matters:** It predicted phenomena like the bending of light around massive objects (used in GPS to correct time delays) and the expansion of the universe.\n\n### **Simple Analogy**\nImagine spacetime as a stretched rubber sheet. A heavy ball (like the Sun) sinks into the sheet, creating a \"dent.\" A smaller ball (like Earth) rolls into the dent, following the curve\u2014this is gravity! Special relativity is like zooming in to see how the sheet stretches and bends at different points.\n\n### **Why It\u2019s Important**\n- **Technology:** GPS satellites rely on corrections from both theories to stay accurate.\n- **Cosmology:** It explains the Big Bang, black holes, and the universe\u2019s expansion.\n- **Philosophy:** It redefined our understanding of reality, showing that time and space are dynamic, not absolute.\n\nIn short, Einstein\u2019s theory reshaped physics by merging space, time, and gravity into a single, flexible framework\u2014proving that the universe is stranger and more interconnected than we ever imagined.",
  "eplb": "The **theory of relativity**, developed by Albert Einstein in the early 20th century, consists of two main parts: **special relativity** and **general relativity**. Here's a simple breakdown:\n\n### **1. Special Relativity (1905)**\n- **Key Idea**: Time and space are not absolute\u2014they depend on your motion.\n- **Two Big Effects**:\n  - **Time Dilation**: Moving clocks run slower. For example, if you\u2019re on a fast train, your watch might tick slightly slower than someone standing still.\n  - **Length Contraction**: Objects appear shorter when moving. A train moving at 90% the speed of light would look 40% shorter to a stationary observer.\n- **Why It Matters**: It shows that the laws of physics are the same for all observers moving at constant speeds, no matter how fast they\u2019re going.\n\n### **2. General Relativity (1915)**\n- **Key Idea**: Gravity isn\u2019t a force\u2014it\u2019s the curvature of spacetime caused by mass and energy.\n- **How It Works**:\n  - Imagine spacetime as a stretched rubber sheet. Heavy objects (like planets) sink into the sheet, creating a \"dent.\" Smaller objects roll into this dent, following a curved path (like planets orbiting the Sun).\n  - This explains why objects fall to Earth and why planets orbit the Sun.\n- **Why It Matters**: It unified gravity with the other fundamental forces and predicted phenomena like black holes and gravitational waves.\n\n### **Why It\u2019s Important**\n- **Special Relativity** revolutionized physics by showing that time and space are interconnected (E=mc\u00b2, the famous mass-energy equivalence).\n- **General Relativity** reshaped our understanding of the universe, explaining everything from the Big Bang to the expansion of space itself.\n\n### **Simple Analogy**\nThink of spacetime as a 4D \"fabric\" (3D space + time). Massive objects warp this fabric, and other objects move along the curves. Special relativity deals with flat spacetime (no gravity), while general relativity handles the warped stuff.\n\nIn short: **Relativity says that time, space, and gravity are all connected, and they behave differently depending on how you\u2019re moving or how much mass is around.** \ud83c\udf0c\ud83d\ude80"
}


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a critical bug that caused the ernie45 model to fail during loading when EPLB was disabled. The root cause, an incorrect access to a potentially None configuration value, is correctly fixed by using the eplb_config which provides a safe default. Furthermore, the changes to the MoE weight loading mechanism are a significant improvement, making it more robust for EPLB scenarios with redundant experts. By checking for loading success, the new logic correctly handles cases where an expert's weights might not be on a particular rank, allowing it to try other replicas. The implementation is clean and directly solves the reported issues.
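
The replica-fallback idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from this PR: `try_load_physical`, `load_logical_expert`, and the dict-based `local_experts` store are hypothetical names, and a rank is modeled as owning only the physical expert IDs present in its dict.

```python
from typing import Dict, List, Optional


def try_load_physical(
    local_experts: Dict[int, Optional[str]], physical_id: int, weight: str
) -> bool:
    """Load `weight` into one physical expert slot; report whether it lives on this rank."""
    if physical_id not in local_experts:
        return False  # This replica is hosted on another rank.
    local_experts[physical_id] = weight
    return True


def load_logical_expert(
    local_experts: Dict[int, Optional[str]], replicas: List[int], weight: str
) -> bool:
    """Try each physical replica of a logical expert until one load succeeds."""
    for physical_id in replicas:
        if try_load_physical(local_experts, physical_id, weight):
            return True
    return False  # No replica of this expert is local; the caller can skip the weight.


if __name__ == "__main__":
    # This rank hosts physical experts 0 and 2 only.
    local = {0: None, 2: None}
    print(load_logical_expert(local, [1, 2], "w_a"))  # replica 1 is remote, 2 is local
    print(load_logical_expert(local, [1, 3], "w_b"))  # neither replica is local
    print(local)
```

Returning a success flag instead of raising lets the caller distinguish "not on this rank" from a real loading error, which is the behavior the review comment describes.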

@CSWYF3634076

cc @HsChen-sys @abmfy


@DarkLight1337 DarkLight1337 left a comment

Thanks, sorry this got broken.

@DarkLight1337 enabled auto-merge (squash) October 14, 2025 05:03
@github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Oct 14, 2025
@DarkLight1337 merged commit 01ad27f into vllm-project:main Oct 14, 2025
57 checks passed
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025