[XPU]add xpu ci ep case #4432
Merged
+111
−2
14 commits
3122790  add xpu ci case  (plusNew001)
529258b  Merge branch 'develop' into xpu-add-case  (plusNew001)
b82625c  Add xDeepEP download and build steps  (plusNew001)
f4501e7  Merge branch 'develop' into xpu-add-case  (plusNew001)
5fe0fed  Fix formatting and add missing sleep command  (plusNew001)
235e2b4  Update Docker image version in CI workflow  (plusNew001)
6f1252d  Modify run_ci_xpu.sh for log cleanup and error handling  (plusNew001)
4f4b523  Enhance test_ep.py with process management and assertions  (plusNew001)
771600e  Replace test_fastdeploy_llm with test_fd_ep  (plusNew001)
392b9de  Fix conditional statement in run_ci_xpu.sh  (plusNew001)
7b1eead  Update test_ep.py for string handling and formatting  (plusNew001)
141f874  Merge branch 'develop' into xpu-add-case  (plusNew001)
83f3108  Rename test_ep.py to run_ep.py  (plusNew001)
ab3d524  Change test script from test_ep.py to run_ep.py  (plusNew001)
@@ -0,0 +1,79 @@
import os

import psutil

from fastdeploy import LLM, SamplingParams


def test_fd_ep():
    """Run one chat request with expert parallelism on XPU and check that the output is non-empty."""

    msg1 = [
        {"role": "system", "content": ""},
        {"role": "user", "content": "北京天安门广场在哪里?"},  # "Where is Tiananmen Square in Beijing?"
    ]
    messages = [msg1]

    # Sampling parameters
    sampling_params = SamplingParams(top_p=0, max_tokens=500)

    # Model path and device configuration
    model = os.getenv("model_path", "/home/ERNIE-4.5-300B-A47B-Paddle")
    xpu_visible_devices = os.getenv("XPU_VISIBLE_DEVICES", "0")
    xpu_device_num = len(xpu_visible_devices.split(","))

    enable_expert_parallel = True
    if enable_expert_parallel:
        tensor_parallel_size = 1
        data_parallel_size = xpu_device_num
    else:
        tensor_parallel_size = xpu_device_num
        data_parallel_size = 1

    engine_worker_queue_port = [str(8023 + i * 10) for i in range(data_parallel_size)]
    engine_worker_queue_port = ",".join(engine_worker_queue_port)

    print(f"[INFO] messages: {messages}")

    llm = LLM(
        model=model,
        enable_expert_parallel=enable_expert_parallel,
        tensor_parallel_size=tensor_parallel_size,
        data_parallel_size=data_parallel_size,
        max_model_len=8192,
        quantization="wint4",
        engine_worker_queue_port=engine_worker_queue_port,
        max_num_seqs=8,
    )

    try:
        outputs = llm.chat(messages, sampling_params)
        assert outputs, "❌ LLM inference returned an empty result."

        for idx, output in enumerate(outputs):
            prompt = output.prompt
            generated_text = getattr(output.outputs, "text", "").strip()

            print(f"{'-'*100}")
            print(f"[PROMPT {idx}] {prompt}")
            print(f"{'-'*100}")
            print(f"[GENERATED TEXT] {generated_text}")
            print(f"{'-'*100}")

            # Core assertion: the output must not be empty
            assert generated_text, f"❌ Inference result is empty (index={idx})"

    finally:
        # Clean up child processes whether or not an error occurred
        current_process = psutil.Process(os.getpid())
        for child in current_process.children(recursive=True):
            try:
                child.kill()
                print(f"[CLEANUP] Killed child process {child.pid}")
            except Exception as e:
                print(f"[WARN] Could not kill child process {child.pid}: {e}")
        print("✅ All FastDeploy child processes have been cleaned up.")


if __name__ == "__main__":
    test_fd_ep()
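
For reviewers who want to check the parallel layout without loading a model, the device-count and port arithmetic in the script can be isolated as below. This is only a sketch: `plan_parallel_layout` and its parameter names are hypothetical and not part of this PR or of FastDeploy. The base port 8023 and the stride of 10 are copied from the script above.

```python
import os


def plan_parallel_layout(visible_devices, expert_parallel=True, base_port=8023, port_stride=10):
    """Hypothetical helper mirroring the layout logic in the script above.

    visible_devices is a comma-separated device list such as "0,1,2,3".
    """
    device_num = len(visible_devices.split(","))

    if expert_parallel:
        # Expert parallel: one device per data-parallel rank, no tensor parallelism.
        tensor_parallel_size, data_parallel_size = 1, device_num
    else:
        # Otherwise all devices form a single tensor-parallel group.
        tensor_parallel_size, data_parallel_size = device_num, 1

    # One worker-queue port per data-parallel rank, spaced port_stride apart.
    ports = [str(base_port + i * port_stride) for i in range(data_parallel_size)]

    return {
        "tensor_parallel_size": tensor_parallel_size,
        "data_parallel_size": data_parallel_size,
        "engine_worker_queue_port": ",".join(ports),
    }


if __name__ == "__main__":
    print(plan_parallel_layout(os.getenv("XPU_VISIBLE_DEVICES", "0")))
```

With four visible devices and expert parallel enabled, this gives a data-parallel size of 4 and four comma-separated ports. With expert parallel disabled, it gives a single port and a tensor-parallel size equal to the device count.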