[CI] Add HunyuanVideo 1.5 X2V accuracy tests #3852
5dbb3a3 to bc87e97
|
Local E2E run on physical GPU 3 ( Results:
Notes:
Comparison videos committed in this PR:
Commands run:
CUDA_VISIBLE_DEVICES=3 VLLM_WORKER_MULTIPROC_METHOD=spawn /home/zjy/code/david/.venv/bin/python -m pytest -s -v tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py -m full_model --run-level full_model
CUDA_VISIBLE_DEVICES=3 VLLM_WORKER_MULTIPROC_METHOD=spawn /home/zjy/code/david/.venv/bin/python -m pytest -s -v tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py -m full_model --run-level full_model
CUDA_VISIBLE_DEVICES=3 /home/zjy/code/david/.venv/bin/python -m pytest -s -v tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py -k serving_matches -m full_model --run-level full_model
/home/zjy/.local/bin/uv run --extra dev pre-commit run --files tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_t2v/result/cat_grass/comparison_online_offline.mp4 tests/e2e/accuracy/hunyuanvideo15_i2v/result/cherry_blossom-54880725/comparison_online_offline.mp4 |
|
Update for HunyuanVideo-1.5 I2V accuracy input:
Local E2E run on CUDA_VISIBLE_DEVICES=3 VLLM_WORKER_MULTIPROC_METHOD=spawn \
/home/zjy/code/david/.venv/bin/python -m pytest -s -v \
tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py \
-m full_model --run-level full_model
Result:
After updating the I2V thresholds to Pushed commit: |
|
Update for the HunyuanVideo-1.5 I2V accuracy case:
Pushed commit: |
|
Update after rerunning I2V locally on Changes pushed in
Local validation:
|
|
Local E2E accuracy run update (device 3,
Updated comparison videos committed in this PR:
Thresholds are now set to SSIM |
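For readers unfamiliar with the metric, here is a minimal sketch of how an SSIM threshold gate over two videos could look. It is not the code in this PR: it uses a single-window (global) SSIM over flattened grayscale frames rather than the usual sliding-window variant, and `global_ssim`, `mean_video_ssim`, `assert_similar`, and any `min_ssim` value passed in are hypothetical names and placeholders, not the thresholds chosen here.

```python
from typing import Sequence

Frame = Sequence[float]  # one flattened grayscale frame, values in [0, data_range]


def global_ssim(x: Frame, y: Frame, data_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized frames (a simplification)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("frames must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard SSIM stabilizing constants.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_video_ssim(video_a: Sequence[Frame], video_b: Sequence[Frame]) -> float:
    """Average the per-frame SSIM over two videos with the same frame count."""
    if len(video_a) != len(video_b) or len(video_a) == 0:
        raise ValueError("videos must be non-empty and have the same frame count")
    scores = [global_ssim(a, b) for a, b in zip(video_a, video_b)]
    return sum(scores) / len(scores)


def assert_similar(video_a: Sequence[Frame], video_b: Sequence[Frame], min_ssim: float) -> float:
    """Raise AssertionError when the mean SSIM falls below the caller's threshold."""
    score = mean_video_ssim(video_a, video_b)
    if score < min_ssim:
        raise AssertionError(f"mean SSIM {score:.4f} is below threshold {min_ssim:.4f}")
    return score
```

A test would decode the online and offline outputs into frame lists and call `assert_similar(online_frames, offline_frames, min_ssim=...)` with whatever threshold the test defines.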
|
Follow-up cleanup pushed in
Local E2E rerun on device 3 with
Pre-commit passed for the touched Python files; local DCO scan passed. |
|
Follow-up pushed in
Local E2E rerun on device 3 with
Pre-commit passed for the touched Python files; local DCO scan passed. |
|
CI: |
|
Fixed the UT collection failure in Root cause: Validation:
|
Signed-off-by: david6666666 <530634352@qq.com>
0efcacb to ab8b1e8
|
Rebased PR branch onto latest Conflict resolution:
Validation:
CI has been re-triggered after the rebase. |
|
@Gaohan123 @lishunyang12 ptal |
from diffusers import UniPCMultistepScheduler
from PIL import Image

from tests.e2e.accuracy.helpers import (
Do the Hunyuan models need to import these helpers as well?
Yes, they use the same utility functions.
lookup_name = original_name.replace(weight_name, param_name)
if lookup_name not in params_dict:
    break
maybe_lookup_name = original_name.replace(weight_name, param_name)
Could you explain why we need to change the model file? Is this a real bug?
Root cause fixed: HunyuanVideo 1.5 transformer weight loading was incorrectly treating token-refiner attn.to_q/to_k/to_v weights as fused to_qkv candidates. Those non-fused token-refiner weights were left randomly initialized across processes. The loader now only uses the fused mapping when the fused target parameter exists; otherwise it falls back to normal parameter-name loading.
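The fallback described above can be sketched roughly as follows. This is an illustration only, not the actual loader: `STACKED_PARAMS_MAPPING`, `resolve_target`, and the parameter names in the usage note are hypothetical, and a real loader would also copy the tensor into the resolved parameter.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping: (fused parameter fragment, checkpoint weight fragment, shard id).
STACKED_PARAMS_MAPPING = [
    (".to_qkv.", ".to_q.", "q"),
    (".to_qkv.", ".to_k.", "k"),
    (".to_qkv.", ".to_v.", "v"),
]


def resolve_target(original_name: str, params_dict: Dict[str, object]) -> Tuple[str, Optional[str]]:
    """Return (parameter name to load into, shard id or None for a plain load)."""
    for param_name, weight_name, shard_id in STACKED_PARAMS_MAPPING:
        if weight_name not in original_name:
            continue
        maybe_lookup_name = original_name.replace(weight_name, param_name)
        # Use the fused mapping only when the fused target parameter exists.
        if maybe_lookup_name in params_dict:
            return maybe_lookup_name, shard_id
        # Otherwise keep going and fall back to normal name-based loading below.
    if original_name in params_dict:
        return original_name, None
    raise KeyError(f"no parameter found for checkpoint weight {original_name!r}")
```

With a module that has a fused `blocks.0.attn.to_qkv.weight` but separate `token_refiner.0.attn.to_q.weight` parameters, the first resolves to the fused target with a shard id, while the second resolves to its own name with `None`, so it is no longer skipped.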
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Summary
tests/e2e/accuracy/helpers.py, then reuse them from Wan2.2 I2V.
:full_moon: Diffusion X2V · Accuracy Test Buildkite step.
Tests
/home/zjy/code/david/.venv/bin/python -m py_compile tests/e2e/accuracy/helpers.py tests/e2e/accuracy/conftest.py tests/e2e/accuracy/wan22_i2v/test_wan22_i2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py
/home/zjy/code/david/.venv/bin/python -m pytest --collect-only -q tests/e2e/accuracy/wan22_i2v/test_wan22_i2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py
/home/zjy/code/david/.venv/bin/python -m pytest -q tests/e2e/accuracy/wan22_i2v/test_wan22_i2v_video_similarity.py -k 'parse or build_diffusers_command or resolve_image_source or online_timeout or artifact_dir or resize_to_target or configure_scheduler or ensure_wan_ftfy_fallback or send_video_request'
/home/zjy/.local/bin/uv run --extra docs ruff check tests/e2e/accuracy/helpers.py tests/e2e/accuracy/conftest.py tests/e2e/accuracy/wan22_i2v/test_wan22_i2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py
/home/zjy/.local/bin/uv run --extra docs ruff format --check tests/e2e/accuracy/helpers.py tests/e2e/accuracy/conftest.py tests/e2e/accuracy/wan22_i2v/test_wan22_i2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_t2v/test_hunyuanvideo15_t2v_video_similarity.py tests/e2e/accuracy/hunyuanvideo15_i2v/test_hunyuanvideo15_i2v_video_similarity.py
Note: the new H100 video generation benchmarks were collected but not executed locally.
t2v:
comparison_online_offline.3.mp4
i2v:
comparison_online_offline.4.mp4