[Main2Main] Upgrade vllm commit to v0.15.0rc0 #6304

Closed
shen-shanshan wants to merge 8 commits into vllm-project:main from shen-shanshan:main2main

@shen-shanshan
Collaborator

@shen-shanshan shen-shanshan commented Jan 27, 2026

What this PR does / why we need it?

  1. Fix TypeError: MMEncoderAttention.__init__() got an unexpected keyword argument 'multimodal_config', caused by vllm#31972 ([Models]: Make Multimodal config implicit in ViT implementation).
  2. Fix _shared_experts: 'NoneType' object is not callable, caused by vllm#32082 ([Models] Add SharedFusedMoE support to Qwen3MoE); fixed by #6335 ([Main2Main][BugFix] Add shared_experts check for AscendSharedFusedMoE).
  3. Fix ReshapeAndCacheOperation setup failed!, caused by vllm#25954 ([Performance] Split FlashAttn attention and cache update), by registering the unified_kv_cache_update custom op.
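
Item 1 is the usual failure when an upstream constructor drops a keyword argument that a downstream subclass still forwards. A minimal sketch of that pattern follows. It uses stand-in classes rather than the real vLLM or vllm-ascend code; UpstreamAttention, StaleSubclass, FixedSubclass and try_build are hypothetical names, and the constructor parameters are assumptions.

```python
class UpstreamAttention:
    """Stand-in for an upstream layer after it dropped `multimodal_config`."""

    def __init__(self, num_heads: int, head_size: int) -> None:
        self.num_heads = num_heads
        self.head_size = head_size


class StaleSubclass(UpstreamAttention):
    """Still forwards the removed keyword, so construction fails."""

    def __init__(self, num_heads, head_size, multimodal_config=None):
        super().__init__(num_heads, head_size, multimodal_config=multimodal_config)


class FixedSubclass(UpstreamAttention):
    """Matches the new upstream signature: the obsolete keyword is gone."""

    def __init__(self, num_heads: int, head_size: int) -> None:
        super().__init__(num_heads, head_size)


def try_build(cls, **kwargs):
    """Return the instance, or the TypeError message if construction fails."""
    try:
        return cls(**kwargs)
    except TypeError as exc:
        return str(exc)


stale_result = try_build(StaleSubclass, num_heads=8, head_size=64)
fixed_result = try_build(FixedSubclass, num_heads=8, head_size=64)
print(stale_result)                 # the TypeError message, naming the keyword
print(type(fixed_result).__name__)  # the class that constructed successfully
```

The fix in this sketch is simply to stop passing the argument; whether the downstream class needs the value from elsewhere depends on the real code.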

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request upgrades the vLLM commit to align with v0.15.0rc0, primarily to resolve a TypeError related to the multimodal_config argument in MMEncoderAttention. The changes correctly update the dependency commit in the documentation and remove the now-obsolete parameter from the AscendMMEncoderAttention class. The modifications are appropriate and address the stated issue.

Comment thread .github/workflows/pr_test_full.yaml Outdated
@shen-shanshan shen-shanshan added the labels ready (ready for review) and ready-for-test (start test by label for PR) on Jan 27, 2026
Comment thread docs/source/community/versioning_policy.md Outdated
@wjunLu
Collaborator

wjunLu commented Jan 27, 2026

https://github.com/vllm-project/vllm/pull/32082 breaks this https://github.com/vllm-project/vllm-ascend/actions/runs/21391619327/job/61579739637?pr=6304#step:11:1116

@wjunLu
Collaborator

wjunLu commented Jan 28, 2026

https://github.com/vllm-project/vllm/pull/32082 breaks this https://github.com/vllm-project/vllm-ascend/actions/runs/21391619327/job/61579739637?pr=6304#step:11:1116

#6335 fixed the break above.
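
The _shared_experts: 'NoneType' object is not callable error is what Python raises when an optional submodule is called without first checking that it exists. A sketch of the kind of guard such a fix needs, assuming a layer whose shared experts are optional. This is not the actual change in #6335; SharedMoESketch and its constructor arguments are hypothetical stand-ins.

```python
from typing import Callable, Optional


class SharedMoESketch:
    """Hypothetical stand-in for a fused-MoE layer with optional shared experts."""

    def __init__(
        self,
        routed_experts: Callable[[float], float],
        shared_experts: Optional[Callable[[float], float]] = None,
    ) -> None:
        self._routed_experts = routed_experts
        self._shared_experts = shared_experts

    def forward(self, hidden: float) -> tuple[Optional[float], float]:
        # Calling self._shared_experts unconditionally raises
        # "TypeError: 'NoneType' object is not callable" when it is unset.
        shared_out = None
        if self._shared_experts is not None:
            shared_out = self._shared_experts(hidden)
        routed_out = self._routed_experts(hidden)
        return shared_out, routed_out


without_shared = SharedMoESketch(routed_experts=lambda x: x * 2.0)
with_shared = SharedMoESketch(
    routed_experts=lambda x: x * 2.0,
    shared_experts=lambda x: x + 1.0,
)
print(without_shared.forward(3.0))
print(with_shared.forward(3.0))
```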

@shen-shanshan
Collaborator Author

shen-shanshan commented Jan 28, 2026

ReshapeAndCacheOperation setup failed!
Exception raised from OperationSetup at build/third_party/op-plugin/op_plugin/CMakeFiles/op_plugin_atb.dir/compiler_depend.ts:203 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xb0 (0xffffa90fc700 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x68 (0xffffa909a860 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #2: atb::OperationSetup(atb::VariantPack, atb::Operation*, atb::Context*) + 0x278 (0xfffdd98c0498 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so)
frame #3: <unknown function> + 0xb1b74 (0xfffdd98c1b74 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so)
frame #4: <unknown function> + 0x2c77e24 (0xfffdfd047e24 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #5: <unknown function> + 0xa607d0 (0xfffdfae307d0 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #6: <unknown function> + 0xa613ac (0xfffdfae313ac in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #7: <unknown function> + 0xa5f2c8 (0xfffdfae2f2c8 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #8: <unknown function> + 0xda294 (0xffffb7a6a294 in /root/miniconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #9: <unknown function> + 0x80398 (0xffffb7c10398 in /lib/aarch64-linux-gnu/libc.so.6)
frame #10: <unknown function> + 0xe9e9c (0xffffb7c79e9c in /lib/aarch64-linux-gnu/libc.so.6)

(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:72] Dumping input data for V1 LLM engine (v0.15.0rc0) with config: model='/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo', speculative_config=None, tokenizer='/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=448, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=False, enable_return_routed_experts=False, kv_cache_dtype=auto, device_config=npu, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, kv_cache_metrics=False, kv_cache_metrics_sample=0.01, cudagraph_metrics=False, enable_layerwise_nvtx_tracing=False, enable_mfu_metrics=False, enable_mm_processor_stats=False, enable_logging_iteration_details=False), seed=0, served_model_name=/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo, enable_prefix_caching=False, enable_chunked_prefill=False, pooler_config=None, compilation_config={'level': None, 'mode': <CompilationMode.VLLM_COMPILE: 3>, 'debug_dump_path': None, 'cache_dir': '/root/.cache/vllm/torch_compile_cache/7423e1a4bc', 'compile_cache_save_format': 'binary', 'backend': 'vllm_ascend.compilation.compiler_interface.AscendCompiler', 'custom_ops': ['all'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 
'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 'vllm::kda_attention', 'vllm::sparse_attn_indexer', 'vllm::rocm_aiter_sparse_attn_indexer', 'vllm::mla_forward', 'vllm::mla_forward'], 'compile_mm_encoder': False, 'compile_sizes': [], 'compile_ranges_split_points': [2240], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.PIECEWISE: 1>, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4, 8], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'fuse_norm_quant': True, 'fuse_act_quant': True, 'fuse_attn_quant': False, 'eliminate_noops': True, 'enable_sp': False, 'fuse_gemm_comms': False, 'fuse_allreduce_rms': False}, 'max_cudagraph_capture_size': 8, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>, 'evaluate_guards': False, 'assume_32_bit_indexing': True}, 'local_cache_dir': '/root/.cache/vllm/torch_compile_cache/7423e1a4bc/rank_0_0/backbone'}, 
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=0-a72d9ae6,prompt_token_ids_len=4,prefill_token_ids_len=None,mm_features=[MultiModalFeatureSpec(data={'input_features': MultiModalFieldElem(modality='audio', key='input_features', data=tensor([[-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         ...,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781]],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]        dtype=torch.bfloat16), field=MultiModalBatchedField(keep_on_cpu=False))}, modality='audio', identifier='103d3b4774b9507ef036a4103955d46a4d6eb3816a10255bb0810cf7cea178bf', mm_position=PlaceholderRange(offset=0, length=1500, is_embed=None), mm_hash='103d3b4774b9507ef036a4103955d46a4d6eb3816a10255bb0810cf7cea178bf')],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.2, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=10, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1], [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=[],resumed_req_ids=set(),new_token_ids_lens=[],all_token_ids_lens={},new_block_ids=[],num_computed_tokens=[],num_output_tokens=[]), num_scheduled_tokens={0-a72d9ae6: 4}, total_num_scheduled_tokens=4, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={0-a72d9ae6: [0]}, num_common_prefix_blocks=[0, 0], finished_req_ids=[], free_encoder_mm_hashes=[], preempted_req_ids=[], has_structured_output_requests=false, pending_structured_output_tokens=false, num_invalid_spec_tokens=null, kv_connector_metadata=null, ec_connector_metadata=null)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] Traceback (most recent call last):
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 928, in run_engine_core
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     engine_core.run_busy_loop()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 955, in run_busy_loop
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     self._process_engine_step()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 988, in _process_engine_step
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                               ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 490, in step_with_batch_queue
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     exec_model_fut.result()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/concurrent/futures/_base.py", line 449, in result
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.__get_result()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     raise self._exception
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/serial_utils.py", line 461, in run_method
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 365, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.worker.execute_model(scheduler_output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 367, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.model_runner.execute_model(scheduler_output,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1433, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self._generate_process_reqs_hidden_states(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1023, in _generate_process_reqs_hidden_states
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self.model(input_ids=input_ids,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 897, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     decoder_outputs = self.model(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                       ^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 590, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     decoder_outputs = self.decoder(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                       ^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 396, in __call__
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.forward(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 561, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = decoder_layer(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 430, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self.encoder_attn(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 300, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     attn_output = self.attn(q, k, v)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                   ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/layer.py", line 415, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     torch.ops.vllm.unified_attention_with_output(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/_ops.py", line 1255, in __call__
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._op(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/utils/kv_transfer_utils.py", line 39, in wrapper
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/layer.py", line 885, in unified_attention_with_output
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     self.impl.forward(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/layers/attention/cross_attention.py", line 144, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return super().forward(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 941, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.forward_impl(query, key, value, kv_cache, attn_metadata, output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 898, in forward_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.forward_fused_infer_attention(query, key, value, attn_metadata, output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 767, in forward_fused_infer_attention
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     key, value, block_size, block_table, actual_seq_lengths_kv = self._get_fia_params(key, value, attn_metadata)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.
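
The traceback above is hard to scan because every line carries the engine-core log prefix. A small helper that strips it and lists the frames could look like the sketch below. It is written against the prefix format visible in this thread only, so other log formats are not covered, and strip_prefix and frames are hypothetical names.

```python
import re

# Matches prefixes such as
# "(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] ".
# The pattern is inferred from the log lines quoted in this thread only.
PREFIX = re.compile(
    r"^\(\w+ pid=\d+\) (?:[A-Z]+ \d{2}-\d{2} \d{2}:\d{2}:\d{2} \[[\w.]+:\d+\] ?)?"
)


def strip_prefix(line: str) -> str:
    """Remove the process/log prefix, keeping the traceback indentation."""
    return PREFIX.sub("", line, count=1)


def frames(log_text: str) -> list[str]:
    """Return the 'File ..., line N, in func' entries in call order."""
    cleaned = (strip_prefix(line) for line in log_text.splitlines())
    return [line.strip() for line in cleaned if line.lstrip().startswith('File "')]


sample = """\
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 767, in forward_fused_infer_attention
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     key, value, block_size, block_table, actual_seq_lengths_kv = self._get_fia_params(key, value, attn_metadata)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
"""

for frame in frames(sample):
    print(frame)
```

The last frame printed is where Python was executing when the error surfaced; the RuntimeError itself names ReshapeCacheOperation as the working operator.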

@wjunLu
Collaborator

wjunLu commented Jan 28, 2026

(EngineCore_DP0 pid=42706)   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=42706)     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=42706)                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=42706) RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.

what's the failed test case?

@shen-shanshan
Collaborator Author

(EngineCore_DP0 pid=42706)   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=42706)     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=42706)                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=42706) RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.

what's the failed test case?

FAILED tests/e2e/singlecard/test_models.py::test_whisper[openai-mirror/whisper-large-v3-turbo]
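
To reproduce a single failure locally, the node ID from the FAILED line can be passed straight to pytest. The sketch below builds that command; pytest_command is a hypothetical helper, and it assumes pytest is installed and run from the repository root.

```python
import shlex


def pytest_command(summary_line: str) -> list[str]:
    """Build a pytest argument list from a 'FAILED <node id>' summary line."""
    prefix = "FAILED "
    if not summary_line.startswith(prefix):
        raise ValueError(f"not a FAILED summary line: {summary_line!r}")
    # pytest may append ' - <message>' after the node id; drop it if present.
    node_id = summary_line[len(prefix):].split(" - ", 1)[0].strip()
    return ["pytest", node_id]


line = "FAILED tests/e2e/singlecard/test_models.py::test_whisper[openai-mirror/whisper-large-v3-turbo]"
cmd = pytest_command(line)
# shlex.join quotes the node id so the square brackets survive a shell.
print(shlex.join(cmd))
```

Passing the list form to subprocess.run avoids shell quoting altogether.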

@shen-shanshan
Collaborator Author

(EngineCore_DP0 pid=42706)   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=42706)     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=42706)                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=42706) RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.

what's the failed test case?

FAILED tests/e2e/singlecard/test_models.py::test_whisper[openai-mirror/whisper-large-v3-turbo]

This is due to vllm-project/vllm@a28b94e.

@shen-shanshan
Collaborator Author

shen-shanshan commented Jan 28, 2026

Failed tests/e2e/singlecard/model_runner_v2/test_basic.py::test_qwen3_dense_eager_mode[True-32-Qwen/Qwen3-0.6B]
Failed tests/e2e/singlecard/model_runner_v2/test_basic.py::test_egale_spec_decoding[True-32-vllm-ascend/EAGLE-LLaMA3.1-Instruct-8B-LLM-Research/Meta-Llama-3.1-8B-Instruct]


(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] EngineCore failed to start.
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] Traceback (most recent call last):
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 926, in run_engine_core
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     super().__init__(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 105, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self._init_executor()
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/executor/uniproc_executor.py", line 47, in _init_executor
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.driver_worker.init_device()
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 326, in init_device
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.worker.init_device()  # type: ignore
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 298, in init_device
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.model_runner = NPUModelRunnerV2(self.vllm_config, self.device)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/v2/model_runner.py", line 49, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     super().__init__(vllm_config, device)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/model_runner.py", line 142, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.req_states = RequestState(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                       ^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/states.py", line 34, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.prefill_token_ids = StagedWriteTensor(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                              ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/buffer_utils.py", line 110, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self._uva_buf = UvaBuffer(size, dtype)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                     ^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/buffer_utils.py", line 20, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.uva = get_cuda_view_from_cpu_tensor(self.cpu)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/utils/torch_utils.py", line 659, in get_cuda_view_from_cpu_tensor
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     return torch.ops._C.get_cuda_view_from_cpu_tensor(cpu_tensor)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/_ops.py", line 1365, in __getattr__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     raise AttributeError(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] AttributeError: '_OpNamespace' '_C' object has no attribute 'get_cuda_view_from_cpu_tensor'
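
The traceback above ends with `torch/_ops.py` raising `AttributeError` because `get_cuda_view_from_cpu_tensor` is not available under `torch.ops._C` in this environment. Below is a minimal sketch of probing an op namespace before calling into it. It is not the fix applied in this PR: `find_op` and the `fake_ops` stand-in are hypothetical names used only for illustration, and the stand-in replaces the real `torch.ops._C` so the snippet runs without torch installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def find_op(namespace: Any, name: str) -> Optional[Callable]:
    """Return the named op if the namespace exposes it, else None."""
    # getattr with a default swallows only AttributeError, which is the
    # exception the traceback shows for an unknown op name.
    return getattr(namespace, name, None)


# Hypothetical stand-in for an op namespace such as torch.ops._C.
fake_ops = SimpleNamespace(known_op=lambda x: x)

missing = find_op(fake_ops, "get_cuda_view_from_cpu_tensor")
present = find_op(fake_ops, "known_op")

if missing is None:
    print("op not registered; a caller could take a fallback path here")
```

A caller could use a check like this to choose a fallback path instead of failing during `init_device`. Whether such a fallback is appropriate for `UvaBuffer` is not something the log settles.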

@shen-shanshan
Collaborator Author

RuntimeError: Cannot run aclop operators during NPU graph capture. Current working aclop is BatchMatMul. If you need this call to be captured, please try to set torch.npu.config.allow_internal_format = False. If still fail, the operator needs aclnn implementation and please file an issue. Current npuStreamCaptureStatus: npuStreamCaptureStatusActive

@shen-shanshan
Collaborator Author

FAILED tests/e2e/multicard/2-cards/test_external_launcher.py::test_qwen3_external_launcher_with_sleepmode


Traceback (most recent call last):
  File "/usr/local/python3.11.14/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/python3.11.14/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/__w/vllm-ascend/vllm-ascend/examples/offline_external_launcher.py", line 195, in main
    llm = LLM(
          ^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/entrypoints/llm.py", line 334, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/llm_engine.py", line 172, in from_engine_args
    return cls(
           ^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/llm_engine.py", line 106, in __init__
    self.engine_core = EngineCoreClient.make_client(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core_client.py", line 96, in make_client
    return InprocClient(vllm_config, executor_class, log_stats)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core_client.py", line 269, in __init__
    self.engine_core = EngineCore(*args, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 112, in __init__
    num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 242, in _initialize_kv_caches
    available_gpu_memory = self.model_executor.determine_available_memory()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 180, in determine_available_memory
    memory = super().determine_available_memory()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
    return self.collective_rpc("determine_available_memory")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
    result = run_method(self.driver_worker, method, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/serial_utils.py", line 461, in run_method
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/worker.py", line 320, in determine_available_memory
    assert self.init_npu_memory > free_npu_memory, (
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Error in memory profiling. Initial free memory 42652123136, current free memory 47196426240. This happens when the NPU memory was not properly cleaned up before initializing the vLLM instance.
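
The two byte counts in the assertion message can be compared directly. The sketch below uses only the values printed in the log and the comparison shown in the traceback (`self.init_npu_memory > free_npu_memory`). The variable names are illustrative and are not taken from `worker.py`.

```python
GIB = 1024 ** 3

# Values copied from the AssertionError message above.
initial_free = 42_652_123_136   # "Initial free memory"
current_free = 47_196_426_240   # "current free memory"

# The worker asserts that free memory shrank during profiling.
check_passes = initial_free > current_free

# Positive delta means free memory grew while the instance was starting.
delta = current_free - initial_free

print(f"initial free: {initial_free / GIB:.2f} GiB")
print(f"current free: {current_free / GIB:.2f} GiB")
print(f"grew by:      {delta} bytes ({delta / GIB:.2f} GiB)")
print(f"assertion would pass: {check_passes}")
```

Free memory rose by 4544303104 bytes (about 4.23 GiB) between the two measurements, so the assertion fails. This matches the message's own explanation that NPU memory was not fully released before this vLLM instance was initialized.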

@shen-shanshan
Collaborator Author

(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] EngineCore failed to start.
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] Traceback (most recent call last):
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 918, in run_engine_core
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 1268, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     super().__init__(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     super().__init__(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 112, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 242, in _initialize_kv_caches
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     available_gpu_memory = self.model_executor.determine_available_memory()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self.collective_rpc("determine_available_memory")
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/serial_utils.py", line 461, in run_method
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/worker.py", line 313, in determine_available_memory
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.model_runner.profile_run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 2344, in profile_run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._dummy_run(mc2_tokens_capacity,
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 2291, in _dummy_run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     outputs = self._model_forward(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1688, in _model_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.model(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self._call_impl(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return forward_call(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 773, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.model(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 561, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     output = TorchCompileWithNoGuardsWrapper.__call__(self, *args, **kwargs)  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/wrapper.py", line 228, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self._call_with_optional_nvtx_range(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/wrapper.py", line 119, in _call_with_optional_nvtx_range
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return callable_fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 832, in compile_wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1874, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = self._torchdynamo_orig_backend(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 688, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = _compile(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1433, in _compile
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     guarded_code, tracer_output = compile_inner(code, one_graph, hooks)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_utils_internal.py", line 92, in wrapper_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return function(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1117, in compile_inner
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return _compile_inner(code, one_graph, hooks)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1151, in _compile_inner
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     dynamo_output = compile_frame(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1032, in compile_frame
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     bytecode, tracer_output = transform_code_object(code, transform)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1592, in transform_code_object
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer_output = transformations(instructions, code_options)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1004, in transform
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer_output = trace_frame(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 312, in _fn
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 815, in trace_frame
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     run_tracer()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 797, in run_tracer
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2991, in UNPACK_SEQUENCE
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     val = seq.unpack_var_sequence(self, idxes=range(inst.argval))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/tensor.py", line 593, in unpack_var_sequence
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     assert len(idxes) == length, (
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] AssertionError: Can't unpack a tensor of 512 rows into a tuple of 2 elements.
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] 
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] from user code:
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]    File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 494, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states, residual = layer(positions, hidden_states, residual)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 425, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.mlp(hidden_states)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 230, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     shared_out, fused_out = self.experts(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/ops/fused_moe/fused_moe.py", line 515, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     shared_out, fused_out = AscendFusedMoE.forward(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] 
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
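
Reading the last user-code frames of the trace above: `qwen3_moe.py` and `fused_moe.py` both unpack the result of the experts' forward into `shared_out, fused_out`, while Dynamo reports that the value being unpacked is a single tensor with 512 rows. A plausible interpretation is that the callee returned one tensor where the caller expected a 2-tuple. The sketch below reproduces that failure mode in plain Python and shows one defensive way to normalize the return value. The helper name `as_shared_and_fused` and the list-of-rows stand-in for a tensor are assumptions made for illustration only; they are not part of vLLM or vllm-ascend, and this is not necessarily how the issue was fixed.

```python
from typing import Any, Optional, Tuple


def as_shared_and_fused(out: Any) -> Tuple[Optional[Any], Any]:
    """Hypothetical helper: normalize an MoE forward result to (shared_out, fused_out).

    If the callee already returned a 2-tuple, pass it through unchanged.
    If it returned a single value (no shared experts), pair it with None.
    """
    if isinstance(out, tuple):
        if len(out) != 2:
            raise ValueError(f"expected a 2-tuple, got {len(out)} elements")
        return out
    return None, out


# Stand-in for a [512, hidden_size] tensor: 512 rows of 4 floats each.
hidden = [[0.0] * 4 for _ in range(512)]

# Unpacking the single "tensor" directly fails, because Python tries to
# spread its 512 rows over 2 names. Under torch.compile the same pattern
# surfaces as the AssertionError shown in the log.
unpack_error = None
try:
    shared_out, fused_out = hidden
except ValueError as exc:
    unpack_error = str(exc)

# Normalizing first makes both return shapes safe to unpack.
shared_out, fused_out = as_shared_and_fused(hidden)
```

With this shape, the caller can keep a single unpacking statement and treat `shared_out is None` as "no shared experts", rather than branching on the return type at every call site.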

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.


Signed-off-by: shen-shanshan <467638484@qq.com>
@shen-shanshan
Collaborator Author

Transferred to #6470.


Labels: ci/build, documentation, module:ops, ready, ready-for-test


4 participants