[Main2Main] Upgrade vllm commit to `v0.15.0rc0` by shen-shanshan · Pull Request #6304 · vllm-project/vllm-ascend

shen-shanshan · 2026-01-27T07:13:26Z

What this PR does / why we need it?

Fix `TypeError: MMEncoderAttention.__init__() got an unexpected keyword argument 'multimodal_config'`, caused by vllm#31972 ([Models]: Make Multimodal config implicit in ViT implementation).
Fix `_shared_experts: 'NoneType' object is not callable`, caused by vllm#32082 ([Models] Add SharedFusedMoE support to Qwen3MoE); fixed in #6335 ([Main2Main][BugFix] Add shared_experts check for AscendSharedFusedMoE).
Fix `ReshapeAndCacheOperation setup failed!`, caused by vllm#25954 ([Performance] Split FlashAttn attention and cache update), by registering the `unified_kv_cache_update` custom op.
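
The `_shared_experts` item above is a call on an attribute that can be `None`. A minimal sketch of that kind of guard, assuming a hypothetical `SharedMoESketch` class; the names and the arithmetic are illustrative only and are not the vllm-ascend implementation:

```python
from typing import Callable, Optional


class SharedMoESketch:
    """Hypothetical stand-in for a fused-MoE layer with optional shared experts."""

    def __init__(self, shared_experts: Optional[Callable[[float], float]] = None):
        # May legitimately be None when the model defines no shared experts.
        self._shared_experts = shared_experts

    def forward(self, hidden: float) -> tuple[Optional[float], float]:
        routed = hidden * 2.0  # placeholder for the routed-expert computation
        # Guard the call: invoking None would raise
        # "TypeError: 'NoneType' object is not callable".
        if self._shared_experts is not None:
            shared = self._shared_experts(hidden)
        else:
            shared = None
        return shared, routed
```

With no shared experts configured, `SharedMoESketch().forward(1.5)` returns `(None, 3.0)` instead of raising.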

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.14.1
vLLM main: vllm-project/vllm@dc917cc

github-actions · 2026-01-27T07:13:43Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request upgrades the vLLM commit to align with v0.15.0rc0, primarily to resolve a TypeError related to the multimodal_config argument in MMEncoderAttention. The changes correctly update the dependency commit in the documentation and remove the now-obsolete parameter from the AscendMMEncoderAttention class. The modifications are appropriate and address the stated issue.
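
The failure mode described in this review can be reproduced in plain Python. The sketch below assumes hypothetical classes (`BaseEncoderAttention`, `StaleSubclass`, `UpdatedSubclass`) that only mimic the shape of the problem; they are not the real `MMEncoderAttention` or `AscendMMEncoderAttention` signatures:

```python
class BaseEncoderAttention:
    """Hypothetical base class after the config keyword was removed upstream."""

    def __init__(self, num_heads: int, head_size: int):
        self.num_heads = num_heads
        self.head_size = head_size


class StaleSubclass(BaseEncoderAttention):
    """Still forwards the removed keyword, so construction fails."""

    def __init__(self, num_heads, head_size, multimodal_config=None):
        super().__init__(num_heads, head_size, multimodal_config=multimodal_config)


class UpdatedSubclass(BaseEncoderAttention):
    """Drops the obsolete parameter to match the base signature."""

    def __init__(self, num_heads, head_size):
        super().__init__(num_heads, head_size)


def construct_error(cls) -> str:
    """Return the TypeError message raised by cls(8, 64), or '' on success."""
    try:
        cls(8, 64)
    except TypeError as exc:
        return str(exc)
    return ""
```

Calling `construct_error(StaleSubclass)` yields a message that mentions the unexpected keyword argument `'multimodal_config'`, while `construct_error(UpdatedSubclass)` returns an empty string.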

wjunLu · 2026-01-27T12:16:21Z

https://github.com/vllm-project/vllm/pull/32082 breaks this CI job: https://github.com/vllm-project/vllm-ascend/actions/runs/21391619327/job/61579739637?pr=6304#step:11:1116

wjunLu · 2026-01-28T03:38:48Z

#6335 fixed the break above.

shen-shanshan · 2026-01-28T03:58:24Z

ReshapeAndCacheOperation setup failed!
Exception raised from OperationSetup at build/third_party/op-plugin/op_plugin/CMakeFiles/op_plugin_atb.dir/compiler_depend.ts:203 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xb0 (0xffffa90fc700 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x68 (0xffffa909a860 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #2: atb::OperationSetup(atb::VariantPack, atb::Operation*, atb::Context*) + 0x278 (0xfffdd98c0498 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so)
frame #3: <unknown function> + 0xb1b74 (0xfffdd98c1b74 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so)
frame #4: <unknown function> + 0x2c77e24 (0xfffdfd047e24 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #5: <unknown function> + 0xa607d0 (0xfffdfae307d0 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #6: <unknown function> + 0xa613ac (0xfffdfae313ac in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #7: <unknown function> + 0xa5f2c8 (0xfffdfae2f2c8 in /root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #8: <unknown function> + 0xda294 (0xffffb7a6a294 in /root/miniconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #9: <unknown function> + 0x80398 (0xffffb7c10398 in /lib/aarch64-linux-gnu/libc.so.6)
frame #10: <unknown function> + 0xe9e9c (0xffffb7c79e9c in /lib/aarch64-linux-gnu/libc.so.6)

(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:72] Dumping input data for V1 LLM engine (v0.15.0rc0) with config: model='/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo', speculative_config=None, tokenizer='/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=448, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=False, enable_return_routed_experts=False, kv_cache_dtype=auto, device_config=npu, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, kv_cache_metrics=False, kv_cache_metrics_sample=0.01, cudagraph_metrics=False, enable_layerwise_nvtx_tracing=False, enable_mfu_metrics=False, enable_mm_processor_stats=False, enable_logging_iteration_details=False), seed=0, served_model_name=/root/.cache/modelscope/hub/models/openai-mirror/whisper-large-v3-turbo, enable_prefix_caching=False, enable_chunked_prefill=False, pooler_config=None, compilation_config={'level': None, 'mode': <CompilationMode.VLLM_COMPILE: 3>, 'debug_dump_path': None, 'cache_dir': '/root/.cache/vllm/torch_compile_cache/7423e1a4bc', 'compile_cache_save_format': 'binary', 'backend': 'vllm_ascend.compilation.compiler_interface.AscendCompiler', 'custom_ops': ['all'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 
'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 'vllm::kda_attention', 'vllm::sparse_attn_indexer', 'vllm::rocm_aiter_sparse_attn_indexer', 'vllm::mla_forward', 'vllm::mla_forward'], 'compile_mm_encoder': False, 'compile_sizes': [], 'compile_ranges_split_points': [2240], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.PIECEWISE: 1>, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4, 8], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'fuse_norm_quant': True, 'fuse_act_quant': True, 'fuse_attn_quant': False, 'eliminate_noops': True, 'enable_sp': False, 'fuse_gemm_comms': False, 'fuse_allreduce_rms': False}, 'max_cudagraph_capture_size': 8, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>, 'evaluate_guards': False, 'assume_32_bit_indexing': True}, 'local_cache_dir': '/root/.cache/vllm/torch_compile_cache/7423e1a4bc/rank_0_0/backbone'}, 
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=0-a72d9ae6,prompt_token_ids_len=4,prefill_token_ids_len=None,mm_features=[MultiModalFeatureSpec(data={'input_features': MultiModalFieldElem(modality='audio', key='input_features', data=tensor([[-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         ...,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]         [-0.5781, -0.5781, -0.5781,  ..., -0.5781, -0.5781, -0.5781]],
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [dump_input.py:79]        dtype=torch.bfloat16), field=MultiModalBatchedField(keep_on_cpu=False))}, modality='audio', identifier='103d3b4774b9507ef036a4103955d46a4d6eb3816a10255bb0810cf7cea178bf', mm_position=PlaceholderRange(offset=0, length=1500, is_embed=None), mm_hash='103d3b4774b9507ef036a4103955d46a4d6eb3816a10255bb0810cf7cea178bf')],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.2, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=10, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1], [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=[],resumed_req_ids=set(),new_token_ids_lens=[],all_token_ids_lens={},new_block_ids=[],num_computed_tokens=[],num_output_tokens=[]), num_scheduled_tokens={0-a72d9ae6: 4}, total_num_scheduled_tokens=4, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={0-a72d9ae6: [0]}, num_common_prefix_blocks=[0, 0], finished_req_ids=[], free_encoder_mm_hashes=[], preempted_req_ids=[], has_structured_output_requests=false, pending_structured_output_tokens=false, num_invalid_spec_tokens=null, kv_connector_metadata=null, ec_connector_metadata=null)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] Traceback (most recent call last):
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 928, in run_engine_core
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     engine_core.run_busy_loop()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 955, in run_busy_loop
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     self._process_engine_step()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 988, in _process_engine_step
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                               ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 490, in step_with_batch_queue
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     exec_model_fut.result()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/concurrent/futures/_base.py", line 449, in result
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.__get_result()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     raise self._exception
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/serial_utils.py", line 461, in run_method
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 365, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.worker.execute_model(scheduler_output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 367, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.model_runner.execute_model(scheduler_output,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1433, in execute_model
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self._generate_process_reqs_hidden_states(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1023, in _generate_process_reqs_hidden_states
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self.model(input_ids=input_ids,
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 897, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     decoder_outputs = self.model(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                       ^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 590, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     decoder_outputs = self.decoder(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                       ^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 396, in __call__
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self.forward(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 561, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = decoder_layer(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 430, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     hidden_states = self.encoder_attn(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                     ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/models/whisper.py", line 300, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     attn_output = self.attn(q, k, v)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                   ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/layer.py", line 415, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     torch.ops.vllm.unified_attention_with_output(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/_ops.py", line 1255, in __call__
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return self._op(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/utils/kv_transfer_utils.py", line 39, in wrapper
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return func(*args, **kwargs)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/attention/layer.py", line 885, in unified_attention_with_output
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     self.impl.forward(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm/vllm/model_executor/layers/attention/cross_attention.py", line 144, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     return super().forward(
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]            ^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 941, in forward
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.forward_impl(query, key, value, kv_cache, attn_metadata, output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 898, in forward_impl
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     output = self.forward_fused_infer_attention(query, key, value, attn_metadata, output)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 767, in forward_fused_infer_attention
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     key, value, block_size, block_table, actual_seq_lengths_kv = self._get_fia_params(key, value, attn_metadata)
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]   File "/vllm-workspace/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=629784) ERROR 01-28 06:38:01 [core.py:937] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.
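
The innermost Python frame in this trace computes `actual_seq_lengths_kv` as a cumulative sum of `attn_metadata.seq_lens`. As a plain-Python illustration of that value only (the helper name and the sample lengths are made up, and this does not reproduce the `ReshapeAndCacheOperation` failure itself):

```python
from itertools import accumulate


def cumulative_seq_lengths(seq_lens: list[int]) -> list[int]:
    """Running totals of per-request lengths, analogous to cumsum(...).tolist()."""
    return list(accumulate(seq_lens))


# Hypothetical per-request KV lengths.
print(cumulative_seq_lengths([4, 1500, 7]))  # [4, 1504, 1511]
```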

wjunLu · 2026-01-28T06:25:45Z

(EngineCore_DP0 pid=42706)   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/attention/attention_v1.py", line 681, in _get_fia_params
(EngineCore_DP0 pid=42706)     actual_seq_lengths_kv = torch.cumsum(attn_metadata.seq_lens, dim=0).tolist()
(EngineCore_DP0 pid=42706)                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=42706) RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is ReshapeCacheOperation.

What's the failed test case?

shen-shanshan · 2026-01-28T06:31:10Z

FAILED tests/e2e/singlecard/test_models.py::test_whisper[openai-mirror/whisper-large-v3-turbo]

shen-shanshan · 2026-01-28T07:12:56Z

The test_whisper failure above is due to vllm-project/vllm@a28b94e.

shen-shanshan · 2026-01-28T09:04:57Z

Failed tests/e2e/singlecard/model_runner_v2/test_basic.py::test_qwen3_dense_eager_mode[True-32-Qwen/Qwen3-0.6B]
Failed tests/e2e/singlecard/model_runner_v2/test_basic.py::test_egale_spec_decoding[True-32-vllm-ascend/EAGLE-LLaMA3.1-Instruct-8B-LLM-Research/Meta-Llama-3.1-8B-Instruct]


(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] EngineCore failed to start.
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] Traceback (most recent call last):
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 926, in run_engine_core
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     super().__init__(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 105, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self._init_executor()
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/executor/uniproc_executor.py", line 47, in _init_executor
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.driver_worker.init_device()
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 326, in init_device
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.worker.init_device()  # type: ignore
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 298, in init_device
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.model_runner = NPUModelRunnerV2(self.vllm_config, self.device)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/v2/model_runner.py", line 49, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     super().__init__(vllm_config, device)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/model_runner.py", line 142, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.req_states = RequestState(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                       ^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/states.py", line 34, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.prefill_token_ids = StagedWriteTensor(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                              ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/buffer_utils.py", line 110, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self._uva_buf = UvaBuffer(size, dtype)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                     ^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/v1/worker/gpu/buffer_utils.py", line 20, in __init__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     self.uva = get_cuda_view_from_cpu_tensor(self.cpu)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/vllm-workspace/vllm/vllm/utils/torch_utils.py", line 659, in get_cuda_view_from_cpu_tensor
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     return torch.ops._C.get_cuda_view_from_cpu_tensor(cpu_tensor)
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]   File "/root/miniconda3/envs/vllm/lib/python3.11/site-packages/torch/_ops.py", line 1365, in __getattr__
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935]     raise AttributeError(
(EngineCore_DP0 pid=656245) ERROR 01-28 08:59:01 [core.py:935] AttributeError: '_OpNamespace' '_C' object has no attribute 'get_cuda_view_from_cpu_tensor'
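
The AttributeError shows that the `_C` op namespace on this build has no `get_cuda_view_from_cpu_tensor` op. One generic way to tolerate a missing op is to look it up with a fallback. The sketch below is an illustration only: `resolve_op`, `copy_fallback`, and `ops_stub` are hypothetical names, a `SimpleNamespace` stands in for the real op namespace, and it is not the fix applied in this PR.

```python
from types import SimpleNamespace
from typing import Callable


def resolve_op(namespace: object, name: str, fallback: Callable) -> Callable:
    """Return namespace.<name> when it exists, otherwise the fallback."""
    return getattr(namespace, name, fallback)


def copy_fallback(cpu_buffer: list) -> list:
    # Hypothetical fallback: a plain copy instead of a device view.
    return list(cpu_buffer)


# Stand-in for an op namespace built without the CUDA-only op (assumption).
ops_stub = SimpleNamespace()
view_fn = resolve_op(ops_stub, "get_cuda_view_from_cpu_tensor", copy_fallback)
```

Here `view_fn` resolves to `copy_fallback` because the stub namespace does not define the op; a namespace that does define it would return its own implementation.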

shen-shanshan · 2026-01-28T09:41:49Z

RuntimeError: Cannot run aclop operators during NPU graph capture. Current working aclop is BatchMatMul. If you need this call to be captured, please try to set torch.npu.config.allow_internal_format = False. If still fail, the operator needs aclnn implementation and please file an issue. Current npuStreamCaptureStatus: npuStreamCaptureStatusActive

shen-shanshan · 2026-01-28T09:46:31Z

FAILED tests/e2e/multicard/2-cards/test_external_launcher.py::test_qwen3_external_launcher_with_sleepmode


Traceback (most recent call last):
  File "/usr/local/python3.11.14/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/python3.11.14/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/__w/vllm-ascend/vllm-ascend/examples/offline_external_launcher.py", line 195, in main
    llm = LLM(
          ^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/entrypoints/llm.py", line 334, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/llm_engine.py", line 172, in from_engine_args
    return cls(
           ^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/llm_engine.py", line 106, in __init__
    self.engine_core = EngineCoreClient.make_client(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core_client.py", line 96, in make_client
    return InprocClient(vllm_config, executor_class, log_stats)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core_client.py", line 269, in __init__
    self.engine_core = EngineCore(*args, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 112, in __init__
    num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 242, in _initialize_kv_caches
    available_gpu_memory = self.model_executor.determine_available_memory()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 180, in determine_available_memory
    memory = super().determine_available_memory()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
    return self.collective_rpc("determine_available_memory")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
    result = run_method(self.driver_worker, method, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/serial_utils.py", line 461, in run_method
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/worker.py", line 320, in determine_available_memory
    assert self.init_npu_memory > free_npu_memory, (
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Error in memory profiling. Initial free memory 42652123136, current free memory 47196426240. This happens when the NPU memory was not properly cleaned up before initializing the vLLM instance.
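
For reference, the check that fires here (`assert self.init_npu_memory > free_npu_memory` in `vllm_ascend/worker/worker.py`) expects free memory to shrink between the initial snapshot and the end of profiling. A minimal sketch of that invariant, using the two values from the message above; the function names `profiling_delta` and `check_profiling` are hypothetical and are not part of vllm-ascend:

```python
GIB = 1024 ** 3


def profiling_delta(init_free: int, current_free: int) -> int:
    """Bytes consumed since the initial snapshot (negative if free memory grew)."""
    return init_free - current_free


def check_profiling(init_free: int, current_free: int) -> int:
    """Mirror the worker's invariant: initial free memory must exceed current free memory."""
    if not init_free > current_free:
        raise AssertionError(
            f"Error in memory profiling. Initial free memory {init_free}, "
            f"current free memory {current_free}."
        )
    return profiling_delta(init_free, current_free)


# Values copied from the AssertionError above.
init_free = 42652123136
current_free = 47196426240

grown = -profiling_delta(init_free, current_free)
print(f"{grown} bytes ({grown / GIB:.2f} GiB) became free after the initial snapshot")
# prints: 4544303104 bytes (4.23 GiB) became free after the initial snapshot
```

So roughly 4.23 GiB was released on the device after `init_npu_memory` was recorded. One plausible reading, consistent with the error text, is that memory held by something outside this worker (for example a previous test process) was still allocated at snapshot time and freed later.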

shen-shanshan · 2026-01-29T03:39:27Z

(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] EngineCore failed to start.
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] Traceback (most recent call last):
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 918, in run_engine_core
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 1268, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     super().__init__(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     super().__init__(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 112, in __init__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/engine/core.py", line 242, in _initialize_kv_caches
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     available_gpu_memory = self.model_executor.determine_available_memory()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self.collective_rpc("determine_available_memory")
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/v1/serial_utils.py", line 461, in run_method
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/worker.py", line 313, in determine_available_memory
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.model_runner.profile_run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 2344, in profile_run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._dummy_run(mc2_tokens_capacity,
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return func(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 2291, in _dummy_run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     outputs = self._model_forward(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1688, in _model_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.model(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self._call_impl(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return forward_call(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 773, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.model(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 561, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     output = TorchCompileWithNoGuardsWrapper.__call__(self, *args, **kwargs)  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/wrapper.py", line 228, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return self._call_with_optional_nvtx_range(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/wrapper.py", line 119, in _call_with_optional_nvtx_range
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return callable_fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 832, in compile_wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1874, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = self._torchdynamo_orig_backend(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 688, in __call__
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     result = _compile(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]              ^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1433, in _compile
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     guarded_code, tracer_output = compile_inner(code, one_graph, hooks)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_utils_internal.py", line 92, in wrapper_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return function(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1117, in compile_inner
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return _compile_inner(code, one_graph, hooks)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1151, in _compile_inner
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     dynamo_output = compile_frame(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1032, in compile_frame
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     bytecode, tracer_output = transform_code_object(code, transform)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1592, in transform_code_object
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer_output = transformations(instructions, code_options)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1004, in transform
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer_output = trace_frame(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]                     ^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 312, in _fn
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return fn(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 815, in trace_frame
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     run_tracer()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 797, in run_tracer
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     tracer.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 904, in wrapper
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inner_fn(self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3411, in CALL
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self._call(inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 3405, in _call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.call_function(fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1266, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return getattr(self.realize(), name)(*args, **kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/nn_module.py", line 1010, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return variables.UserFunctionVariable(fn, source=source).call_function(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 598, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return super().call_function(tx, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 342, in call_function
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1288, in inline_user_function_return
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4112, in inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return tracer.inline_call_()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/compilation/decorators.py", line 508, in patched_inline_call
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     return inline_call(self_)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 4315, in inline_call_
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.run()
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1487, in run
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     while self.step():
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1348, in step
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     self.dispatch_table[inst.opcode](self, inst)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2991, in UNPACK_SEQUENCE
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     val = seq.unpack_var_sequence(self, idxes=range(inst.argval))  # type: ignore[arg-type]
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch/_dynamo/variables/tensor.py", line 593, in unpack_var_sequence
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     assert len(idxes) == length, (
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]            ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] AssertionError: Can't unpack a tensor of 512 rows into a tuple of 2 elements.
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] 
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] from user code:
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]    File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 494, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states, residual = layer(positions, hidden_states, residual)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 425, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     hidden_states = self.mlp(hidden_states)
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm-empty/vllm/model_executor/models/qwen3_moe.py", line 230, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     shared_out, fused_out = self.experts(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]   File "/__w/vllm-ascend/vllm-ascend/vllm_ascend/ops/fused_moe/fused_moe.py", line 515, in forward
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935]     shared_out, fused_out = AscendFusedMoE.forward(
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] 
(EngineCore_DP1 pid=24156) ERROR 01-29 02:04:49 [core.py:935] Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
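The traceback above suggests that the call being unpacked as `shared_out, fused_out = ...` returned a single tensor of 512 rows rather than a two-element tuple, so the unpacking iterated over rows instead. The following is a minimal sketch of that failure mode and one defensive pattern, not the actual vllm-ascend fix. It uses plain Python lists as stand-ins for tensors, and `moe_forward` and `as_shared_and_fused` are hypothetical names introduced only for illustration. In plain Python the naive unpack raises `ValueError`; under Dynamo tracing the same mismatch surfaces as the `AssertionError` shown in the log.

```python
from typing import List, Optional, Tuple, Union

Rows = List[List[float]]  # stand-in for a [num_tokens, hidden_size] tensor


def moe_forward(hidden_states: Rows,
                has_shared_experts: bool) -> Union[Rows, Tuple[Rows, Rows]]:
    """Hypothetical forward whose return type depends on shared experts."""
    fused_out = [list(row) for row in hidden_states]
    if has_shared_experts:
        shared_out = [list(row) for row in hidden_states]
        return shared_out, fused_out
    # Assumption: without shared experts only the fused output is returned.
    return fused_out


def as_shared_and_fused(result) -> Tuple[Optional[Rows], Rows]:
    """Normalize either return shape to a (shared_out, fused_out) pair."""
    if isinstance(result, tuple):
        shared_out, fused_out = result
        return shared_out, fused_out
    return None, result


hidden = [[0.0] * 4 for _ in range(512)]

# Naive unpacking iterates over the 512 rows and fails.
try:
    shared_out, fused_out = moe_forward(hidden, has_shared_experts=False)
    naive_unpack_failed = False
except ValueError:
    naive_unpack_failed = True

# Normalizing first makes the call site safe for both cases.
shared_out, fused_out = as_shared_and_fused(
    moe_forward(hidden, has_shared_experts=False))
```

An alternative design is to make the forward always return a pair, with `None` in the shared slot, so callers never need to check the return type.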

github-actions · 2026-01-29T12:32:29Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2026-01-30T07:57:21Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan · 2026-02-02T06:22:35Z

Transferred to #6470.

shen-shanshan requested review from LCAIZJ, Yikun, realliujiaxu, wangxiyuan, whx-sjtu and zzzzwwjj as code owners January 27, 2026 07:13

github-actions Bot added the documentation (Improvements or additions to documentation), ci/build, and module:ops labels Jan 27, 2026

gemini-code-assist Bot reviewed Jan 27, 2026

wangxiyuan reviewed Jan 27, 2026

Comment thread .github/workflows/pr_test_full.yaml Outdated

shen-shanshan requested a review from weijinqian0 as a code owner January 27, 2026 08:49

shen-shanshan added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 27, 2026

gcanlin reviewed Jan 27, 2026

Comment thread docs/source/community/versioning_policy.md Outdated

shen-shanshan force-pushed the main2main branch from 5119505 to 9861090 on January 29, 2026 01:42

github-actions Bot added the merge-conflicts label Jan 29, 2026

shen-shanshan force-pushed the main2main branch from 567298c to 462de7b on January 30, 2026 03:38

github-actions Bot removed the merge-conflicts label Jan 30, 2026

wangxiyuan mentioned this pull request Jan 30, 2026

[Question]: When will support for vLLM v0.15.0rc1 or newer versions be added? #6390

Open

github-actions Bot added the merge-conflicts label Jan 30, 2026

shen-shanshan added 7 commits February 2, 2026 01:51

main2main · 570b353 · Signed-off-by: shen-shanshan <467638484@qq.com>

update · 39a37c5 · Signed-off-by: shen-shanshan <467638484@qq.com>

update · 7d71da6 · Signed-off-by: shen-shanshan <467638484@qq.com>

update · 46d18d3 · Signed-off-by: shen-shanshan <467638484@qq.com>

split attn forward and kv_cache update · 516c170 · Signed-off-by: shen-shanshan <467638484@qq.com>

fix none shared experts case · b110014 · Signed-off-by: shen-shanshan <467638484@qq.com>

fix · 850c6d6 · Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan force-pushed the main2main branch from 462de7b to 850c6d6 on February 2, 2026 03:04

github-actions Bot removed the merge-conflicts label Feb 2, 2026

fix lint · 1fe1c54 · Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan closed this Feb 2, 2026

4 participants
