[Feature] Running graph mode on Ascend 310P #3119

Open
yezekun opened this issue Feb 7, 2025 · 7 comments
Comments

@yezekun

yezekun commented Feb 7, 2025

Motivation

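On an Ascend 310P, I set up the following environment: torch=2.3.1+cpu, torch-npu=2.3.1.post4.

The LMDeploy and dlinfer code come from the branches DeepLink-org:support_310P and yao-fengchen:support_310P, respectively.

Running a graph-mode inference task on the Ascend 310P with the following script: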

from lmdeploy import pipeline
from lmdeploy import PytorchEngineConfig, GenerationConfig

if __name__ == "__main__":
    pipe = pipeline("/mnt/data/llm/Qwen1.5-7B-Chat/",
                    backend_config=PytorchEngineConfig(
                        tp=1,
                        device_type="ascend",
                        dtype='float16',
                        eager_mode=False,
                        cache_max_entry_count=0.5))
    # question = ["Shanghai is"]
    question = ["Shanghai is the largest city in China. Please introduce it."]
    response = pipe(question, gen_config=GenerationConfig(max_new_tokens=10))
    print(response)

produces the following problem:

/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.0.alpha001/x86_64-linux/ascend_toolkit_install.info owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/contrib/transfer_to_npu.py:292: ImportWarning:
    *************************************************************************************************************
    The torch.Tensor.cuda and torch.nn.Module.cuda are replaced with torch.Tensor.npu and torch.nn.Module.npu now..
    The torch.cuda.DoubleTensor is replaced with torch.npu.FloatTensor cause the double type is not supported now..
    The backend in torch.distributed.init_process_group set to hccl now..
    The torch.cuda.* and torch.cuda.amp.* are replaced with torch.npu.* and torch.npu.amp.* now..
    The device parameters have been replaced with npu in the function below:
    torch.logspace, torch.randint, torch.hann_window, torch.rand, torch.full_like, torch.ones_like, torch.rand_like, torch.randperm, torch.arange, torch.frombuffer, torch.normal, torch._empty_per_channel_affine_quantized, torch.empty_strided, torch.empty_like, torch.scalar_tensor, torch.tril_indices, torch.bartlett_window, torch.ones, torch.sparse_coo_tensor, torch.randn, torch.kaiser_window, torch.tensor, torch.triu_indices, torch.as_tensor, torch.zeros, torch.randint_like, torch.full, torch.eye, torch._sparse_csr_tensor_unsafe, torch.empty, torch._sparse_coo_tensor_unsafe, torch.blackman_window, torch.zeros_like, torch.range, torch.sparse_csr_tensor, torch.randn_like, torch.from_file, torch._cudnn_init_dropout_state, torch._empty_affine_quantized, torch.linspace, torch.hamming_window, torch.empty_quantized, torch._pin_memory, torch.autocast, torch.load, torch.Generator, torch.set_default_device, torch.Tensor.new_empty, torch.Tensor.new_empty_strided, torch.Tensor.new_full, torch.Tensor.new_ones, torch.Tensor.new_tensor, torch.Tensor.new_zeros, torch.Tensor.to, torch.nn.Module.to, torch.nn.Module.to_empty
    *************************************************************************************************************

  warnings.warn(msg, ImportWarning)
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/contrib/transfer_to_npu.py:247: RuntimeWarning: torch.jit.script and torch.jit.script_method will be disabled by transfer_to_npu, which currently does not support them, if you need to enable them, please do not use transfer_to_npu.
  warnings.warn(msg, RuntimeWarning)
2025-02-07 22:25:43,668 - lmdeploy - WARNING - transformers.py:22 - LMDeploy requires transformers version: [4.33.0 ~ 4.46.1], but found version: 4.48.0
Loading weights from safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.21it/s]
/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/dlinfer/ascend/graph_runner.py:51: RuntimeWarning:

************************************************************
  Graph mode is an experimental feature. We currently
  support both dense and Mixture of Experts (MoE) models
  with bf16 and fp16 data types.
  If graph mode does not function correctly with your model,
  please consider using eager mode as an alternative.
************************************************************

  warnings.warn(
2025-02-07 22:26:04,020 - lmdeploy - WARNING - async_engine.py:625 - GenerationConfig: GenerationConfig(n=1, max_new_tokens=50, do_sample=False, top_p=1.0, top_k=50, min_p=0.0, temperature=0.8, repetition_penalty=1.0, ignore_eos=False, random_seed=None, stop_words=None, bad_words=None, stop_token_ids=[151645], bad_token_ids=None, min_new_tokens=None, skip_special_tokens=True, spaces_between_special_tokens=True, logprobs=None, response_format=None, logits_processors=None, output_logits=None, output_last_hidden_state=None)
2025-02-07 22:26:04,020 - lmdeploy - WARNING - async_engine.py:626 - Since v0.6.0, lmdeploy add `do_sample` in GenerationConfig. It defaults to False, meaning greedy decoding. Please set `do_sample=True` if sampling  decoding is needed
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/storage.py:38: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  if self.device.type != 'cpu':
2025-02-07 22:26:12,933 - lmdeploy - ERROR - engine.py:907 - Task <MainLoopBackground> failed
Traceback (most recent call last):
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 902, in __task_callback
    task.result()
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 860, in _async_loop_background
    await self._async_step_background(
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 729, in _async_step_background
    output = await self._async_model_forward(inputs,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/utils.py", line 234, in __tmp
    return (await func(*args, **kwargs))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 627, in _async_model_forward
    ret = await __forward(inputs)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 604, in __forward
    return await self.model_agent.async_forward(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 256, in async_forward
    output = self._forward_impl(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 239, in _forward_impl
    output = model_forward(
             ^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 151, in model_forward
    output = model(**input_dict)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/graph_runner.py", line 24, in __call__
    return self.model(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
           ^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 535, in compile_inner
    out_code = transform_code_object(code, transform)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1036, in transform_code_object
    transformations(instructions, code_options)
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 165, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 500, in transform
    tracer.run()
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2149, in run
    super().run()
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
        ^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2268, in RETURN_VALUE
    self.output.compile_subgraph(
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 981, in compile_subgraph
    self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root)
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1178, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1251, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1232, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/__init__.py", line 1770, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/__init__.py", line 7, in atbgraph
    return compile_fx(gm, fake_input_tensor, "atbgraph")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/dynamo_bridge/compile_fx.py", line 103, in compile_fx
    return compile_fx_210(model_, example_inputs_, backend, inner_compile)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/dynamo_bridge/compile_fx.py", line 255, in compile_fx_210
    return aot_autograd(
           ^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/backends/common.py", line 58, in compiler_fn
    cg = aot_module_simplified(gm, example_inputs, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 903, in aot_module_simplified
    compiled_fn = create_aot_dispatcher_function(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 628, in create_aot_dispatcher_function
    compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 443, in aot_wrapper_dedupe
    return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 648, in aot_wrapper_synthetic_base
    return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 119, in aot_dispatch_base
    compiled_fw = compiler(fw_module, updated_flat_args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/dynamo_bridge/compile_fx.py", line 217, in fw_compiler_base
    return inner_compile(
           ^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/dynamo_bridge/compile_fx.py", line 81, in compile_fx_inner
    gt = GraphTransformer(gm, backend)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/dynamo_bridge/graph.py", line 53, in __init__
    from dlinfer.graph.dicp.vendor.AtbGraph.codegen.atb import AtbCodegen
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/codegen/atb.py", line 8, in <module>
    from dlinfer.graph.dicp.vendor.AtbGraph.codegen.atb_graph import Graph, parse_graph
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/codegen/atb_graph.py", line 7, in <module>
    from dlinfer.graph.dicp.vendor.AtbGraph.codegen import atb_infer_param as infer_param
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/codegen/atb_infer_param.py", line 138, in <module>
    @dataclass
     ^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/dataclasses.py", line 1232, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/dataclasses.py", line 1222, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
torch._dynamo.exc.BackendCompilerFailed: backend='atbgraph' raised:
ValueError: mutable default <class 'dlinfer.graph.dicp.vendor.AtbGraph.codegen.atb_infer_param.NormParam'> for field normParam is not allowed: use default_factory

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

2025-02-07 22:26:12,937 - lmdeploy - ERROR - async_engine.py:773 - session 0 finished, reason "error"
2025-02-07 22:26:12,937 - lmdeploy - ERROR - async_engine.py:773 - session 1 finished, reason "error"
2025-02-07 22:26:12,938 - lmdeploy - ERROR - async_engine.py:773 - session 2 finished, reason "error"
[Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=0), Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=1), Response(text='internal error happened', generate_token_len=0, input_token_len=23, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=2)]

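The cause is that a field in a dataclass definition uses a mutable object as its default value, which Python's dataclass does not allow. Under the dataclass rules, when a field's default is a mutable object (such as a list, a dict, or an instance of a custom class), the default must be produced with default_factory instead of being assigned directly. I changed the corresponding code as shown in the image below and ran source /usr/local/Ascend/nnal/atb/set_env.sh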

Image
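
For readers who cannot see the screenshot, here is a minimal sketch of this kind of change. It is not the actual dlinfer patch. The class name NormParam and the field name normParam are taken from the traceback; the fields inside NormParam and the enclosing class name RmsNormParam are hypothetical placeholders. Since Python 3.11, dataclasses rejects any unhashable default, and an instance of a plain @dataclass is unhashable, which is likely why the error shows up on 3.11 while older interpreters only rejected list, dict, and set defaults.

```python
from dataclasses import dataclass, field


@dataclass
class NormParam:
    # Hypothetical fields for illustration; the real NormParam in
    # dlinfer's atb_infer_param.py defines its own.
    epsilon: float = 1e-5
    layerType: int = 0


@dataclass
class RmsNormParam:  # hypothetical name for the enclosing dataclass
    # Before (raises "ValueError: mutable default ..." on Python 3.11+):
    #     normParam: NormParam = NormParam()
    # After: build a fresh NormParam for every instance.
    normParam: NormParam = field(default_factory=NormParam)


a = RmsNormParam()
b = RmsNormParam()
a.normParam.epsilon = 1e-6  # changing one instance leaves the other untouched
print(a.normParam is b.normParam)
```

Besides satisfying the 3.11 check, default_factory gives each instance its own NormParam, so a change made through one object cannot leak into another.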

After that, the following error occurred:

/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.0.alpha001/x86_64-linux/ascend_toolkit_install.info owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/contrib/transfer_to_npu.py:292: ImportWarning:
    *************************************************************************************************************
    The torch.Tensor.cuda and torch.nn.Module.cuda are replaced with torch.Tensor.npu and torch.nn.Module.npu now..
    The torch.cuda.DoubleTensor is replaced with torch.npu.FloatTensor cause the double type is not supported now..
    The backend in torch.distributed.init_process_group set to hccl now..
    The torch.cuda.* and torch.cuda.amp.* are replaced with torch.npu.* and torch.npu.amp.* now..
    The device parameters have been replaced with npu in the function below:
    torch.logspace, torch.randint, torch.hann_window, torch.rand, torch.full_like, torch.ones_like, torch.rand_like, torch.randperm, torch.arange, torch.frombuffer, torch.normal, torch._empty_per_channel_affine_quantized, torch.empty_strided, torch.empty_like, torch.scalar_tensor, torch.tril_indices, torch.bartlett_window, torch.ones, torch.sparse_coo_tensor, torch.randn, torch.kaiser_window, torch.tensor, torch.triu_indices, torch.as_tensor, torch.zeros, torch.randint_like, torch.full, torch.eye, torch._sparse_csr_tensor_unsafe, torch.empty, torch._sparse_coo_tensor_unsafe, torch.blackman_window, torch.zeros_like, torch.range, torch.sparse_csr_tensor, torch.randn_like, torch.from_file, torch._cudnn_init_dropout_state, torch._empty_affine_quantized, torch.linspace, torch.hamming_window, torch.empty_quantized, torch._pin_memory, torch.autocast, torch.load, torch.Generator, torch.set_default_device, torch.Tensor.new_empty, torch.Tensor.new_empty_strided, torch.Tensor.new_full, torch.Tensor.new_ones, torch.Tensor.new_tensor, torch.Tensor.new_zeros, torch.Tensor.to, torch.nn.Module.to, torch.nn.Module.to_empty
    *************************************************************************************************************

  warnings.warn(msg, ImportWarning)
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/contrib/transfer_to_npu.py:247: RuntimeWarning: torch.jit.script and torch.jit.script_method will be disabled by transfer_to_npu, which currently does not support them, if you need to enable them, please do not use transfer_to_npu.
  warnings.warn(msg, RuntimeWarning)
2025-02-07 23:04:46,154 - lmdeploy - WARNING - transformers.py:22 - LMDeploy requires transformers version: [4.33.0 ~ 4.46.1], but found version: 4.48.0
Loading weights from safetensors: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.28it/s]
/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/dlinfer/ascend/graph_runner.py:51: RuntimeWarning:

************************************************************
  Graph mode is an experimental feature. We currently
  support both dense and Mixture of Experts (MoE) models
  with bf16 and fp16 data types.
  If graph mode does not function correctly with your model,
  please consider using eager mode as an alternative.
************************************************************

  warnings.warn(
2025-02-07 23:05:06,454 - lmdeploy - WARNING - async_engine.py:625 - GenerationConfig: GenerationConfig(n=1, max_new_tokens=50, do_sample=False, top_p=1.0, top_k=50, min_p=0.0, temperature=0.8, repetition_penalty=1.0, ignore_eos=False, random_seed=None, stop_words=None, bad_words=None, stop_token_ids=[151645], bad_token_ids=None, min_new_tokens=None, skip_special_tokens=True, spaces_between_special_tokens=True, logprobs=None, response_format=None, logits_processors=None, output_logits=None, output_last_hidden_state=None)
2025-02-07 23:05:06,454 - lmdeploy - WARNING - async_engine.py:626 - Since v0.6.0, lmdeploy add `do_sample` in GenerationConfig. It defaults to False, meaning greedy decoding. Please set `do_sample=True` if sampling  decoding is needed
/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch_npu/utils/storage.py:38: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  if self.device.type != 'cpu':
mki_log mkdir /home/yzk/atb/
mki_log mkdir /home/yzk/atb/log
[2025-02-07 23:05:28.176] [dicp] [error] [model.cpp:266] execute node[0] fail, error code: 3
[2025-02-07 23:05:28.176] [dicp] [critical] [model.cpp:241] 0 execute node[0] failed, error code: 3
2025-02-07 23:05:28,179 - lmdeploy - ERROR - engine.py:907 - Task <MainLoopBackground> failed
Traceback (most recent call last):
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 902, in __task_callback
    task.result()
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 860, in _async_loop_background
    await self._async_step_background(
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 729, in _async_step_background
    output = await self._async_model_forward(inputs,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/utils.py", line 234, in __tmp
    return (await func(*args, **kwargs))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 627, in _async_model_forward
    ret = await __forward(inputs)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 604, in __forward
    return await self.model_agent.async_forward(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 256, in async_forward
    output = self._forward_impl(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 239, in _forward_impl
    output = model_forward(
             ^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 151, in model_forward
    output = model(**input_dict)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/graph_runner.py", line 24, in __call__
    return self.model(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/models/qwen2.py", line 314, in forward
    def forward(
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_dynamo/external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 917, in forward
    return compiled_fn(full_args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/utils.py", line 89, in g
    return f(*args)
           ^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 106, in runtime_wrapper
    all_outs = call_func_at_runtime_with_args(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/utils.py", line 113, in call_func_at_runtime_with_args
    out = normalize_as_list(f(args))
                            ^^^^^^^
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 152, in rng_functionalization_wrapper
    return compiled_fw(args)
           ^^^^^^^^^^^^^^^^^
  File "/tmp/torchinductor_yzk/y2/cy2jeobsbovrkxb6lf3hmxh6e577xp33j7yz2w4boxravr64yuq6.py", line 78, in call
    kernel_cpp_0(inputs, outputs, param)
  File "/home/yzk/.conda/envs/yzk-lmdeploy/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/codegen/load_and_run.py", line 17, in run
    self.model.execute_out(inputs, outputs, param)
RuntimeError: Fatal error occurred: 0 execute node[0] failed, error code: 3
2025-02-07 23:05:28,182 - lmdeploy - ERROR - async_engine.py:773 - session 0 finished, reason "error"
2025-02-07 23:05:28,182 - lmdeploy - ERROR - async_engine.py:773 - session 1 finished, reason "error"
2025-02-07 23:05:28,182 - lmdeploy - ERROR - async_engine.py:773 - session 2 finished, reason "error"
[Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=0), Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=1), Response(text='internal error happened', generate_token_len=0, input_token_len=23, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=2)]

This error code 3 is the same as the last error reported in issue #2745. Is there currently a plan to support graph mode on the 310P?

Related resources

No response

Additional context

No response

@yao-fengchen
Collaborator

Support for the 310P is still in progress. Which CANN version are you using?

@jinminxi104
Collaborator

jinminxi104 commented Feb 8, 2025

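In graph mode we have only tested Python 3.10/3.9/3.8. Because the Python version affects graph capture in graph mode, please use 3.10 for now.
We will fix the dataclass issue you raised.
We will try to reproduce the other issues on our side.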
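
As a minimal sketch of the advice above, a script could check the interpreter version before requesting graph mode and fall back to eager mode otherwise. The names `TESTED_GRAPH_MODE_PYTHONS`, `graph_mode_tested`, and `choose_eager_mode` are hypothetical helpers for illustration, not part of lmdeploy or dlinfer; the version set is taken from the comment above.

```python
import sys

# Python versions on which graph mode has been tested, per the comment above.
TESTED_GRAPH_MODE_PYTHONS = {(3, 8), (3, 9), (3, 10)}


def graph_mode_tested(version_info=sys.version_info) -> bool:
    """Return True if the interpreter's major.minor is in the tested set."""
    return (version_info[0], version_info[1]) in TESTED_GRAPH_MODE_PYTHONS


def choose_eager_mode(version_info=sys.version_info) -> bool:
    """Return the value to pass as eager_mode: True when graph mode is untested."""
    return not graph_mode_tested(version_info)
```

The result of `choose_eager_mode()` could then be passed as `eager_mode=` to `PytorchEngineConfig` in the script from the issue description.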

@yezekun
Author

yezekun commented Feb 8, 2025

Support for the 310P is still in progress. Which CANN version are you using?

I am currently using CANN 8.0.0.alpha001.

@yezekun
Author

yezekun commented Feb 8, 2025

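In graph mode we have only tested Python 3.10/3.9/3.8. Because the Python version affects graph capture in graph mode, please use 3.10 for now. We will fix the dataclass issue you raised. We will try to reproduce the other issues on our side.

OK, thank you very much.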

@yao-fengchen
Collaborator

CANN 8.0.0.alpha001 has not been tested yet; we are currently using 8.0.RC3.beta1. Do you still see the problem after switching to Python 3.10?
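
To report the installed CANN version unambiguously, one option is to read it from the toolkit's `ascend_toolkit_install.info` file, whose path appears in the warnings in the log. The sketch below assumes that file consists of `key=value` lines including a `version` key; that format is an assumption and should be checked against the actual file. `parse_install_info` and `cann_version` are hypothetical helper names.

```python
from pathlib import Path


def parse_install_info(text: str) -> dict:
    """Parse key=value lines (assumed file format) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip()
    return info


def cann_version(path: Path):
    """Return the 'version' field, or None if the file or field is missing."""
    try:
        return parse_install_info(Path(path).read_text()).get("version")
    except OSError:
        return None
```

For the environment in this issue, the path from the log would be `/usr/local/Ascend/ascend-toolkit/8.0.0.alpha001/x86_64-linux/ascend_toolkit_install.info`.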

@yezekun
Author

yezekun commented Feb 10, 2025

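CANN 8.0.0.alpha001 has not been tested yet; we are currently using 8.0.RC3.beta1. Do you still see the problem after switching to Python 3.10?

Yes, the problem still exists. The conda environment is as follows: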

(yzk-lmdeploy-3.10) [yzk@devserver-2efb lmdeploy_support_310P]$ python --version
Python 3.10.16
(yzk-lmdeploy-3.10) [yzk@devserver-2efb lmdeploy_support_310P]$ pip list
Package                   Version     Editable project location
------------------------- ----------- -------------------------------
absl-py                   2.1.0
accelerate                1.3.0
addict                    2.4.0
aiohappyeyeballs          2.4.6
aiohttp                   3.11.12
aiosignal                 1.3.2
annotated-types           0.7.0
anyio                     4.8.0
ascendebug                0.1.0
async-timeout             5.0.1
attrs                     25.1.0
auto_tune                 0.1.0
certifi                   2025.1.31
charset-normalizer        3.4.1
click                     8.1.8
cloudpickle               3.1.1
cmake                     3.31.4
dataflow                  0.0.1
datasets                  3.2.0
decorator                 5.1.1
dill                      0.3.8
diskcache                 5.6.3
distro                    1.9.0
dlinfer-ascend            0.1.5       /home/yzk/dlinfer_support_310P
einops                    0.8.1
exceptiongroup            1.2.2
fastapi                   0.115.8
filelock                  3.13.1
fire                      0.7.0
frozenlist                1.5.0
fsspec                    2024.6.1
h11                       0.14.0
h5py                      3.12.1
hccl                      0.1.0
hccl_parser               0.1
httpcore                  1.0.7
httpx                     0.28.1
huggingface-hub           0.28.1
idna                      3.10
interegular               0.3.3
Jinja2                    3.1.4
jiter                     0.8.2
jsonschema                4.23.0
jsonschema-specifications 2024.10.1
lark                      1.2.2
llm_datadist              0.0.1
llvmlite                  0.44.0
lmdeploy                  0.7.0       /home/yzk/lmdeploy_support_310P
markdown-it-py            3.0.0
MarkupSafe                2.1.5
mdurl                     0.1.2
ml_dtypes                 0.5.1
mmengine-lite             0.10.6
mpmath                    1.3.0
msadvisor                 1.0.0
msobjdump                 0.1.0
multidict                 6.1.0
multiprocess              0.70.16
nest-asyncio              1.6.0
networkx                  3.3
ninja                     1.11.1.3
numba                     0.61.0
numpy                     1.26.4
nvidia-ml-py              12.570.86
op_compile_tool           0.1.0
op_gen                    0.1
op_test_frame             0.1
opc_tool                  0.1.0
openai                    1.61.1
outlines                  0.0.46
packaging                 24.2
pandas                    2.2.3
peft                      0.11.1
pillow                    11.0.0
pip                       25.0.1
platformdirs              4.3.6
propcache                 0.2.1
protobuf                  5.29.3
psutil                    6.1.1
pyairports                2.1.1
pyarrow                   19.0.0
pycountry                 24.6.1
pydantic                  2.10.6
pydantic_core             2.27.2
Pygments                  2.19.1
pynvml                    12.0.0
python-dateutil           2.9.0.post0
pytz                      2025.1
PyYAML                    6.0.2
referencing               0.36.2
regex                     2024.11.6
requests                  2.32.3
rich                      13.9.4
rpds-py                   0.22.3
safetensors               0.5.2
schedule_search           0.0.1
scikit-build              0.18.1
scipy                     1.15.1
sentencepiece             0.2.0
setuptools                69.5.1
shortuuid                 1.0.13
show_kernel_debug_data    0.1.0
six                       1.17.0
sniffio                   1.3.1
starlette                 0.45.3
sympy                     1.13.1
te                        0.4.0
termcolor                 2.5.0
tiktoken                  0.8.0
tokenizers                0.21.0
tomli                     2.2.1
torch                     2.3.1+cpu
torch-npu                 2.3.1.post4
torchvision               0.18.1+cpu
tornado                   6.4.2
tqdm                      4.67.1
transformers              4.48.3
typing_extensions         4.12.2
tzdata                    2025.1
urllib3                   2.3.0
uvicorn                   0.34.0
wheel                     0.45.1
xxhash                    3.5.0
yapf                      0.43.0
yarl                      1.18.3

The error output is as follows:

(yzk-lmdeploy-3.10) [yzk@devserver-2efb lmdeploy_support_310P]$ python qwen1test.py
/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch_npu/utils/collect_env.py:59: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.0.alpha001/x86_64-linux/ascend_toolkit_install.info owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch_npu/contrib/transfer_to_npu.py:292: ImportWarning:
    *************************************************************************************************************
    The torch.Tensor.cuda and torch.nn.Module.cuda are replaced with torch.Tensor.npu and torch.nn.Module.npu now..
    The torch.cuda.DoubleTensor is replaced with torch.npu.FloatTensor cause the double type is not supported now..
    The backend in torch.distributed.init_process_group set to hccl now..
    The torch.cuda.* and torch.cuda.amp.* are replaced with torch.npu.* and torch.npu.amp.* now..
    The device parameters have been replaced with npu in the function below:
    torch.logspace, torch.randint, torch.hann_window, torch.rand, torch.full_like, torch.ones_like, torch.rand_like, torch.randperm, torch.arange, torch.frombuffer, torch.normal, torch._empty_per_channel_affine_quantized, torch.empty_strided, torch.empty_like, torch.scalar_tensor, torch.tril_indices, torch.bartlett_window, torch.ones, torch.sparse_coo_tensor, torch.randn, torch.kaiser_window, torch.tensor, torch.triu_indices, torch.as_tensor, torch.zeros, torch.randint_like, torch.full, torch.eye, torch._sparse_csr_tensor_unsafe, torch.empty, torch._sparse_coo_tensor_unsafe, torch.blackman_window, torch.zeros_like, torch.range, torch.sparse_csr_tensor, torch.randn_like, torch.from_file, torch._cudnn_init_dropout_state, torch._empty_affine_quantized, torch.linspace, torch.hamming_window, torch.empty_quantized, torch._pin_memory, torch.autocast, torch.load, torch.Generator, torch.set_default_device, torch.Tensor.new_empty, torch.Tensor.new_empty_strided, torch.Tensor.new_full, torch.Tensor.new_ones, torch.Tensor.new_tensor, torch.Tensor.new_zeros, torch.Tensor.to, torch.nn.Module.to, torch.nn.Module.to_empty
    *************************************************************************************************************

  warnings.warn(msg, ImportWarning)
/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch_npu/contrib/transfer_to_npu.py:247: RuntimeWarning: torch.jit.script and torch.jit.script_method will be disabled by transfer_to_npu, which currently does not support them, if you need to enable them, please do not use transfer_to_npu.
  warnings.warn(msg, RuntimeWarning)
/usr/local/Ascend/ascend-toolkit/8.0.0.alpha001/python/site-packages/tbe/tvm/contrib/ccec.py:863: DeprecationWarning: invalid escape sequence '\L'
  if not dirpath.find("AppData\Local\Temp"):
/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/dsl/classifier/transdata/transdata_classifier.py:222: DeprecationWarning: invalid escape sequence '\B'
  """
/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/dsl/unify_schedule/vector/transdata/common/graph/transdata_graph_info.py:143: DeprecationWarning: invalid escape sequence '\c'
  """
2025-02-10 18:09:59,331 - lmdeploy - WARNING - transformers.py:22 - LMDeploy requires transformers version: [4.33.0 ~ 4.46.1], but found version: 4.48.3
Loading weights from safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.33it/s]
/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/dlinfer/ascend/graph_runner.py:51: RuntimeWarning:

************************************************************
  Graph mode is an experimental feature. We currently
  support both dense and Mixture of Experts (MoE) models
  with bf16 and fp16 data types.
  If graph mode does not function correctly with your model,
  please consider using eager mode as an alternative.
************************************************************


  warnings.warn(
2025-02-10 18:10:19,293 - lmdeploy - WARNING - async_engine.py:625 - GenerationConfig: GenerationConfig(n=1, max_new_tokens=50, do_sample=False, top_p=1.0, top_k=50, min_p=0.0, temperature=0.8, repetition_penalty=1.0, ignore_eos=False, random_seed=None, stop_words=None, bad_words=None, stop_token_ids=[151645], bad_token_ids=None, min_new_tokens=None, skip_special_tokens=True, spaces_between_special_tokens=True, logprobs=None, response_format=None, logits_processors=None, output_logits=None, output_last_hidden_state=None)
2025-02-10 18:10:19,293 - lmdeploy - WARNING - async_engine.py:626 - Since v0.6.0, lmdeploy add `do_sample` in GenerationConfig. It defaults to False, meaning greedy decoding. Please set `do_sample=True` if sampling  decoding is needed
/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch_npu/utils/storage.py:38: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  if self.device.type != 'cpu':
mki_log log dir:/home/yzk/atb/log exist
[2025-02-10 18:10:39.618] [dicp] [error] [model.cpp:266] execute node[0] fail, error code: 3
[2025-02-10 18:10:39.618] [dicp] [critical] [model.cpp:241] 0 execute node[0] failed, error code: 3
2025-02-10 18:10:39,619 - lmdeploy - ERROR - engine.py:907 - Task <MainLoopBackground> failed
Traceback (most recent call last):
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 902, in __task_callback
    task.result()
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 860, in _async_loop_background
    await self._async_step_background(
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 729, in _async_step_background
    output = await self._async_model_forward(inputs,
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/utils.py", line 234, in __tmp
    return (await func(*args, **kwargs))
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 627, in _async_model_forward
    ret = await __forward(inputs)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/engine.py", line 604, in __forward
    return await self.model_agent.async_forward(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 256, in async_forward
    output = self._forward_impl(inputs, swap_in_map=swap_in_map, swap_out_map=swap_out_map)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 239, in _forward_impl
    output = model_forward(
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/engine/model_agent.py", line 151, in model_forward
    output = model(**input_dict)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/backends/graph_runner.py", line 24, in __call__
    return self.model(**kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yzk/lmdeploy_support_310P/lmdeploy/pytorch/models/qwen2.py", line 314, in forward
    def forward(
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 917, in forward
    return compiled_fn(full_args)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/utils.py", line 89, in g
    return f(*args)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 106, in runtime_wrapper
    all_outs = call_func_at_runtime_with_args(
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/utils.py", line 113, in call_func_at_runtime_with_args
    out = normalize_as_list(f(args))
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 152, in rng_functionalization_wrapper
    return compiled_fw(args)
  File "/tmp/torchinductor_yzk/y6/cy6peg5owyvro5eolcthdxxtc4exus54y5rlzpul4r2ra73sxchn.py", line 78, in call
    kernel_cpp_0(inputs, outputs, param)
  File "/home/yzk/.conda/envs/yzk-lmdeploy-3.10/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/yzk/dlinfer_support_310P/dlinfer/graph/dicp/vendor/AtbGraph/codegen/load_and_run.py", line 17, in run
    self.model.execute_out(inputs, outputs, param)
RuntimeError: Fatal error occurred: 0 execute node[0] failed, error code: 3
2025-02-10 18:10:39,620 - lmdeploy - ERROR - async_engine.py:773 - session 0 finished, reason "error"
2025-02-10 18:10:39,620 - lmdeploy - ERROR - async_engine.py:773 - session 1 finished, reason "error"
2025-02-10 18:10:39,620 - lmdeploy - ERROR - async_engine.py:773 - session 2 finished, reason "error"
[Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=0), Response(text='internal error happened', generate_token_len=0, input_token_len=22, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=1), Response(text='internal error happened', generate_token_len=0, input_token_len=23, finish_reason='error', token_ids=[], logprobs=None, logits=None, last_hidden_state=None, index=2)]
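
Because the pipeline returns `Response` objects instead of raising when the engine fails, a caller may want to detect the failure explicitly. The following is a small sketch that relies only on the `finish_reason` and `index` fields visible in the output above; `failed_responses` is a hypothetical helper, and `SimpleNamespace` objects stand in for real responses so the snippet runs without an NPU.

```python
from types import SimpleNamespace


def failed_responses(responses):
    """Return the responses whose finish_reason is 'error'."""
    return [r for r in responses if getattr(r, "finish_reason", None) == "error"]


# Stand-ins that mimic the fields printed in the log; real code would pass
# the list returned by pipe(...).
demo = [
    SimpleNamespace(index=0, text="internal error happened", finish_reason="error"),
    SimpleNamespace(index=1, text="Shanghai is ...", finish_reason="length"),
]
bad = failed_responses(demo)
print([r.index for r in bad])
```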

@yao-fengchen
Collaborator

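We could not reproduce this issue on our side. You can enable logging with the environment variables below to investigate.
Also, for MHA models on the 310P we currently recommend using eager_mode=True.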

export ASDOPS_LOG_TO_STDOUT=1
export ATB_LOG_TO_STDOUT=1
export ATB_LOG_LEVEL=INFO
export DICP_LOG_LEVEL=INFO
export ASCEND_GLOBAL_LOG_LEVEL=0
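
The two suggestions above can also be applied from inside the test script. The sketch below sets the same logging variables through `os.environ` and rebuilds the engine options from the original script with `eager_mode=True`. It assumes the variables must be set before `lmdeploy` and its backends are imported, which is why the import is deferred into `main`; `enable_debug_logs`, `engine_kwargs`, and `main` are hypothetical names.

```python
import os

# Logging variables quoted from the comment above.
DEBUG_ENV = {
    "ASDOPS_LOG_TO_STDOUT": "1",
    "ATB_LOG_TO_STDOUT": "1",
    "ATB_LOG_LEVEL": "INFO",
    "DICP_LOG_LEVEL": "INFO",
    "ASCEND_GLOBAL_LOG_LEVEL": "0",
}


def enable_debug_logs(env=os.environ):
    """Set the logging variables in the given mapping and return it."""
    env.update(DEBUG_ENV)
    return env


def engine_kwargs(eager_mode: bool = True) -> dict:
    """Engine options from the original script, with eager_mode switchable."""
    return dict(tp=1, device_type="ascend", dtype="float16",
                eager_mode=eager_mode, cache_max_entry_count=0.5)


def main():
    enable_debug_logs()
    # Imported late so that the environment is configured first (assumption:
    # the backends read these variables when they are loaded).
    from lmdeploy import GenerationConfig, PytorchEngineConfig, pipeline

    pipe = pipeline("/mnt/data/llm/Qwen1.5-7B-Chat/",
                    backend_config=PytorchEngineConfig(**engine_kwargs(eager_mode=True)))
    question = ["Shanghai is the largest city in China. Please introduce it."]
    print(pipe(question, gen_config=GenerationConfig(max_new_tokens=10)))
```

Calling `main()` on the 310P machine runs the same prompt in eager mode with verbose logs; passing `eager_mode=False` to `engine_kwargs` instead reproduces the graph-mode run with logging enabled.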
