
[Bug]: Running llama 7B with PaddleNLP on Paddle 3.0.0b1 fails: AttributeError: module 'paddle.base.libpaddle.eager.ops.legacy' has no attribute 'c_identity'. Did you mean: 'npu_identity'? #9211

Open
shang-mt opened this issue Sep 27, 2024 · 1 comment
Labels: bug (Something isn't working)

Comments

@shang-mt

Software environment

- paddlepaddle:
- paddlepaddle-gpu: 3.0.0b1
- paddlenlp: https://github.com/ZHUI/PaddleNLP/tree/sci/benchmark

Duplicate issues

  • I have searched the existing issues

Error description

Traceback (most recent call last):
  File "/workspace/PaddleNLP/llm/run_pretrain.py", line 597, in <module>
    main()
  File "/workspace/PaddleNLP/llm/run_pretrain.py", line 575, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/workspace/PaddleNLP/paddlenlp/trainer/trainer.py", line 926, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/workspace/PaddleNLP/paddlenlp/trainer/trainer.py", line 1950, in training_step
    return self.training_pipeline_step(model, inputs)
  File "/workspace/PaddleNLP/paddlenlp/trainer/trainer.py", line 2019, in training_pipeline_step
    loss = model.forward_backward_pipeline(inputs, self.scaler if self.do_grad_scaling else None)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/pipeline_parallel.py", line 543, in forward_backward_pipeline
    output_tensor = self._forward_step(input_tensor, micro_dataset)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/pipeline_parallel.py", line 810, in _forward_step
    output_tensor = self._layers.forward(input_tensor, chunk_id=chunk_id)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 809, in forward
    input = self.forward_function(0, len(self.run_function))(input)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 785, in execute_func
    x = layer(x)
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1426, in __call__
    return self.forward(*inputs, **kwargs)
  File "/workspace/PaddleNLP/paddlenlp/transformers/llama/modeling.py", line 1540, in forward
    logits = parallel_matmul(hidden_states, self.weight, tensor_parallel_output=tensor_parallel_output)
  File "/workspace/PaddleNLP/paddlenlp/transformers/llama/modeling.py", line 166, in parallel_matmul
    input_parallel = paddle.distributed.collective._c_identity(x, group=model_parallel_group)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/layers/mpu/mp_ops.py", line 100, in _c_identity
    return c_identity_eager.apply(tensor, group, skip_c_identity_dynamic)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/layers/mpu/mp_ops.py", line 34, in forward
    return _legacy_C_ops.c_identity(
AttributeError: module 'paddle.base.libpaddle.eager.ops.legacy' has no attribute 'c_identity'. Did you mean: 'npu_identity'?
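A minimal check, as a sketch assuming paddlepaddle-gpu 3.0.0b1 in eager mode, that reproduces the missing attribute without launching a distributed run:

# Sketch, assuming paddlepaddle-gpu 3.0.0b1: the legacy eager op module no
# longer exposes `c_identity`, which parallel_matmul still reaches through
# paddle.distributed.collective._c_identity (see the traceback above).
from paddle import _legacy_C_ops

print(hasattr(_legacy_C_ops, "c_identity"))    # expected: False on 3.0.0b1
print(hasattr(_legacy_C_ops, "npu_identity"))  # expected: True, hence the interpreter's hint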

Steps to reproduce & code

bash run_dist.sh  # launches llm/run_pretrain.py (per the traceback) with the JSON config below

{
"model_name_or_path": "facebook/llama-7b",
"tokenizer_name_or_path": "facebook/llama-7b",
"input_dir": "/workspace",
"output_dir": "/root/llama-7b",
"per_device_train_batch_size": 2,
"gradient_accumulation_steps": 256,
"per_device_eval_batch_size": 64,
"tensor_parallel_degree": 2,
"pipeline_parallel_degree": 2,
"pipeline_parallel_config": "disable_partial_send_recv",
"sharding_parallel_degree": -1,
"virtual_pp_degree": 1,
"sharding": "stage1",
"sequence_parallel": 1,
"adam_beta1": 0.9,
"adam_beta2": 0.95,
"use_flash_attention": true,
"use_fused_rms_norm": true,
"use_fused_rope": true,
"max_seq_length": 2048,
"learning_rate": 1e-04,
"initializer_range": 0.002,
"min_learning_rate": 1e-05,
"warmup_steps": 2000,
"logging_steps": 1,
"max_steps": 200000,
"save_steps": 200000,
"eval_steps": 2000,
"weight_decay": 0.1,
"max_grad_norm": 1.0,
"amp_master_grad": 1,
"fp16": true,
"fp16_opt_level": "O2",
"dataloader_num_workers": 1,
"continue_training": 0,
"do_train": true,
"do_eval": true,
"do_predict": true,
"disable_tqdm": true,
"recompute": false,
"distributed_dataloader": 0,
"recompute_granularity": "full",
"save_total_limit": 2,
"eval_accumulation_steps": 16
}
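Note that, as far as I can tell, this code path is only reached because "tensor_parallel_degree" is 2 here; with a degree of 1, parallel_matmul does not route through _c_identity. A hypothetical fail-fast guard (not part of PaddleNLP, just a sketch) that surfaces the problem before the first pipeline forward step:

# Hypothetical pre-flight check, not part of PaddleNLP: abort with a clear
# message instead of failing deep inside the first forward pass.
from paddle import _legacy_C_ops

tensor_parallel_degree = 2  # taken from the config above

if tensor_parallel_degree > 1 and not hasattr(_legacy_C_ops, "c_identity"):
    raise RuntimeError(
        "This paddle build lacks the legacy 'c_identity' op; "
        "parallel_matmul will raise AttributeError under tensor parallelism."
    )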
