1. Fine-tuning: a "b16" (precision) error shows up right at the start of training. Simply edit the config YAML and add
fp16: false
and the run goes through painlessly.
2. Merging the LoRA adapter for chat inference is a bit more troublesome. Running
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
produces the following warnings and error:
/Users/mapix/miniconda/envs/llama-factory/lib/python3.10/site-packages/torch/nn/modules/module.py:2026: UserWarning: for base_model.model.model.layers.31.mlp.gate_proj.lora_A.default.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
(the same UserWarning is emitted for the lora_A/lora_B weights of gate_proj, up_proj and down_proj as well)
Traceback (most recent call last):
File "/Users/mapix/miniconda/envs/llama-factory/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/cli.py", line 81, in main
run_chat()
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 127, in run_chat
chat_model = ChatModel()
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 43, in __init__
self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 58, in __init__
self.model = load_model(
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/model/loader.py", line 160, in load_model
model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/model/adapter.py", line 301, in init_adapter
model = _setup_lora_tuning(
File "/Users/mapix/workspace/LLaMA-Factory/src/llamafactory/model/adapter.py", line 191, in _setup_lora_tuning
model: "LoraModel" = PeftModel.from_pretrained(model, adapter, **init_kwargs)
File "/Users/mapix/miniconda/envs/llama-factory/lib/python3.10/site-packages/peft/peft_model.py", line 475, in from_pretrained
model.load_adapter(
File "/Users/mapix/miniconda/envs/llama-factory/lib/python3.10/site-packages/peft/peft_model.py", line 1076, in load_adapter
self._update_offload(offload_index, adapters_weights)
File "/Users/mapix/miniconda/envs/llama-factory/lib/python3.10/site-packages/peft/peft_model.py", line 957, in _update_offload
safe_module = dict(self.named_modules())[extended_prefix]
KeyError: 'base_model.model.model.model.layers.10.input_layernorm'
2.1 First, the flood of UserWarnings: either disable warnings globally, or, if your machine has enough RAM, skip the offload logic altogether by adding low_cpu_mem_usage: false to the config YAML, after which these warnings disappear (a narrower alternative is sketched below).
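If you launch things from your own Python entry point rather than the CLI, a minimal sketch of a targeted filter that silences only these meta-parameter copy warnings (the regex is my assumption about the stable part of the message):

import warnings

# Silence only the "copying from a non-meta parameter ... to a meta parameter"
# warnings raised while the LoRA adapter weights are loaded; this must run
# before the model is constructed.
warnings.filterwarnings(
    "ignore",
    message=r"for .*: copying from a non-meta parameter",
    category=UserWarning,
)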
2.2 As for the KeyError: I am not entirely sure of the cause, but the failing key spells "model" three times in a row while the keys in named_modules() contain it only twice, so I worked around it by patching peft/peft_model.py directly:
#extended_prefix = prefix + block_id + safe_key[:suffix_pos]  # original line in _update_offload
extended_prefix = prefix + safe_key[:suffix_pos]  # drop block_id so the key matches named_modules()
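To see why the two-"model" prefix is the right one, a hypothetical debugging aid that can be pasted into _update_offload right before the failing lookup:

# Hypothetical debugging aid (inside peft/peft_model.py, _update_offload),
# placed just before the dict lookup that raises the KeyError: list the module
# names containing the failing layer so the real prefix depth becomes visible.
module_names = dict(self.named_modules()).keys()
print([name for name in module_names if "layers.10.input_layernorm" in name])
# Here the printed names carry only two consecutive "model" segments
# (base_model.model.model.layers...), hence dropping block_id above fixes the lookup.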
2.3 Next comes an MPS compatibility issue, which can be solved by setting an environment variable on the command line:
NotImplementedError: The operator 'aten::isin.Tensor_Tensor_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
The final command:
PYTORCH_ENABLE_MPS_FALLBACK=1 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
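If you prefer enabling the fallback from Python rather than the shell, a minimal sketch (assuming, as the shell workaround implies, that the variable has to be set before torch is imported):

import os

# Enable CPU fallback for MPS ops that are not implemented yet; set it before
# the torch import so the MPS backend sees it when it initializes (assumption
# based on the workaround suggested by the error message).
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")
    a = torch.tensor([1, 2, 3], device=device)
    b = torch.tensor([2, 3], device=device)
    # aten::isin is the op from the NotImplementedError above; with the fallback
    # enabled it runs on the CPU instead of raising.
    print(torch.isin(a, b))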
I am not entirely sure whether caching was involved: when I later went back to write this up, rolling the code back did not reproduce the problem. Recording it anyway in case someone else hits the same pitfall.