Bug: files missing from the merged model after Qwen2VL + LoRA fine-tuning #5749
Comments
Even after I copied chat_template.json into the fine-tuned model directory it still errors. I suspect the whole Processor is being saved incorrectly; the only workaround is to force-set processor.chat_template=xx.
Same problem here. For now I can only use the processor from the pretrained qwen/Qwen2-VL-7B-Instruct and load the weights from my own local files.
The method I am using now is as follows. It runs without errors, but I am not sure whether it is correct, i.e. whether the fine-tuned weights are actually loaded. processor = AutoProcessor.from_pretrained("sft_dir")
processor.chat_template = "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
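The effect of a chat template like the one above can be checked offline by rendering it with Jinja2 against a sample message list. The sketch below uses an abbreviated template that reproduces only the branches the sample messages exercise (it assumes the third-party jinja2 package; the message layout follows the Qwen2-VL multimodal convention):

```python
from jinja2 import Template

# Abbreviated version of the chat template set above; only the
# system-preamble, image, and text branches are reproduced here.
CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{% if loop.first and message['role'] != 'system' %}"
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}"
    "<|im_start|>{{ message['role'] }}\n"
    "{% for content in message['content'] %}"
    "{% if content['type'] == 'image' %}<|vision_start|><|image_pad|><|vision_end|>"
    "{% elif 'text' in content %}{{ content['text'] }}{% endif %}"
    "{% endfor %}<|im_end|>\n{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)

messages = [
    {"role": "user",
     "content": [{"type": "image"},
                 {"type": "text", "text": "Describe this image."}]},
]

# Render exactly as apply_chat_template would: one image placeholder
# block, the user text, then the assistant generation prompt.
prompt = Template(CHAT_TEMPLATE).render(messages=messages,
                                        add_generation_prompt=True)
print(prompt)
```

This makes it easy to confirm that the `<|vision_start|><|image_pad|><|vision_end|>` placeholders land where the processor expects them before running any model.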
That works too. This has nothing to do with the weights; the weights are loaded with AutoModel.
Got it, thanks.
I think you should keep this open; it is clearly a bug.
same issue |
same issue |
It has been fixed, and I have tried the new version of LLaMA-Factory for two-stage SFT; there are no problems.
After training with LoRA and merging, running inference with the official code processor = AutoProcessor.from_pretrained(model_path) still fails with: Exception: data did not match any variant of untagged enum ModelWrapper at line 757371 column 3
I ran into this too. I set up a fresh environment following the install guide on the official Qwen2-VL page, and the problem seemed to go away. My transformers version is 4.45.0; I am not sure whether that will solve your case.
Thanks for the reply. My transformers version is also 4.45.0.dev0. The official code runs inference fine with the official model, but the error appears as soon as I swap in the LoRA fine-tuned model; the traceback points at processor = AutoProcessor.from_pretrained(model_path). The merged directory does contain chat_template.json.
Has this been resolved?
Reminder
System Info
[2024-10-19 11:31:58,444] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
llamafactory version: 0.9.1.dev0
Reproduction
Run the Qwen2VL + LoRA fine-tuning commands as documented:
llamafactory-cli train examples/train_lora/qwen2vl_lora_sft.yaml
llamafactory-cli export examples/merge_lora/qwen2vl_lora_sft.yaml
The qwen2vl_lora_sft.yaml used is as follows
Expected behavior
Bug
Fine-tuning completed without errors, and merging the LoRA also ran without errors, but comparing the original model files with the fine-tuned model files shows that chat_template.json is missing. Details below.
Comparing the two screenshots, the merged folder is missing the following files
and contains the following extra files
Among them, chat_template.json is a critical file: without it the model cannot be used with the official inference code and it errors out.
The error is as follows
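Until the export step writes chat_template.json into the merged directory itself, a stopgap is to copy the file over from the base model after merging. A minimal sketch (the two directory paths are hypothetical placeholders for a local base-model download and the merge output):

```python
import shutil
from pathlib import Path

def restore_chat_template(base_dir: str, merged_dir: str) -> bool:
    """Copy chat_template.json from the base model directory into the
    merged-model directory if the merge step dropped it.

    Returns True if a copy was made, False if nothing needed doing.
    """
    src = Path(base_dir) / "chat_template.json"
    dst = Path(merged_dir) / "chat_template.json"
    if src.is_file() and not dst.exists():
        shutil.copy2(src, dst)  # preserves file metadata as well
        return True
    return False

# Hypothetical paths -- adjust to your own layout:
# restore_chat_template("models/Qwen2-VL-7B-Instruct",
#                       "output/qwen2vl_lora_sft_merged")
```

Note this only restores the file on disk; as the comments above describe, the underlying issue is that the saved Processor does not carry the template, so setting processor.chat_template manually remains an alternative workaround.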
Others
Additional information