Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ms-swift 3.1.0 python推理merged_lora出错求助 #3082

Open
shenxhui opened this issue Feb 12, 2025 · 3 comments
Open

ms-swift 3.1.0 python推理merged_lora出错求助 #3082

shenxhui opened this issue Feb 12, 2025 · 3 comments

Comments

@shenxhui
Copy link

背景:
安装了ms-swift 3.1.0,基于Qwen2.5-0.5B-Instruct采用LoRA方式微调模型。目前,根据教程,微调模型成功,Merge权重成功,命令行推理lora位置以及合并后模型都成功了。两个命令如下:
CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v4-20250211-075427/checkpoint-309
CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged

python代码推理方式,训练后LoRA推理已经跑通,参照的是:https://swift.readthedocs.io/zh-cn/latest/Instruction/%E9%A2%84%E8%AE%AD%E7%BB%83%E4%B8%8E%E5%BE%AE%E8%B0%83.html#id5

问题:
现在我想采用python代码直接推理Merged后的模型,一直有问题。我参照的是:
https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py,只将其中的 “model = 'Qwen/Qwen2.5-1.5B-Instruct'”。改成了
model = "/DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged",然而报错:

[INFO:swift] Successfully registered /home/dsremote/.conda/envs/sh_swift/lib/python3.10/site-packages/swift/llm/dataset/data/dataset_info.json

Image

我想跑通python代码推理Merged后的模型,该怎么修改呢?

@Jintao-Huang
Copy link
Collaborator

额外传入 model_type='xxx'

@Jintao-Huang
Copy link
Collaborator

在ptEngine那里

@shenxhui
Copy link
Author

在ptEngine那里

非常感谢指导。model定义merged后的地址,engine = PtEngine(model, model_type="qwen2_5", max_batch_size=64),增加model_type已经调通。

但我又发现一个现象,我微调的模型是:Qwen/Qwen2.5-0.5B-Instruct, model_type=无论设置成"qwen"、“qwen2”、“qwen2_5“都可以运行,这正常么?刚接触swift,请教加了model_type的作用是什么?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants