We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
背景: 安装了ms-swift 3.1.0,基于Qwen2.5-0.5B-Instruct采用LoRA方式微调模型。目前,根据教程,微调模型成功,Merge权重成功,命令行推理lora位置以及合并后模型都成功了。两个命令如下: CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v4-20250211-075427/checkpoint-309 CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged
python代码推理方式,训练后LoRA推理已经跑通,参照的是:https://swift.readthedocs.io/zh-cn/latest/Instruction/%E9%A2%84%E8%AE%AD%E7%BB%83%E4%B8%8E%E5%BE%AE%E8%B0%83.html#id5
问题: 现在我想采用python代码直接推理Merged后的模型,一直有问题。我参照的是: https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py,只将其中的 “model = 'Qwen/Qwen2.5-1.5B-Instruct'”。改成了 model = "/DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged",然而报错:
[INFO:swift] Successfully registered /home/dsremote/.conda/envs/sh_swift/lib/python3.10/site-packages/swift/llm/dataset/data/dataset_info.json
/home/dsremote/.conda/envs/sh_swift/lib/python3.10/site-packages/swift/llm/dataset/data/dataset_info.json
我想跑通python代码推理Merged后的模型,该怎么修改呢?
The text was updated successfully, but these errors were encountered:
额外传入 model_type='xxx'
Sorry, something went wrong.
在ptEngine那里
非常感谢指导。model定义merged后的地址,engine = PtEngine(model, model_type="qwen2_5", max_batch_size=64),增加model_type已经调通。
但我又发现一个现象,我微调的模型是:Qwen/Qwen2.5-0.5B-Instruct, model_type=无论设置成"qwen"、“qwen2”、“qwen2_5“都可以运行,这正常么?刚接触swift,请教加了model_type的作用是什么?
No branches or pull requests
背景:
安装了ms-swift 3.1.0,基于Qwen2.5-0.5B-Instruct采用LoRA方式微调模型。目前,根据教程,微调模型成功,Merge权重成功,命令行推理lora位置以及合并后模型都成功了。两个命令如下:
CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v4-20250211-075427/checkpoint-309
CUDA_VISIBLE_DEVICES=3 swift infer --ckpt_dir /DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged
python代码推理方式,训练后LoRA推理已经跑通,参照的是:https://swift.readthedocs.io/zh-cn/latest/Instruction/%E9%A2%84%E8%AE%AD%E7%BB%83%E4%B8%8E%E5%BE%AE%E8%B0%83.html#id5
问题:
现在我想采用python代码直接推理Merged后的模型,一直有问题。我参照的是:
https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py,只将其中的 “model = 'Qwen/Qwen2.5-1.5B-Instruct'”。改成了
model = "/DIR_LLM/shenhui/test_swift_0211/output2/v3-20250211-072237/checkpoint-309-merged",然而报错:
[INFO:swift] Successfully registered
/home/dsremote/.conda/envs/sh_swift/lib/python3.10/site-packages/swift/llm/dataset/data/dataset_info.json
我想跑通python代码推理Merged后的模型,该怎么修改呢?
The text was updated successfully, but these errors were encountered: