Huawei NPU device_map=auto doesn't split model evenly over all devices #31
Environment: 910B4, 2 × 32 GB. Code as below. Using the official example to load qwen1.5-32B-chat with device_map="auto", the model weights were not split evenly across the two cards; instead, each card loaded a full copy of the model, which ran out of device memory. Could someone take a look?
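The original snippet was not preserved in this thread, so the following is a minimal sketch of the standard transformers loading pattern the report describes; the checkpoint name and dtype are assumptions.

```python
# Hypothetical reconstruction of the loading code described above (the
# original snippet was lost in the page scrape). Assumes torch_npu is
# installed and two Ascend NPUs are visible.
import torch
import torch_npu  # noqa: F401 -- registers the "npu" device with PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen1.5-32B-Chat"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" is supposed to shard the weights across all visible
# devices; the bug reported here is that each NPU instead receives a
# full copy of the model, exhausting the 32 GB per card.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

print(model.hf_device_map)  # shows which device each layer was assigned to
```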
Comments

Please use the latest versions of CANN and torch_npu. If that doesn't help, please provide the specific error message.
Here is how I got past it: export ASCEND_LAUNCH_BLOCKING=1. In PyTorch training or online-inference scenarios, this environment variable controls whether operators launch in synchronous mode. Setting it to "1" forces operators to run synchronously, which makes problems in the code easier to debug and trace; setting it to "0" keeps execution asynchronous.
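For reference, the same setting can be applied from inside a script rather than the shell; a minimal sketch, assuming it runs before any NPU operator is launched. Note that synchronous launch mainly surfaces clearer errors; it does not by itself change how the weights are sharded.

```python
import os

# Equivalent to `export ASCEND_LAUNCH_BLOCKING=1`; must be set before
# torch_npu dispatches any operator, so place it at the top of the script.
os.environ["ASCEND_LAUNCH_BLOCKING"] = "1"
```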
huggingface/accelerate#2368
That issue says the fix would land by the end of April, but as of now it still doesn't seem to work. Has it actually been resolved?
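Until the upstream fix lands, one possible workaround is to cap each device's memory budget explicitly so that a full copy cannot fit on a single card and the loader is forced to shard. This is a sketch, not a confirmed fix: whether the per-device budgets are honored on NPUs depends on the accelerate support tracked in huggingface/accelerate#2368, and the "28GiB" figures are assumptions leaving headroom below the 32 GB per card.

```python
# Sketch of a possible workaround: give each card a budget smaller than
# the full model so the weights must be split across devices. The
# "28GiB" values are assumed; adjust for your cards and dtype.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-32B-Chat",
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "28GiB", 1: "28GiB"},  # per-device caps for two NPUs
)
```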