Changing the GPU memory request unit to MiB has no effect #217

Open
harrymore opened this issue Nov 10, 2023 · 0 comments

harrymore commented Nov 10, 2023

I set --memory-unit=MiB in gpushare-device-plugin-ds and restarted it. After the restart, the kubectl inspect gpushare command shows the following:

NAME IPADDRESS GPU0(Allocated/Total) GPU Memory(MiB)
k8s-node0 192.168.1.3 0/6144 0/6144
k8s-node1 192.168.1.4 0/8192 0/8192
k8s-node2 192.168.1.5 0/6144 0/6144

Allocated/Total GPU Memory In Cluster:
0/20480 (0%)

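However, when I request GPU memory in a StatefulSet, the base unit still appears to be GiB; only the displayed unit is MiB. For example, I cannot request a large value such as 3000. The pod fails to schedule with "0/3 nodes are available: 3 Insufficient GPU Memory in one device." The request only succeeds when I use 3, after which the output shows: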

NAME IPADDRESS GPU0(Allocated/Total) GPU Memory(MiB)
k8s-node0 192.168.1.3 0/6144 0/6144
k8s-node1 192.168.1.4 0/8192 3/8192
k8s-node2 192.168.1.5 0/6144 0/6144

Allocated/Total GPU Memory In Cluster:
3/20480 (0%)
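
To make the mismatch concrete, here is a minimal Python sketch of the arithmetic I would expect. It is not gpushare code: the helper `nodes_that_fit` and its `unit_in_mib` parameter are hypothetical names used only for illustration, and the per-GPU totals are copied from the output above. It only shows that the observed behavior (3000 rejected, 3 accepted) matches a GiB interpretation of the request, not what the scheduler actually does internally.

```python
# Per-GPU totals in MiB, copied from the `kubectl inspect gpushare` output above.
# Each node in this cluster reports a single GPU (GPU0).
GPU0_TOTAL_MIB = {
    "k8s-node0": 6144,
    "k8s-node1": 8192,
    "k8s-node2": 6144,
}

MIB_PER_GIB = 1024


def nodes_that_fit(request, unit_in_mib, allocated_mib=None):
    """Return the nodes whose GPU0 has enough free memory for `request`.

    `unit_in_mib` is how many MiB one requested unit stands for
    (1 for MiB, 1024 for GiB). Hypothetical helper, not part of gpushare.
    """
    allocated_mib = allocated_mib or {}
    needed_mib = request * unit_in_mib
    return sorted(
        node
        for node, total in GPU0_TOTAL_MIB.items()
        if total - allocated_mib.get(node, 0) >= needed_mib
    )


# If a request were counted in MiB, 3000 would fit on every node.
print(nodes_that_fit(3000, unit_in_mib=1))

# If a request were counted in GiB, 3000 would fit nowhere,
# while 3 would fit on every node, which is what I observe.
print(nodes_that_fit(3000, unit_in_mib=MIB_PER_GIB))
print(nodes_that_fit(3, unit_in_mib=MIB_PER_GIB))
```

With MiB as the unit, I would therefore expect a request of 3000 to be schedulable on any of the three nodes.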

What is causing this?
