
Whether to support single card multi-instance loading #66

Open
dushulin opened this issue Aug 14, 2023 · 4 comments

Comments

@dushulin

I have one 3090 (24 GB) GPU and would like to load multiple model instances on it, the way Triton does.

@hiworldwzj
Collaborator

@dushulin You can try using Docker to run multiple instances. LightLLM itself cannot load multiple instances.
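For reference, a minimal sketch of that Docker route: it assumes the NVIDIA Container Toolkit and a locally built `lightllm:latest` image; the model paths are placeholders, and the server flag names follow the LightLLM README but may differ across versions.

```python
import subprocess

# Two instances sharing GPU 0: (container name, host model dir, port).
INSTANCES = [
    ("llm-a", "/models/model-a", 8000),  # placeholder paths
    ("llm-b", "/models/model-b", 8001),
]

for name, model_dir, port in INSTANCES:
    subprocess.run(
        [
            "docker", "run", "-d",
            "--name", name,
            "--gpus", "device=0",      # both containers see the same card
            "--shm-size", "1g",
            "-p", f"{port}:{port}",
            "-v", f"{model_dir}:/model",
            "lightllm:latest",         # placeholder image tag
            "python", "-m", "lightllm.server.api_server",
            "--model_dir", "/model",
            "--host", "0.0.0.0",
            "--port", str(port),
            # Cap the KV cache so both instances fit in 24 GB together;
            # flag name per the LightLLM README, may vary by version.
            "--max_total_token_num", "20000",
        ],
        check=True,
    )
```

Both containers run on the same physical card, so the two sets of weights plus their KV caches must fit in the 24 GB together.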

@dushulin
Author

Thanks, we deploy our models on K8S.

@XHPlus
Contributor

XHPlus commented Aug 15, 2023

If you deploy LightLLM with K8S, you can directly use the native load-balancing ability of the K8S infrastructure, for example by starting multiple pods.
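As a sketch of that route, the following uses the official `kubernetes` Python client (`pip install kubernetes`) against the current kubeconfig context; the names, image tag, and namespace are placeholders. Note that each replica requests a whole GPU here, so packing several replicas onto one card would still require time-slicing or vGPU.

```python
from kubernetes import client, config

config.load_kube_config()

labels = {"app": "lightllm"}
container = client.V1Container(
    name="lightllm",
    image="lightllm:latest",  # placeholder image
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"},  # one full GPU per pod
    ),
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="lightllm"),
    spec=client.V1DeploymentSpec(
        replicas=2,  # K8S schedules one pod per available GPU
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
# A Service in front of the pods gives the native load balancing.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="lightllm"),
    spec=client.V1ServiceSpec(
        selector=labels,
        ports=[client.V1ServicePort(port=8000, target_port=8000)],
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment)
client.CoreV1Api().create_namespaced_service(
    namespace="default", body=service)
```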

@dushulin
Author

@XHPlus Yes, you are right, but if you want to load multiple models on one GPU so that they share its computing power, the cost of using K8S vGPU technology may be too high. Being able to deploy multiple models directly on a single card would be more lightweight.
