is deepspeed inference now supporting llama2 replace_with_kernel_inject=True? #4290
Unanswered · liveforfun asked this question in Q&A
I tried to run inference on llama2-70b with the replace_with_kernel_inject=True option, but I got some errors. As far as I know, replace_with_kernel_inject=True injects DeepSpeed's high-performance kernels, but it seems this is not supported yet, right?

Replies: 1 comment

- Hi @liveforfun I am adding the support for this here: #4313
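For context, a minimal sketch of how kernel injection is typically requested through `deepspeed.init_inference` (this is not code from the thread; the model name and `tp_size` are placeholder assumptions, and actually running it requires DeepSpeed, transformers, and multiple CUDA GPUs):

```python
# Hypothetical usage sketch, assuming DeepSpeed and CUDA GPUs are available.
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-70b-hf"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# replace_with_kernel_inject=True asks DeepSpeed to swap supported transformer
# layers for its fused inference kernels. If the architecture is not yet
# covered by kernel injection, this call is where errors typically surface.
engine = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": 8},  # assumption: shard the 70B model over 8 GPUs
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda")
outputs = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If injection fails for a given architecture, a common fallback is `replace_with_kernel_inject=False`, which keeps the original module implementation while still applying tensor parallelism.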