This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
In one of my test applications, I use an `InferenceSession` to load a prompt that I later reuse. However, while doing this I realised that you can't actually clone an `InferenceSession` in memory (and I think it should be possible?), so I had to serialize the session to a `Vec<u8>` and rehydrate it whenever I needed to infer from it.
I think this should be easy enough to fix, but we should check that there aren't any weird assumptions we'd be violating by doing so. (I assume this would also allocate another `ctx`, but that should be fine.)
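For concreteness, here's a rough sketch of the current round-trip versus what `Clone` would allow. The method names (`get_snapshot`/`from_snapshot`) and the use of `bincode` are assumptions standing in for whatever serialization path is actually available, not necessarily the crate's exact API:

```rust
// Sketch only: snapshot method names and the bincode round-trip are assumptions.
fn reuse_prompt(
    model: &Model,
    primed: &InferenceSession, // session that has already ingested the prompt
) -> Result<InferenceSession, Box<dyn std::error::Error>> {
    // Today: freeze the primed session into bytes...
    let bytes: Vec<u8> = bincode::serialize(&primed.get_snapshot())?;

    // ...and rehydrate a fresh session from those bytes for every new inference.
    let snapshot = bincode::deserialize(&bytes)?;
    let fresh = InferenceSession::from_snapshot(snapshot, model)?;

    // With `impl Clone for InferenceSession`, this whole function collapses to:
    // let fresh = primed.clone();
    Ok(fresh)
}
```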
Yup, I don't see any problems here (other than this just hasn't been implemented yet) 😄
This might require some careful handling of the underlying ggml context: make sure a new context is allocated and any tensor data is copied over to it. That will probably take some C-like pointer fiddling and may mean exposing a few more GGML functions. Simply cloning the pointers would result in the wrong behaviour, and most likely UB. But from your question I think you already accounted for that 👍
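Roughly what I mean by the pointer fiddling, as a sketch: each tensor has to be recreated in the freshly allocated context and its bytes copied over. `ggml_raw` here is a stand-in for a raw binding mirroring the C API (`ggml_new_tensor_1d`, `ggml_nelements`, `ggml_nbytes`), and exact signatures vary by ggml version, so treat this as a shape, not an implementation:

```rust
use std::ptr;

// Duplicate one tensor (e.g. the K or V cache) from the old session's context
// into `dst_ctx`, a context freshly allocated with ggml_init for the clone.
// `ggml_raw` is a hypothetical raw binding; names mirror the ggml C API.
unsafe fn clone_tensor_into(
    dst_ctx: *mut ggml_raw::ggml_context,
    src: *const ggml_raw::ggml_tensor,
) -> *mut ggml_raw::ggml_tensor {
    // Allocate a tensor of the same type and element count in the new context.
    let dst = ggml_raw::ggml_new_tensor_1d(
        dst_ctx,
        (*src).type_,
        ggml_raw::ggml_nelements(src),
    );

    // Copy the raw bytes element-for-element. Just cloning the `data` pointer
    // would leave both sessions aliasing one arena, which is the UB case.
    ptr::copy_nonoverlapping(
        (*src).data as *const u8,
        (*dst).data as *mut u8,
        ggml_raw::ggml_nbytes(src),
    );
    dst
}
```

A `Clone` impl would then allocate a new context sized like the original, run this over `memory_k`/`memory_v` (or whatever the session's tensors end up being), and plain-`Clone` the remaining Rust-side fields (`n_past`, token history, logits, ...).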