n_keep parameter #19
Comments
This value is not exposed by default, so no. I will update this issue if that changes.
Technically speaking, is this feature still necessary? @GameOverFlowChart
I think it would still be useful, for example to ensure that the system message is always kept. If I understand it correctly, 4 is roughly the minimum for the speed gain, but setting it higher should also work.
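For context, a minimal sketch of what `n_keep` does during llama.cpp-style context shifting, under the assumption that the first `n_keep` tokens are preserved and a chunk of the oldest tokens after them is discarded (the function name and `n_discard` handling here are illustrative, not CUI's actual implementation):

```python
# Hypothetical sketch of context shifting with n_keep:
# once the context window overflows, the first n_keep tokens survive
# and n_discard of the oldest tokens after them are dropped.

def context_shift(tokens, n_ctx, n_keep, n_discard):
    """Drop n_discard tokens after the first n_keep once len(tokens) exceeds n_ctx."""
    if len(tokens) <= n_ctx:
        return tokens
    return tokens[:n_keep] + tokens[n_keep + n_discard:]

ctx = list(range(9))  # pretend token ids 0..8 overflow an 8-token context
ctx = context_shift(ctx, n_ctx=8, n_keep=4, n_discard=2)
print(ctx)  # tokens 0-3 are kept; 4 and 5 were discarded
```

With `n_keep=4`, tokens 0-3 (e.g. a short system prefix) survive every shift, while tokens 4 and 5 are the first to go.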
This actually isn't necessary, as CUI controls prompt building on the JS side, not via lcpp. All prompts are always:
Examples and older context are phased out as the limit is reached, and context shift ensures that this doesn't incur a full reprocess. Is there any other reason to implement n_keep?
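The JS-side behavior described above could be sketched as follows; this is an illustrative simplification, not CUI's actual code, and the function and parameter names are assumptions:

```python
# Hypothetical sketch of JS-side prompt building: the system prompt is
# always emitted first, and the oldest chat turns are phased out until
# the prompt fits within the token budget.

def build_prompt(system, turns, max_tokens, count=len):
    """Keep the system prompt; drop the oldest turns until within budget.

    count() stands in for a real tokenizer; here it just measures length.
    """
    kept = list(turns)
    while kept and count(system) + sum(count(t) for t in kept) > max_tokens:
        kept.pop(0)  # phase out the oldest turn first
    return [system] + kept

prompt = build_prompt("SYS", ["turn1", "turn2", "turn3"], max_tokens=14)
print(prompt)  # the oldest turn is dropped, the system prompt survives
```

Because the prompt is rebuilt this way on every request, the system message never leaves the context regardless of what `n_keep` is set to on the llama.cpp side.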
Wait, so the system message is already always kept? Then does n_keep=4 mean the first 4 tokens come from the examples, or from the rest of the context? If the system message is already kept, I think we don't even need n_keep to be 4; it could be 0, or am I missing something? The exception would be the case where there is no system message (I don't think a system message between 1 and 4 tokens would ever make sense). And yes, if the system message is always kept, I wouldn't need that parameter. Edit: I hope the system suffix is kept together with the message, and that instruct tokens are also considered in the context calculations.
Yes, this is the case. I think we can close this issue, since n_keep relies on llama.cpp handling prompt building, which is not used here.
I don't know if llama.rn exposes that parameter; if it does, could we add the option to set n_keep (and possibly also n_discard)? The main reason I'm asking is this:
ggml-org/llama.cpp#3440 (comment)