
n_keep parameter #19


Closed
GameOverFlowChart opened this issue May 3, 2024 · 6 comments

Comments


GameOverFlowChart commented May 3, 2024

I don't know if llama.rn exposes that parameter; if so, could we add an option to set n_keep (and possibly also n_discard)? The main reason I'm asking is this:
ggml-org/llama.cpp#3440 (comment)
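For context, n_keep in llama.cpp controls how many leading tokens survive a context shift when the window fills. A minimal array-based sketch of that behavior (a simplification; the real implementation shifts the KV cache in place, and the names here are illustrative):

```typescript
// Minimal sketch of llama.cpp-style context shifting, assuming a plain
// token array rather than the actual KV cache.
function contextShift(
  tokens: number[],
  nCtx: number,     // context window size
  nKeep: number,    // leading tokens that are never discarded
  nDiscard: number, // tokens dropped per shift
): number[] {
  let t = tokens;
  while (t.length > nCtx) {
    // Preserve the first nKeep tokens (e.g. a system prompt),
    // drop the nDiscard tokens after them, keep the rest.
    t = [...t.slice(0, nKeep), ...t.slice(nKeep + nDiscard)];
  }
  return t;
}
```

With nKeep fixed at 4, only the first 4 tokens are pinned across shifts; a configurable value could pin a longer system prompt.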

Vali-98 (Owner) commented May 4, 2024

This value is not exposed by default, so no.

I will update this issue if that changes.

Vali-98 (Owner) commented Aug 6, 2024

Technically speaking, n_keep is always set to 4 now, even in llama.rn.

Is this feature still necessary? @GameOverFlowChart

GameOverFlowChart (Author) commented:

> Technically speaking, n_keep is always set to 4 now, even in llama.rn.
>
> Is this feature still necessary? @GameOverFlowChart

I think it would still be useful, for example to ensure that the system message is always kept. If I understand correctly, 4 is the minimum needed to get the speed gain, but setting it higher should also work.
In any case, keeping 4 tokens is better than keeping nothing (I guess that was the case before?).

Vali-98 (Owner) commented Aug 7, 2024

> I think it would be still useful, for example to ensure that the system message is always kept.

This actually isn't necessary, as CUI controls prompt building on the JS side, not via lcpp.

All prompts are always:

[system][examples if context has space][context until context limit]

Examples and older context are phased out as the limit is reached, and context shift ensures that this doesn't incur a full reprocess.

Is there any other reason to implement n_keep?
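The scheme described above, system first, then examples if they fit, then as much recent context as the limit allows, could be sketched JS-side roughly like this (a hypothetical simplification; the function and parameter names are illustrative, not CUI's actual code):

```typescript
// Hypothetical sketch of JS-side prompt building where the system
// message is always kept and older content is phased out first.
function buildPrompt(
  system: string[],     // tokenized system prompt, always included
  examples: string[][], // tokenized example messages, oldest first
  history: string[][],  // tokenized chat messages, oldest first
  maxTokens: number,    // context limit
): string[] {
  let budget = maxTokens - system.length;

  // Newest history has priority: walk backwards, keep what fits.
  const keptHistory: string[][] = [];
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i].length > budget) break;
    budget -= history[i].length;
    keptHistory.unshift(history[i]);
  }

  // Examples are included only if context still has space.
  const keptExamples: string[][] = [];
  for (const ex of examples) {
    if (ex.length > budget) break;
    budget -= ex.length;
    keptExamples.push(ex);
  }

  // [system][examples if context has space][context until context limit]
  return [...system, ...keptExamples.flat(), ...keptHistory.flat()];
}
```

Because truncation always happens before the tokens reach lcpp, the prompt never overflows the window, and llama.cpp's n_keep never has to decide what to preserve.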

GameOverFlowChart (Author) commented Aug 7, 2024

> > I think it would be still useful, for example to ensure that the system message is always kept.
>
> This actually isn't necessary, as CUI controls the prompt building JS side, not via lcpp.
>
> All prompts are always:
>
> [system][examples if context has space][context until context limit]
>
> Examples and older context are phased out as the limit is reached, and context shift ensures that this doesn't incur a full reprocess.
>
> Is there any other reason to implement n_keep?

Wait, so the system message is already always kept? Then does n_keep = 4 mean the first 4 tokens come from the examples or the rest of the context? If the system message is already kept, we don't even need n_keep to be 4; it could be 0, or am I missing something? Except for the case where there is no system message (I don't think a system message of between 1 and 4 tokens would ever make sense).

And yes, if the system message is always kept, I wouldn't need that parameter.

Edit: I hope the system suffix is kept together with the message, and that instruct tokens are also counted toward the context limit.

Vali-98 (Owner) commented Aug 8, 2024

> Edit: I hope the system suffix is kept together with the message and instruct tokens are also considered in the calculations for the context.

Yes, this is the case. I think we can close this issue, as n_keep relies on llama.cpp handling prompt building, which is not used here.

Vali-98 closed this as completed Aug 8, 2024