
Reset context instead of quitting in interactive mode #145

Closed
avada-z opened this issue Mar 14, 2023 · 5 comments
Labels: duplicate (this issue or pull request already exists), enhancement (new feature or request)

Comments


avada-z commented Mar 14, 2023

It's really annoying that I have to restart the program every time it quits due to [end of text] or exceeding the context limit, since reloading the model is inefficient.
Is there any way to add an option that, instead of quitting, just resets to the initial prompt?

zzhkikyou commented

@ggerganov Please, thanks!


aratic commented Mar 15, 2023

Generation is currently bounded by remaining_tokens (the -n parameter), and the main work happens around llama_tokenize.
I'm not sure where a context reset would hook in.


aratic commented Mar 15, 2023

I think this relates to #23, but it's not exactly the same.

@gjmulder gjmulder added the enhancement New feature or request label Mar 15, 2023
j3k0 (Contributor) commented Mar 16, 2023

I'm attempting a fix for the [end of text] issue:

diff --git a/main.cpp b/main.cpp
index ca0fca8..126e53f 100644
--- a/main.cpp
+++ b/main.cpp
@@ -991,6 +991,14 @@ int main(int argc, char ** argv) {
             fflush(stdout);
         }

+        // check for [end of text]
+        if (params.interactive && embd.back() == 2) {
+            fprintf(stderr, " [end of text]\n");
+            // insert the antiprompt to continue the conversation.
+            // however, after this it seems like everything was lost.
+            embd_inp.insert(embd_inp.end(), antiprompt_inp.begin(), antiprompt_inp.end());
+        }
+
         // in interactive mode, and not currently processing queued inputs;
         // check if we should prompt the user for more
         if (params.interactive && embd_inp.size() <= input_consumed) {
@@ -1037,7 +1045,7 @@ int main(int argc, char ** argv) {
         }

         // end of text token
-        if (embd.back() == 2) {
+        if (!params.interactive && embd.back() == 2) {
             fprintf(stderr, " [end of text]\n");
             break;
         }

However, it looks like the context is broken when resuming the conversation after the [end of text]... It's like starting a fresh interaction. So I guess being able to restore the context will be necessary to complete this.

jart (Contributor) commented Mar 16, 2023

Thank you for using llama.cpp and for sharing your feature request. While you've provided valuable feedback on UX improvements, it overlaps a lot with what's being discussed in #23, and right now my top priority is fixing the underlying technical issue described in #91. Please use those other issues for further discussion, which will reach a broader audience of folks following them.

@jart jart closed this as completed Mar 16, 2023
@jart jart added the duplicate This issue or pull request already exists label Mar 16, 2023