Disable KV cache shifting automatically for unsupported models #11053

Merged
ggerganov merged 2 commits into ggml-org:master from MollySophia:context-shift
Jan 3, 2025

@MollySophia
Collaborator

Disable KV cache shifting automatically for unsupported models instead of exiting directly.

This makes things easier for models that don't support KV cache shifting.
Currently, in arg.cpp, --no-context-shift is only enabled for LLAMA_EXAMPLE_MAIN, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_IMATRIX, and LLAMA_EXAMPLE_PERPLEXITY. As a result, running llama-parallel with a recurrent model, for example, fails with a message indicating that context shifting is not supported, even though --no-context-shift is not an available parameter for llama-parallel.
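
To illustrate the behavior change described above, here is a minimal, self-contained sketch. It does not use the llama.cpp API: `sketch_params`, `sketch_model`, and `sketch_init` are hypothetical names invented for this example, and the assumption that context shifting is on by default is inferred from the failure described above rather than copied from the source.

```cpp
#include <cstdio>

// Hypothetical stand-in for the common parameters (not the real llama.cpp struct).
struct sketch_params {
    bool ctx_shift = true; // assumption: context shifting is requested unless the user opts out
};

// Hypothetical stand-in for a loaded model/context.
struct sketch_model {
    bool can_shift_kv; // false for models without KV cache shifting support, e.g. recurrent ones
};

// Returns true if initialization may continue.
// Before this change, the equivalent check reported an error and the tool exited.
// After this change, it warns, turns the option off, and carries on.
bool sketch_init(sketch_params & params, const sketch_model & model) {
    if (params.ctx_shift && !model.can_shift_kv) {
        std::fprintf(stderr,
            "warning: KV cache shifting is not supported for this model, disabling KV cache shifting\n");
        params.ctx_shift = false;
    }
    return true;
}
```

With this shape, a tool such as llama-parallel that never exposes --no-context-shift can still start with a recurrent model, because the option is cleared during initialization instead of being treated as a fatal mismatch.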

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
@ggerganov ggerganov merged commit 4b0c638 into ggml-org:master Jan 3, 2025
47 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
…ls (ggml-org#11053)

* Disable KV cache shifting automatically for unsupported models

instead of exiting directly

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Update common/common.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 participants