This repository was archived by the owner on Jun 3, 2025. It is now read-only.

@dsikka (Contributor) commented Jan 29, 2024

… cases, handle kv_cache full during prefill (#1562)

* split prep_for_generation operator

* fix logits

* update non-kv cache pipeline and tests

* add tests to address edge cases

* add condition to check whether kv_cache is full during prompt inference, add test to cover this case, revert debugging changes

* fix typing

* remove commented code

* remove irrelevant condition

* address PR comments

(cherry picked from commit 7b028d4)
@bfineran merged commit 73d0471 into release/1.7 on Jan 29, 2024
@bfineran deleted the cherry_pick_1.7 branch on January 29, 2024 15:57
2 participants