@MasterJH5574
Contributor

This PR introduces tree attention to PagedKVCache. With this feature, the KV cache now supports tree attention use cases such as speculative decoding trees.

This PR also adds tree attention tests to verify correctness.

The changes this PR makes to the KVState interface are backward compatible.
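
For readers unfamiliar with tree attention, here is a minimal sketch of the masking idea behind it. It is illustrative only and is not the PagedKVCache implementation or its API: the function name `tree_attention_mask` and the parent-index encoding of the tree are assumptions made for this example. Each draft token attends to itself and to its ancestors in the speculation tree, and to nothing on sibling branches.

```python
import numpy as np


def tree_attention_mask(parents):
    """Build a boolean attention mask for a token tree.

    Hypothetical helper for illustration, not part of TVM.
    parents[i] is the index of token i's parent, or -1 for a root.
    Tokens are assumed to be in topological order (parents[i] < i).
    mask[i, j] is True when token i may attend to token j.
    """
    n = len(parents)
    mask = np.zeros((n, n), dtype=bool)
    for i, p in enumerate(parents):
        if p < -1 or p >= i:
            raise ValueError(f"token {i} has an invalid parent index {p}")
        mask[i, i] = True  # every token attends to itself
        if p >= 0:
            mask[i] |= mask[p]  # inherit everything the parent can see
    return mask


# A small speculation tree: token 0 is the root, tokens 1 and 2 are its
# children, and token 3 is a child of token 1.
mask = tree_attention_mask([-1, 0, 0, 1])
```

In this example token 3 can see tokens 0, 1, and 3 but not token 2, which sits on a different branch. A mask like this covers only the tokens inside the tree; in speculative decoding the already-committed prefix is normally visible to every tree token.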

@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2024-05-31-paged-kv-tree-attn branch from a51c9c6 to 2ba8a69 Compare May 31, 2024 14:52
@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2024-05-31-paged-kv-tree-attn branch from 2ba8a69 to f497859 Compare May 31, 2024 15:11
@tqchen
Member

tqchen commented Jun 1, 2024

@tvm-bot rerun
