Commit a51c9c6
committed
[Runtime] Support PagedKVCache with tree attention
This PR introduces the tree attention to PagedKVCache. With this
feature, now the KV cache is ready for tree attention cases such as
speculative decoding trees.
This PR adds tree attention tests to test the correctness.
The changes in this PR to KVState interface are backward compatible.1 parent 1eac178 commit a51c9c6
File tree
5 files changed
+1083
-134
lines changed- src/runtime/relax_vm
- tests/python/relax
5 files changed
+1083
-134
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
44 | 55 | | |
45 | 56 | | |
46 | 57 | | |
47 | 58 | | |
48 | 59 | | |
49 | 60 | | |
| 61 | + | |
| 62 | + | |
50 | 63 | | |
51 | 64 | | |
52 | 65 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
89 | 89 | | |
90 | 90 | | |
91 | 91 | | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
92 | 95 | | |
93 | | - | |
| 96 | + | |
| 97 | + | |
94 | 98 | | |
95 | 99 | | |
96 | 100 | | |
| |||
142 | 146 | | |
143 | 147 | | |
144 | 148 | | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
145 | 158 | | |
146 | 159 | | |
147 | 160 | | |
| |||
0 commit comments