Commit b1309e7
authored
Aoti support multi method (#14715)
This pull request introduces several improvements to the CUDA backend.
The main changes include adding a new graph pass to replace unnecessary
`slice_copy` operations, improving how method names are tracked in
compilation artifacts, and making the preprocessing pipeline more robust
and accurate.
**Key changes:**
### Graph optimization and preprocessing
* Introduced `ReplaceSliceCopyWithSlicePass`, a new export pass that
replaces non-mutated `slice_copy` operations with more efficient `slice`
view operations in the computational graph
(`replace_slice_copy_with_slice.py`, used in `cuda_backend.py`).
[[1]](diffhunk://#diff-c4a228b182f50f778545991d472609ad705d2325994342174093ff374738851dR1-R113)
[[2]](diffhunk://#diff-5b5ea2257772b3aba04b2534f5ea1429a0c631bfd25a7ef531f526e76c471d7aR115-R117)
* Added context management for attention kernel selection and no-grad
mode during AOT compilation to ensure correct backend selection for
decomposition. This is needed in the short term until we have a flash
attention cuda kernel.
### Method name and compile specification handling
* Added a `COMPILE_SPEC_KEYS` enum and utility methods
(`generate_method_name_compile_spec`, `method_name_from_compile_specs`)
to consistently embed and retrieve the method name in compile specs and
as a key in the data store, improving traceability of compiled
artifacts.
[[1]](diffhunk://#diff-5b5ea2257772b3aba04b2534f5ea1429a0c631bfd25a7ef531f526e76c471d7aL24-R35)
[[2]](diffhunk://#diff-5b5ea2257772b3aba04b2534f5ea1429a0c631bfd25a7ef531f526e76c471d7aL161-R158)
[[3]](diffhunk://#diff-5b5ea2257772b3aba04b2534f5ea1429a0c631bfd25a7ef531f526e76c471d7aR169-R195)
### Code cleanup and maintainability
* Minor refactor in `cuda_partitioner.py` to clarify delegation tag
assignment.
* Improved imports and code organization for clarity in
`cuda_backend.py`.
These changes collectively improve the reliability, performance, and
maintainability of the CUDA backend pipeline.1 parent 96dfa9c commit b1309e7
File tree
3 files changed
+166
-5
lines changed- backends/cuda
3 files changed
+166
-5
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| 10 | + | |
10 | 11 | | |
11 | 12 | | |
12 | 13 | | |
13 | 14 | | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
14 | 18 | | |
15 | 19 | | |
16 | 20 | | |
| |||
21 | 25 | | |
22 | 26 | | |
23 | 27 | | |
24 | | - | |
| 28 | + | |
25 | 29 | | |
26 | 30 | | |
27 | 31 | | |
| |||
30 | 34 | | |
31 | 35 | | |
32 | 36 | | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
33 | 41 | | |
34 | 42 | | |
35 | 43 | | |
| |||
108 | 116 | | |
109 | 117 | | |
110 | 118 | | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
111 | 122 | | |
112 | 123 | | |
113 | 124 | | |
| |||
132 | 143 | | |
133 | 144 | | |
134 | 145 | | |
135 | | - | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
136 | 150 | | |
137 | 151 | | |
138 | 152 | | |
| |||
146 | 160 | | |
147 | 161 | | |
148 | 162 | | |
149 | | - | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
150 | 167 | | |
151 | 168 | | |
152 | 169 | | |
| |||
157 | 174 | | |
158 | 175 | | |
159 | 176 | | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| 47 | + | |
| 48 | + | |
47 | 49 | | |
48 | 50 | | |
49 | 51 | | |
50 | | - | |
51 | 52 | | |
52 | | - | |
| 53 | + | |
| 54 | + | |
53 | 55 | | |
54 | 56 | | |
55 | 57 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
0 commit comments