Commit 09eac16
[aoti-et] Enable multimodal runner for Voxtral on CUDA (#14980)
This pull request introduces changes to the CUDA workflow, model
artifact handling, and multimodal runner logic. The main changes include
restructuring the GitHub Actions workflow to separate model export,
benchmarking, and end-to-end testing for the Voxtral CUDA pipeline,
improving artifact management and reproducibility. Additionally, the
multimodal runner now supports automatic conversion of audio tensors to
bfloat16, ensuring compatibility with expected input types. There are
also enhancements to caching and symbol registration in the CUDA
backend, and build system updates to support linking the CUDA backend.

**Workflow and Artifact Management Improvements:**
* Refactored `.github/workflows/cuda.yml` to split the Voxtral CUDA
pipeline into three jobs: `export-voxtral-cuda-artifact` (exports and
stores model artifacts), `benchmark-voxtral-cuda` (benchmarks using
exported artifacts), and `test-voxtral-cuda-e2e` (runs full end-to-end
tests with artifact download and audio input). Also improved artifact
handling and reproducibility, and added explicit checks for required files.
[[1]](diffhunk://#diff-29abea04e0613c2569973e5c8e3c89e04846d408c855eeb1f3efcfae7cfa6f89L90-R91)
[[2]](diffhunk://#diff-29abea04e0613c2569973e5c8e3c89e04846d408c855eeb1f3efcfae7cfa6f89R107)
[[3]](diffhunk://#diff-29abea04e0613c2569973e5c8e3c89e04846d408c855eeb1f3efcfae7cfa6f89R134-R185)
[[4]](diffhunk://#diff-29abea04e0613c2569973e5c8e3c89e04846d408c855eeb1f3efcfae7cfa6f89R196-R267)
[[5]](diffhunk://#diff-29abea04e0613c2569973e5c8e3c89e04846d408c855eeb1f3efcfae7cfa6f89R122)

**Multimodal Runner Logic:**
* Added automatic conversion of audio tensors to bfloat16 in
`MultimodalPrefiller::prefill` and implemented a helper function
`convert_to_bfloat16` in `util.h` to support this. This ensures that
audio inputs match the expected dtype for the encoder, improving
robustness for multimodal inference.
[[1]](diffhunk://#diff-ad4fcb32ffc5f1f7b4f87b5ee58927cb948a8c0976295befd10e3de445913ae4L96-R136)
[[2]](diffhunk://#diff-db4801445eaa3bb4f1370fe41d3a00ae2e3ef354a23ad4d5ace141ecc3c6f413R144-R180)
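
To make the dtype conversion above concrete, here is a minimal, stand-alone sketch of a float32 to bfloat16 conversion. It is not the `convert_to_bfloat16` helper from `util.h`, whose body is not shown here; the names `float_to_bf16_bits` and `to_bf16_buffer` are hypothetical, the input is assumed to be a contiguous float32 buffer, and round-to-nearest-even is an assumed rounding mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE 754 float");

// Hypothetical helper: returns the 16-bit bfloat16 pattern for a float32 value.
// bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits.
inline std::uint16_t float_to_bf16_bits(float value) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));  // well-defined bit copy
  if (std::isnan(value)) {
    // Keep the sign and exponent, and force a mantissa bit so the result
    // stays NaN instead of collapsing to infinity.
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  }
  // Round to nearest, ties to even (assumed rounding mode).
  const std::uint32_t bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + bias) >> 16);
}

// Hypothetical helper: converts a contiguous float32 buffer (for example,
// audio features) into a new buffer of bfloat16 bit patterns.
inline std::vector<std::uint16_t> to_bf16_buffer(const float* src,
                                                 std::size_t count) {
  assert(src != nullptr || count == 0);
  std::vector<std::uint16_t> out(count);
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = float_to_bf16_bits(src[i]);
  }
  return out;
}
```

The sketch writes into a new buffer rather than converting in place, because the bfloat16 elements are half the size of the float32 input. Whether the actual helper allocates a new tensor or uses a different rounding mode is not stated in the commit message.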

**CUDA Backend and Caching Enhancements:**
* Improved caching logic in `common_shims.cpp` for tensor strides and
sizes by validating cached values and updating them when necessary. This
prevents stale cache issues and ensures correct tensor metadata.
[[1]](diffhunk://#diff-1e7c9d572d434c9a85c9d466e7f406877bc974a373c370fe7ddb3fe32852c1f2R54-R81)
[[2]](diffhunk://#diff-1e7c9d572d434c9a85c9d466e7f406877bc974a373c370fe7ddb3fe32852c1f2R104-R130)
* Added dynamic symbol re-registration in `CudaBackend` to handle
multiple shared objects in the same process, ensuring correct execution
when switching between models.
* Removed redundant logging statements in CUDA backend for cleaner
output.
[[1]](diffhunk://#diff-a4b17eccf1aa933837671c5184e02bc815d934a362344bb2b17b789cdfaa5375L226)
[[2]](diffhunk://#diff-a4b17eccf1aa933837671c5184e02bc815d934a362344bb2b17b789cdfaa5375L256)
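
The cache validation described in the first bullet of this list can be pictured with a small sketch. This is an illustration under assumptions, not the code in `common_shims.cpp`: `FakeTensor`, `SizeCache`, and `cached_sizes` are hypothetical names, and the cache is assumed to be keyed by tensor address and to hold a widened (`int64_t`) copy of the tensor's sizes.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a runtime tensor that stores 32-bit sizes.
struct FakeTensor {
  std::vector<std::int32_t> sizes;
};

// Hypothetical cache: tensor address -> widened copy of its sizes.
using SizeCache =
    std::unordered_map<const FakeTensor*, std::vector<std::int64_t>>;

// Returns a pointer to int64 sizes for `tensor`. A cached entry is reused
// only if it still matches the tensor's current sizes; otherwise it is
// rebuilt, so a reshaped tensor (or a reused address) never yields stale data.
inline const std::int64_t* cached_sizes(SizeCache& cache,
                                        const FakeTensor& tensor) {
  std::vector<std::int64_t>& entry = cache[&tensor];
  bool stale = entry.size() != tensor.sizes.size();
  for (std::size_t i = 0; !stale && i < entry.size(); ++i) {
    stale = entry[i] != tensor.sizes[i];
  }
  if (stale) {
    entry.assign(tensor.sizes.begin(), tensor.sizes.end());
  }
  return entry.data();
}
```

Validating on every lookup costs one comparison per dimension, which is small next to the cost of returning metadata for a tensor whose shape has changed since it was first cached. The same pattern would apply to strides.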

**Build System Updates:**
* Updated `CMakeLists.txt` and `executorch-config.cmake` to include and
link the CUDA backend (`aoti_cuda`) when building Voxtral and other
components, improving build flexibility and CUDA support.
[[1]](diffhunk://#diff-606feb24310595f592d98d021a2c90618346977d94decb80b35b7e26ed8ccc1eR89-R95)
[[2]](diffhunk://#diff-6a78a155992483ff6f35d595ff6cef63b477d1c853f6482e77acae6ef443f0e4R56)

**Debugging and Tuning Options:**
* Added support for enabling debug compilation in `cuda_backend.py` via
the `DEBUG` environment variable, allowing easier troubleshooting and
development.

1 parent 7533df6 commit 09eac16
File tree
11 files changed, +374 -30 lines changed
- .github/workflows
- backends
- aoti
- cuda/runtime
- examples/models/voxtral
- extension/llm/runner
- test
- tools/cmake