Commit c076a02
[TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568)
Signed-off-by: Xiwen Yu <[email protected]>
Signed-off-by: Tian Zheng <[email protected]>
Signed-off-by: Daniel Stokes <[email protected]>
Signed-off-by: Zhanrui Sun <[email protected]>
Signed-off-by: Xiwen Yu <[email protected]>
Signed-off-by: Jiagan Cheng <[email protected]>
Signed-off-by: Yiqing Yan <[email protected]>
Signed-off-by: Bo Deng <[email protected]>
Signed-off-by: ZhanruiSunCh <[email protected]>
Signed-off-by: xiweny <[email protected]>
Co-authored-by: Tian Zheng <[email protected]>
Co-authored-by: Daniel Stokes <[email protected]>
Co-authored-by: Zhanrui Sun <[email protected]>
Co-authored-by: Jiagan Cheng <[email protected]>
Co-authored-by: Yiqing Yan <[email protected]>
Co-authored-by: Bo Deng <[email protected]>
Co-authored-by: Zhanrui Sun <[email protected]>
1 parent 809c4d2 commit c076a02
File tree
97 files changed, +1112 -511 lines changed
- 3rdparty
- cpp
- cmake/modules
- include/tensorrt_llm
- common
- deep_gemm
- kernels/fmha_v2
- tensorrt_llm
- common
- cutlass_extensions/include/cutlass_extensions
- deep_ep
- executor
- kernels
- contextFusedMultiHeadAttention
- cutlass_kernels
- fp4_gemm
- fp8_blockscale_gemm
- fpA_intB_gemm
- launchers
- include
- moe_gemm
- launchers
- python
- decoderMaskedMultiheadAttention
- internal_cutlass_kernels/include
- speculativeDecoding
- runtime
- moeLoadBalancer
- utils
- thop
- tests/unit_tests/kernels
- docker
- common
- jenkins
- scripts
- scripts
- tensorrt_llm
- _torch
- auto_deploy
- custom_ops
- models/patches
- models
- modules
- fused_moe
- pyexecutor
- tests
- integration
- defs
- accuracy
- disaggregated/test_configs
- test_lists/test-db
- unittest
- _torch
- auto_deploy/unit/singlegpu
- misc
- modeling
- modules
- multi_gpu
- thop/parallel
- trt/attention
- utils
- triton_backend/inflight_batcher_llm

- README.md (+4, -2)
- csrc/apis/gemm.hpp (+471)
- csrc/apis/layout.hpp (+85)
- csrc/apis/runtime.hpp (+28)
- csrc/jit/compiler.hpp (+6, -4)
- csrc/jit/device_runtime.hpp (+4, -2)
- csrc/jit/handle.hpp (+1, -1)
- csrc/jit/kernel_runtime.hpp (+2, -2)
- csrc/jit_kernels/heuristics/common.hpp (+6, -3)
- csrc/jit_kernels/heuristics/sm100.hpp (+2, -2)
- csrc/jit_kernels/heuristics/sm90.hpp (+7, -3)
- csrc/jit_kernels/impls/sm100_bf16_gemm.hpp (+143)
- csrc/jit_kernels/impls/sm100_fp8_gemm_1d1d.hpp (+3, -2)
- csrc/jit_kernels/impls/sm100_fp8_gemm_1d2d.hpp (+3, -2)
- csrc/jit_kernels/impls/sm90_bf16_gemm.hpp (+229)
- csrc/jit_kernels/impls/sm90_fp8_gemm_1d2d.hpp (+3, -2)
- csrc/jit_kernels/impls/smxx_layout.hpp (+55, -8)
- csrc/python_api.cpp (+6, -399)
- csrc/utils/exception.hpp (+10, -3)
- deep_gemm/__init__.py (+38, -10)
- deep_gemm/include/deep_gemm/common/scheduler.cuh (+6, -5)
- deep_gemm/include/deep_gemm/common/sm90_utils.cuh (+76)
- deep_gemm/include/deep_gemm/common/utils.cuh (+18)
- deep_gemm/include/deep_gemm/impls/sm100_bf16_gemm.cuh (+495, -1)
- deep_gemm/include/deep_gemm/impls/sm100_fp8_gemm_1d1d.cuh (+3, -4)
- deep_gemm/include/deep_gemm/impls/sm100_fp8_gemm_1d2d.cuh (+8, -5)
- deep_gemm/include/deep_gemm/impls/sm90_bf16_gemm.cuh (+341, -1)
- deep_gemm/include/deep_gemm/impls/sm90_fp8_gemm_1d2d.cuh (+1, -1)
- deep_gemm/include/deep_gemm/impls/smxx_layout.cuh (+39)
- pyproject.toml (-3)
- setup.py (+4)
- tests/generators.py (+34, -22)
- tests/test_bf16.py (+125)
- tests/test_fp8.py (+3, -3)
- tests/test_layout.py (+29, -17)
- tests/test_lazy_init.py (+15)