Commit fe2bc8c
authored
[TileOp] Implement WGMMA for T.gemm_v2 (tile-ai#813)
* [Feature] Introduce WGMMA support and enhance GEMM layout handling
- Added support for the WGMMA intrinsic in the TileLang framework, enabling efficient matrix multiplication on newer architectures.
- Refactored GEMM layout functions to accept a boolean parameter for K dimension handling, improving flexibility in layout generation.
- Updated layout inference logic to accommodate new WGMMA configurations and ensure compatibility with existing GEMM operations.
- Enhanced Python bindings for layout functions, allowing for better integration and usability in user-defined operations.
- Improved documentation for layout functions and GEMM operations to clarify usage and parameters.
These changes enhance the performance and usability of GEMM operations, particularly for advanced architectures, while maintaining backward compatibility with existing implementations.
* [Refactor] Clean up code formatting and enhance layout function readability
- Improved code formatting across multiple files for better readability, including consistent indentation and line breaks.
- Updated layout function signatures to enhance clarity, particularly in `gemm_layouts.cc`, `layout.cc`, and `layout.h`.
- Refactored lambda functions in `builtin.cc` and `gemm_py.cc` for improved structure and maintainability.
- Enhanced comments and documentation in layout-related files to clarify usage and parameters.
These changes contribute to a cleaner codebase and improved maintainability of layout functions in the TileLang framework.
* [Feature] Add descriptor initialization and offset manipulation for WGMMA
- Introduced new TileLang builtins `initialize_descriptor` and `increase_descriptor_offset` to facilitate descriptor management for WGMMA operations.
- Updated `builtin.cc` and `builtin.h` to define and document the new builtins, enhancing the framework's capabilities for descriptor handling.
- Modified `codegen_cuda.cc` and `ptx.cc` to integrate the new builtins into the code generation process, ensuring proper assembly generation for WGMMA operations.
- Enhanced the `GemmWGMMA` class to utilize the new descriptor functionalities, improving the efficiency of matrix multiplication operations.
- Updated related tests and documentation to reflect the new features and ensure comprehensive coverage.
These changes enhance the TileLang framework's support for advanced matrix operations on newer architectures, improving performance and usability.
* [Refactor] Improve code formatting and readability in various files
- Enhanced code formatting across multiple files for better readability, including consistent indentation and line breaks.
- Updated function signatures and comments in `builtin.h`, `codegen_cuda.cc`, and `ptx.cc` to improve clarity.
- Refactored descriptor initialization and offset manipulation functions in `builtin.py` and `wgmma_macro_generator.py` for improved structure.
- Cleaned up unnecessary whitespace and improved alignment in `common.h` and `allocate.py`.
These changes contribute to a cleaner and more maintainable codebase in the TileLang framework.
* [Update] Update subproject commit and refactor layout function call
- Updated the subproject commit for `cutlass` to indicate a dirty state.
- Refactored the `UpdateAnalyzer` function in `layout.cc` to call `LayoutNode::getVarMap()` instead of `getVarMap()`, improving clarity and ensuring proper context for variable mapping.
These changes enhance the maintainability and clarity of the layout handling in the TileLang framework.
* support more data types
* gemm_rs support
* lint fix
* wgmma wrapper
* Remove debug logging for wgmma assembly code and refactor swizzle byte size calculations in wgmma macro generator. Enhanced handling of leading and stride byte offsets based on swizzle mode, improving clarity and performance in tensor core intrinsic emissions.
* Refactor GEMM layout functions to replace 'kfactor' with 'k_inner' for improved clarity and consistency. Update includes necessary changes in error messages for Hopper and Sm100 layouts. Additionally, include a new header for CUTE utilities in common.h.
* Comprehensively support WGMMA GEMM SS
* remove debug print
* lint fix
* remove debug print
* reduce bwd test shape
* lint fix
* clear cache for pytest
* lint fix
* Update sparse MLA examples to support SKV adjustment and correctness checks
- Changed SKV parameter from 32768 to 8192 in sparse MLA backward and forward tests.
- Added check_correctness parameter to test functions for validation of outputs.
- Updated test cases to reflect new SKV values and correctness checks.
* test fix
* adjust test case
* test fix
* skip some test currently1 parent 398d5e9 commit fe2bc8c
File tree
43 files changed
+2943
-173
lines changed- .github/workflows
- examples
- deepseek_v32
- flash_attention
- norm
- src
- layout
- op
- target
- tl_templates/cuda
- instruction
- transform
- testing/python/tilelibrary
- tilelang
- intrinsics
- language
- ast
- tir
- layout
- tileop/gemm
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
43 files changed
+2943
-173
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
| 49 | + | |
49 | 50 | | |
50 | 51 | | |
51 | 52 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
119 | 119 | | |
120 | 120 | | |
121 | 121 | | |
122 | | - | |
| 122 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
115 | 115 | | |
116 | 116 | | |
117 | 117 | | |
118 | | - | |
| 118 | + | |
119 | 119 | | |
120 | 120 | | |
121 | 121 | | |
122 | 122 | | |
123 | 123 | | |
124 | 124 | | |
125 | | - | |
| 125 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
92 | 92 | | |
93 | 93 | | |
94 | 94 | | |
95 | | - | |
| 95 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
333 | 333 | | |
334 | 334 | | |
335 | 335 | | |
336 | | - | |
| 336 | + | |
337 | 337 | | |
338 | 338 | | |
339 | 339 | | |
340 | 340 | | |
341 | 341 | | |
342 | | - | |
| 342 | + | |
| 343 | + | |
343 | 344 | | |
344 | 345 | | |
345 | 346 | | |
| |||
359 | 360 | | |
360 | 361 | | |
361 | 362 | | |
362 | | - | |
| 363 | + | |
363 | 364 | | |
364 | 365 | | |
365 | 366 | | |
| |||
385 | 386 | | |
386 | 387 | | |
387 | 388 | | |
388 | | - | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
234 | 234 | | |
235 | 235 | | |
236 | 236 | | |
237 | | - | |
| 237 | + | |
238 | 238 | | |
239 | 239 | | |
240 | 240 | | |
241 | 241 | | |
242 | 242 | | |
243 | | - | |
| 243 | + | |
| 244 | + | |
244 | 245 | | |
245 | 246 | | |
246 | 247 | | |
| |||
254 | 255 | | |
255 | 256 | | |
256 | 257 | | |
257 | | - | |
| 258 | + | |
258 | 259 | | |
259 | 260 | | |
260 | 261 | | |
| |||
277 | 278 | | |
278 | 279 | | |
279 | 280 | | |
280 | | - | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
399 | 399 | | |
400 | 400 | | |
401 | 401 | | |
402 | | - | |
| 402 | + | |
403 | 403 | | |
404 | 404 | | |
405 | 405 | | |
406 | 406 | | |
407 | 407 | | |
408 | 408 | | |
409 | | - | |
| 409 | + | |
| 410 | + | |
410 | 411 | | |
411 | 412 | | |
412 | 413 | | |
| |||
456 | 457 | | |
457 | 458 | | |
458 | 459 | | |
459 | | - | |
| 460 | + | |
460 | 461 | | |
461 | 462 | | |
462 | | - | |
463 | | - | |
| 463 | + | |
| 464 | + | |
Lines changed: 6 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
23 | | - | |
| 23 | + | |
| 24 | + | |
24 | 25 | | |
25 | 26 | | |
26 | 27 | | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
30 | | - | |
| 31 | + | |
| 32 | + | |
31 | 33 | | |
32 | 34 | | |
33 | 35 | | |
34 | 36 | | |
35 | 37 | | |
36 | | - | |
| 38 | + | |
| 39 | + | |
37 | 40 | | |
38 | 41 | | |
39 | 42 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
30 | | - | |
| 30 | + | |
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
41 | | - | |
| 41 | + | |
42 | 42 | | |
43 | 43 | | |
44 | 44 | | |
| |||
66 | 66 | | |
67 | 67 | | |
68 | 68 | | |
69 | | - | |
| 69 | + | |
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | 73 | | |
74 | | - | |
| 74 | + | |
75 | 75 | | |
76 | 76 | | |
77 | 77 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
63 | 63 | | |
64 | 64 | | |
65 | 65 | | |
66 | | - | |
67 | | - | |
| 66 | + | |
68 | 67 | | |
69 | | - | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
74 | | - | |
| 68 | + | |
75 | 69 | | |
76 | 70 | | |
77 | 71 | | |
| |||
0 commit comments