[Dlight] Enhance Decode-GEMV Schedule #15195

junrushao · 2023-07-02T06:48:24Z

This PR enhances the Decode-GEMV rule with the following changes:

- Normalize the GEMV iter domain to S-R-C via transform-block-layout. This helps with further analysis and scheduling, for example in cases where there was no spatial loop in the original reduction block.
- Get rid of the ad hoc iter type analysis, including the logic calling into a TVM packed func `tir.schedule.GetLoopIterType` using `tvm._ffi.get_global_func`.
- Split out the logic for two separate cases of scheduling, where the innermost dimension is spatial or reduction.
- Introduce `suggest_threads_per_block` to guess the number of threads to allocate to each threadblock. This helps avoid the previous case where dlight allocates 256 threads for a workload whose degree of parallelism is only 128.
- Misc improvements.
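
To illustrate the thread-allocation point above, here is a minimal sketch of the idea and not the code in this PR. The function name `suggest_threads_sketch`, its parameters, and the encoding of a dynamic extent as `None` are all assumptions made for illustration. It also assumes `max_threads` is a power of two.

```python
from typing import List, Optional


def suggest_threads_sketch(
    extents: List[Optional[int]],  # hypothetical encoding: None means a dynamic extent
    max_threads: int = 256,        # assumed per-block thread budget (power of two)
) -> List[int]:
    """Pick a thread count per loop without exceeding its static extent."""
    threads = [1] * len(extents)
    budget = max_threads
    # Static loops first: largest power of two that fits both the extent and the budget.
    for i, extent in enumerate(extents):
        if extent is None:
            continue
        t = 1
        while t * 2 <= min(extent, budget):
            t *= 2
        threads[i] = t
        budget //= t
    # Whatever budget is left goes to the first dynamic loop, if any.
    for i, extent in enumerate(extents):
        if extent is None and budget > 1:
            threads[i] = budget
            budget = 1
    return threads
```

With this sketch, a single loop of extent 128 gets 128 threads instead of the full budget of 256, which is the over-allocation case described above.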

The rest of the changes are split out into separate PRs that are already merged to main:

- [x] Pass hints to the arithmetic analyzer that shape variables should be positive ([TIR][Schedule] Derive Nonnegative Bounds from Shape Var #15210)
- [x] Eliminate unnecessary block predicate generation, which should be provable via affine analysis ([ARITH] Allow Analyzer to MarkGlobalNonNegValue #15193)
- [x] Shrink local memory allocation if only one element `X[threadIdx.x]` is used ([TIR][Transform] Add LiftThreadBinding Pass #15207)

tvm-bot · 2023-07-02T06:48:27Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

No users to tag found in teams: `dlight` (see #10317 for details)

Generated by tvm-bot

tqchen · 2023-07-05T16:58:40Z

python/tvm/dlight/gpu/utils.py

+    dynamic: List[int] = []
+    for i, loop in enumerate(loops):
+        loop_extent = loop.extent
+        if isinstance(loop_extent, tir.IntImm):


We should be able to factor the loop extent into a constant component and a dynamic component; this would handle extents like `32 * n`.
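
A minimal sketch of that suggestion, under simplifying assumptions: the extent is given as a flat list of multiplicative factors, where an `int` is a constant and a `str` names a symbolic shape variable. In TVM the extent is a `PrimExpr`, so this encoding and the helper name `split_extent` are hypothetical and only illustrate the factoring.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: int = constant factor, str = symbolic shape variable.
Factor = Union[int, str]


def split_extent(factors: List[Factor]) -> Tuple[int, List[str]]:
    """Split a product of factors into its constant part and its dynamic part."""
    constant = 1
    dynamic: List[str] = []
    for factor in factors:
        if isinstance(factor, int):
            constant *= factor      # fold all constant factors together
        else:
            dynamic.append(factor)  # keep symbolic factors as-is
    return constant, dynamic


# An extent like 32 * n has constant part 32 and dynamic part [n].
print(split_extent([32, "n"]))
```

A purely static extent comes back with an empty dynamic list, so a caller could treat both cases uniformly.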

junrushao force-pushed the feature/2023-07-01/gemv-compute-at branch 6 times, most recently from 0596441 to 5d715aa (July 3, 2023 06:08)

junrushao marked this pull request as ready for review July 3, 2023 06:08

junrushao force-pushed the feature/2023-07-01/gemv-compute-at branch 5 times, most recently from dc62d15 to 0d8ff66 (July 4, 2023 23:51)

junrushao force-pushed the feature/2023-07-01/gemv-compute-at branch from 0d8ff66 to b25bd0b (July 5, 2023 00:51)

MasterJH5574 approved these changes Jul 5, 2023

MasterJH5574 merged commit 04f22a9 into apache:unity Jul 5, 2023

tqchen reviewed Jul 5, 2023

4 participants
