[AutoTVM] Add batch_matmul to tunable operations #4242
Conversation
@soiferj @icemelon9, can you guys take a look at this PR?
@Hzfengsy can you also take a look?
@jwfromm Thanks for adding this. Could you add some latency measure numbers, e.g., on c5.9xlarge?
topi/python/topi/x86/batch_matmul.py
# create tuning space
cfg.define_split("tile_y", M, num_outputs=2)
cfg.define_split("tile_x", N, num_outputs=2)
cfg.define_split("tile_k", K, num_outputs=2)
if cfg.is_fallback:
    _default_batch_matmul_nopack_config(cfg, M, N, K)
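For readers unfamiliar with AutoTVM, each `cfg.define_split` call above adds a knob that enumerates ways to factor one loop extent into tiles. The following is a minimal pure-Python sketch of that idea (illustrative only, not TVM's actual implementation):

```python
def split_candidates(extent, num_outputs=2):
    """Enumerate all ways to factor `extent` into `num_outputs` tile sizes.

    Illustrative sketch: AutoTVM's cfg.define_split builds a similar space
    of (outer, inner, ...) factor tuples for a loop of the given extent,
    which the tuner then searches over.
    """
    if num_outputs == 1:
        return [(extent,)]
    candidates = []
    for factor in range(1, extent + 1):
        if extent % factor == 0:
            for rest in split_candidates(extent // factor, num_outputs - 1):
                candidates.append((factor,) + rest)
    return candidates

# For M = 12 with num_outputs=2, the space contains the divisor pairs
# (1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1).
print(split_candidates(12))
```

The tuner measures a subset of these candidates and records the best; `cfg.is_fallback` is true when no tuned record exists, in which case the default config is used instead.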
This part should belong to the schedule instead of the declaration. I suggest moving them to the schedule function like other ops.
This is actually extremely similar to the topi dense declaration in x86, as it's based directly on it. I would argue that the functional similarity between dense and batch_matmul encourages us to keep the syntax as close as possible to make transferring optimizations simple. If you feel strongly that configuration declarations should be in the schedule, I'd be happy to move both the batch_matmul and the dense declarations.
I personally don't think that's the best practice, because it would be tedious when we want to reuse the compute function on a different target (e.g., CUDA). It would also be confusing when someone tries to improve the schedule in the future, so I think it would be great to change both dense and batch_matmul. On the other hand, I would also be happy to hear others' opinions.
cc @icemelon9
I think it would be better to move both to be in the schedule.
+1 for moving this into the schedule.
The latest push moves the configuration declaration in both batch_matmul and dense to the scheduling function. However, note that in the dense declaration, some splits are actually used in the computation (tile_k in dense_no_pack, for example) and so cannot be moved. This means that the declarations aren't all located in the same place. Do you guys prefer it this way, or should we leave dense alone and only change batch_matmul?
@comaniac I agree that's the best way to proceed. I reverted the changes to dense in the latest commit.
@comaniac In certain cases, the config space must be defined in the compute, because the compute needs to use it to define an intermediate compute stage.
@icemelon9, autotuning seems to yield speedups of around 20% over the base configuration on a 32-core CPU. Again, this template is very simple and can almost certainly be improved; this PR is just a starting point to make that improvement easier.
Thanks @jwfromm
* Batch matmul tuning running but with errors.
* Default x86 schedule as good as before.
* Code Cleanup
* Remove unused argument.
* improved template documentation.
* Silly lint fix
* Removed leftover comment.
* Moved cfg declaration to schedule for batch_matmul
* Moved x86 dense cfg declaration to schedule.
* lint fix
* Removed duplicate cfg declaration in dense.
* Reverted changes to dense.
The rising popularity of BERT models suggests that batch matmul is going to become an increasingly important op for TVM to have first-class support for. This PR adds batch_matmul to the list of operations that AutoTVM can tune and templatizes the x86 schedule. Although the templatization and backends covered in this PR are very limited, it serves as a starting point to build on. The x86 fallback configuration is identical to the static schedule in the master branch.
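To illustrate what a fallback configuration does (choosing fixed, reasonable tile sizes when no tuning log is available), here is a hedged pure-Python sketch; the helper name and heuristic are illustrative and are not TVM's actual fallback code:

```python
def default_tile(extent, cap=16):
    """Pick the largest divisor of `extent` not exceeding `cap`.

    Illustrative heuristic only: fallback configs in TOPI-style schedules
    typically fix tile factors to sensible static values (here, the largest
    divisor up to `cap`) so untuned workloads still get a workable schedule.
    Returns (outer, inner) such that outer * inner == extent.
    """
    best = 1
    for factor in range(1, min(extent, cap) + 1):
        if extent % factor == 0:
            best = factor
    return extent // best, best

# e.g. for a dimension of 1024 with cap=16, this yields outer=64, inner=16.
print(default_tile(1024))
```

A tuned configuration replaces such static choices with the best factors found by measurement, which is where the reported ~20% speedup over the fallback comes from.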