[AutoTVM] Enhance tuning space of split #3949

Merged: 3 commits, Sep 15, 2019

comaniac
Contributor

This PR enhances the tuning space of the split scheduling primitive. Originally, it had only two policies: "all", which uses all divisible factors of the dimension length, and "candidate", which uses manually specified entities. The issue is that for irregular shapes, such as lengths that are prime numbers or products of prime numbers (e.g., 229x229 in Inception V3), the "all" policy offers fewer opportunities to achieve better performance.

In this change, split now has four policies:

  • "factors" for the original "all".
  • "power2" for all power-of-two numbers less than or equal to the dimension length.
  • "verbose" for the union of "factors" and "power2".
  • "candidate" for the original "candidate".
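
As a rough illustration of how the first three policies differ, the sketch below builds the candidate factor list for a single axis. It is a standalone approximation, not the code in this PR: the helper names `get_factors`, `get_pow2s`, and `split_candidates` are assumptions made for the example, and the length 229 is taken from the Inception V3 example above.

```python
import math


def get_factors(n):
    """Return all positive divisors of n in ascending order."""
    small = [i for i in range(1, math.isqrt(n) + 1) if n % i == 0]
    large = [n // i for i in reversed(small) if i * i != n]
    return small + large


def get_pow2s(n):
    """Return all powers of two that are <= n, in ascending order."""
    pows, p = [], 1
    while p <= n:
        pows.append(p)
        p *= 2
    return pows


def split_candidates(length, policy):
    """Candidate split factors for one axis (hypothetical helper)."""
    if policy == "factors":
        return get_factors(length)
    if policy == "power2":
        return get_pow2s(length)
    if policy == "verbose":
        # Union of the two lists above, sorted for readability.
        return sorted(set(get_factors(length)) | set(get_pow2s(length)))
    raise ValueError("unknown policy: %s" % policy)


# 229 is prime, so "factors" offers only the trivial choices.
for policy in ("factors", "power2", "verbose"):
    print(policy, split_candidates(229, policy))
```

For a prime length such as 229, "factors" yields only 1 and 229, while "power2" yields eight candidates, which is the gap this PR is meant to close.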

Note that "factors" remains the default policy, so merging this PR will not change the performance of TOPI ops at all, but users have the option to search more values if they want, and this could benefit the follow-up dynamic shape implementation.

I use Inception V3 to illustrate the benefit of the "power2" policy. Here are the results for 44 tasks of Inception V3 tuned with "factors" (default) and "power2" on an AWS EC2 c5.9xlarge instance. For each task, n_trial is set to 100 and the minimum repeat ms is set to 1000. In summary, 22/44 tasks achieve better performance with the "power2" policy, and the average speedup over "factors" is 1.03. Note that the speedup is more pronounced in later layers, which have more irregular shapes.

Orig (GFlop/s) Power2 (GFlop/s) Speedup
563.39 559.96 0.99
822.92 820.5 1.00
787.27 861.51 1.09
629.48 530.76 0.84
839.51 940.14 1.12
1125.07 1189.68 1.06
938.41 924.16 0.98
708.54 795.64 1.12
915.48 867.44 0.95
691.18 631.62 0.91
1401.51 1241.6 0.89
1243.3 1126.36 0.91
1135.83 916.01 0.81
1415.87 1187.75 0.84
738.55 685.24 0.93
1214.39 1104.74 0.91
1166.35 1017.5 0.87
1001.36 916.12 0.91
1334.92 1102.88 0.83
647.49 638.22 0.99
1086.77 955.19 0.88
1163.08 1033.46 0.89
715.03 752.06 1.05
527.43 422.63 0.80
476.56 440.67 0.92
417.25 411.61 0.99
1465.72 1328.09 0.91
1665.03 1527.72 0.92
467.34 428.14 0.92
1898.54 1604.38 0.85
538.29 548.46 1.02
1333.45 1553.18 1.16
1735.57 1855.26 1.07
395.53 510.74 1.29
883.57 2326.81 2.63
586.29 612.74 1.05
596.6 626.83 1.05
595.78 632.98 1.06
1215.91 1174.21 0.97
728.3 789 1.08
721.63 835.6 1.16
571.13 658.04 1.15
465.5 756.41 1.62
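
For reference, the Speedup column is the "power2" throughput divided by the "factors" throughput, rounded to two decimals. A minimal check on three rows copied from the table above (the variable and function names are only illustrative):

```python
# (Orig GFlop/s, Power2 GFlop/s) pairs copied from three rows of the table.
rows = [
    (563.39, 559.96),
    (395.53, 510.74),
    (883.57, 2326.81),
]


def speedup(orig, power2):
    """Throughput ratio of "power2" over "factors", rounded like the table."""
    return round(power2 / orig, 2)


speedups = [speedup(orig, power2) for orig, power2 in rows]
print(speedups)
```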

- Rename policy "all" to "factors"
- Add policy "verbose" and "power2"
@comaniac
Contributor Author

@vinx13 @kevinthesun @icemelon9 please help review this PR.

Contributor

@kevinthesun kevinthesun left a comment

If we use the "verbose" policy, should we get the most performance improvement for Inception V3?

@comaniac
Contributor Author

If we use the "verbose" policy, should we get the most performance improvement for Inception V3?

Maybe. I did see that the schedules AutoTVM found for some shapes (not in Inception V3) use a combination of factors and powers of two. However, the tuning space grows rapidly, so it requires a longer search time or a smarter search algorithm.
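
To give a feel for how quickly the space grows, the sketch below counts the split choices for one axis under each policy. It is a simplified model, not AutoTVM's actual enumeration: the function names are hypothetical, and it assumes that "factors" only accepts inner factors whose product divides the length, while "power2" and "verbose" accept any product that does not exceed the length (so they may leave a tail).

```python
import itertools
import math


def _candidates(length, policy):
    """Candidate factors for one axis (hypothetical helper)."""
    factors = {i for i in range(1, length + 1) if length % i == 0}
    pow2s = {1 << k for k in range(length.bit_length()) if (1 << k) <= length}
    if policy == "factors":
        return sorted(factors)
    if policy == "power2":
        return sorted(pow2s)
    if policy == "verbose":
        return sorted(factors | pow2s)
    raise ValueError("unknown policy: %s" % policy)


def count_splits(length, num_outputs, policy):
    """Count ways to pick the num_outputs - 1 inner factors of a split."""
    no_tail = policy == "factors"  # assumption: only "factors" forbids tails
    count = 0
    for inner in itertools.product(_candidates(length, policy),
                                   repeat=num_outputs - 1):
        prod = math.prod(inner)
        if prod > length:
            continue  # inner factors cannot exceed the axis length
        if no_tail and length % prod != 0:
            continue  # product must divide the length exactly
        count += 1
    return count


for policy in ("factors", "power2", "verbose"):
    print(policy, count_splits(6, 3, policy))
```

Under these assumptions, "verbose" is never smaller than either of the other two policies, and the gap widens as the number of split outputs increases.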

@vinx13 vinx13 merged commit da03979 into apache:master Sep 15, 2019
@vinx13
Member

vinx13 commented Sep 15, 2019

Thanks @comaniac @kevinthesun. This is merged.

@comaniac comaniac deleted the enhance_autotvm_space branch September 15, 2019 03:45
wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
* Refine policies for define_split

- Rename policy "all" to "factors"
- Add policy "verbose" and "power2"

* Refine search space

* add doc
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 16, 2019
@zhenhuaw-me
Contributor

Hi @comaniac, this is a really interesting change. I have not read all the code, but may I ask whether the PR changes the split representation so that the highest split is left as -1, for example ["tile_co", "sp", [-1, 8]]? I have some schedules that extract split factors like CO, CI = cfg['tile_co'].size, which seem to be broken by this PR.

PS. I didn't verify all the patches, but the change leading to my concern is among those listed below.

9e4f07b4695a8849590cdd46de662e3fa273d59b Enable miopen transpose convolution and fp16 support (#3952)
0482623e9c20518fac8e6ceb34a6552f674fed9b [Relay][TensorFlow] Add support for SquaredDifference (#3930)
da039794cfc7c7d94855b0dc6f93248f13222c7f [AutoTVM] Enhance tuning space of split (#3949)
e35e1cc2e8d6b14aae5c888b12205b85afc80bb2 trivial (#3954)
4b431c67fdc55062afa682fe3d4816bd02b9f7ec 1) Add EQ op to the deduce_bound and add unittests for the same (#3775)
2536465c2e274273344c5e67337fd3b4c252ec2f Vulkan2 Runtime API (#3849)
06aecc60ea55e10cc029ec4c2f3ff2fcc811c802 [VTA] RPC path update. (#3924)

@comaniac
Contributor Author

Yes, this PR also changes the leftmost split factor to -1, but the behavior is exactly the same as before, because AutoTVM automatically replaces -1 with L/prod(others) when applying the split factors. The reason I made this change is that we are trying to apply a schedule config from one shape to another. For example, assume we have a config ['tile_x', 'sp', [4, 8]] from a shape with length 32, and we wish to apply it to another shape with length 16. In this case, ['tile_x', 'sp', [4, 8]] is inapplicable, but ['tile_x', 'sp', [-1, 8]] is fine, as it implies ['tile_x', 'sp', [2, 8]].

Sorry if this change breaks your schedule. Since CI passed before this PR was merged, I suppose this change should not affect any schedule in TOPI.
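
For schedules that read the recorded sizes directly, the -1 can be resolved by hand. The sketch below is one possible way to do that, not a TVM API: `resolve_split` is a hypothetical helper, and rounding up when the length is not divisible by the other factors is an assumption made here to cover splits that leave a tail.

```python
import math


def resolve_split(length, sizes):
    """Replace a single -1 in `sizes` with length / prod(others).

    Hypothetical helper. Rounds up when the other factors do not divide
    the length evenly (assumption, to cover splits that leave a tail).
    """
    if sizes.count(-1) > 1:
        raise ValueError("at most one factor may be -1")
    known = math.prod(s for s in sizes if s != -1)
    return [-(-length // known) if s == -1 else s for s in sizes]


# The example from the comment above: one recorded config, two axis lengths.
print(resolve_split(32, [-1, 8]))
print(resolve_split(16, [-1, 8]))
```

A schedule that previously unpacked cfg['tile_co'].size could pass that list through a helper like this together with the axis length, under the same assumptions.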

@zhenhuaw-me
Contributor

The reason I made this change is that we are trying to apply a schedule config from one shape to another.

Interesting feature; it came to mind when I saw the recorded split factors. I think the side effects are worth it... Thank you.
