Commit fb416ef
committed
[Codegen][Metal] Support metal warp-level primitive
This PR introduces the warp-level shuffle primitives used in Metal
Shading Language, and uses them in the implementation of allreduce
lowering.
The introduced primitives are:
* `simd_shuffle`,
* `simd_shuffle_up`,
* `simd_shuffle_down`.
See section 6.9.2 of https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf
for details.
The correctness are validated by `test_allreduce_cuda` with the backend
changed to Metal. Given we do not have Metal CI tests, the correctness
is checked only locally.
Given the Metal shuffle primitives do not support (or need) masking,
the pass LowerThreadAllreduce is updated to support such backend
which does not have masks. One unit test for metal is added to ensure
that no mask is used.1 parent 9ff74fb commit fb416ef
File tree
3 files changed
+180
-11
lines changed- src
- target/source
- tir/transforms
- tests/python/unittest
3 files changed
+180
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
33 | 55 | | |
34 | 56 | | |
35 | 57 | | |
| |||
95 | 117 | | |
96 | 118 | | |
97 | 119 | | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
98 | 151 | | |
99 | 152 | | |
100 | 153 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
476 | 476 | | |
477 | 477 | | |
478 | 478 | | |
479 | | - | |
480 | | - | |
481 | | - | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
482 | 483 | | |
483 | 484 | | |
484 | | - | |
| 485 | + | |
485 | 486 | | |
486 | 487 | | |
487 | 488 | | |
| |||
698 | 699 | | |
699 | 700 | | |
700 | 701 | | |
701 | | - | |
| 702 | + | |
| 703 | + | |
702 | 704 | | |
703 | | - | |
| 705 | + | |
| 706 | + | |
| 707 | + | |
| 708 | + | |
| 709 | + | |
| 710 | + | |
704 | 711 | | |
705 | 712 | | |
706 | 713 | | |
| |||
709 | 716 | | |
710 | 717 | | |
711 | 718 | | |
712 | | - | |
| 719 | + | |
713 | 720 | | |
714 | | - | |
715 | | - | |
716 | | - | |
| 721 | + | |
| 722 | + | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
| 726 | + | |
| 727 | + | |
717 | 728 | | |
718 | 729 | | |
719 | 730 | | |
| |||
745 | 756 | | |
746 | 757 | | |
747 | 758 | | |
748 | | - | |
| 759 | + | |
749 | 760 | | |
750 | 761 | | |
751 | 762 | | |
| |||
769 | 780 | | |
770 | 781 | | |
771 | 782 | | |
| 783 | + | |
| 784 | + | |
772 | 785 | | |
773 | 786 | | |
774 | 787 | | |
| |||
Lines changed: 103 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
702 | 702 | | |
703 | 703 | | |
704 | 704 | | |
| 705 | + | |
| 706 | + | |
| 707 | + | |
| 708 | + | |
| 709 | + | |
| 710 | + | |
| 711 | + | |
| 712 | + | |
| 713 | + | |
| 714 | + | |
| 715 | + | |
| 716 | + | |
| 717 | + | |
| 718 | + | |
| 719 | + | |
| 720 | + | |
| 721 | + | |
| 722 | + | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
| 726 | + | |
| 727 | + | |
| 728 | + | |
| 729 | + | |
| 730 | + | |
| 731 | + | |
| 732 | + | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
| 740 | + | |
| 741 | + | |
| 742 | + | |
| 743 | + | |
| 744 | + | |
| 745 | + | |
| 746 | + | |
| 747 | + | |
| 748 | + | |
| 749 | + | |
| 750 | + | |
| 751 | + | |
| 752 | + | |
| 753 | + | |
| 754 | + | |
| 755 | + | |
| 756 | + | |
| 757 | + | |
| 758 | + | |
| 759 | + | |
| 760 | + | |
| 761 | + | |
| 762 | + | |
| 763 | + | |
| 764 | + | |
| 765 | + | |
| 766 | + | |
| 767 | + | |
| 768 | + | |
| 769 | + | |
| 770 | + | |
| 771 | + | |
| 772 | + | |
| 773 | + | |
| 774 | + | |
| 775 | + | |
| 776 | + | |
| 777 | + | |
| 778 | + | |
| 779 | + | |
| 780 | + | |
| 781 | + | |
| 782 | + | |
| 783 | + | |
| 784 | + | |
| 785 | + | |
| 786 | + | |
| 787 | + | |
| 788 | + | |
| 789 | + | |
| 790 | + | |
| 791 | + | |
| 792 | + | |
| 793 | + | |
| 794 | + | |
| 795 | + | |
| 796 | + | |
| 797 | + | |
| 798 | + | |
| 799 | + | |
| 800 | + | |
| 801 | + | |
| 802 | + | |
| 803 | + | |
| 804 | + | |
| 805 | + | |
| 806 | + | |
| 807 | + | |
705 | 808 | | |
706 | 809 | | |
0 commit comments