Commit a148d62
[Feature] Enhance vectorized conversion support in CUDA codegen (#1095)
* [Feature] Add vectorized float16 and float32 conversion support in CUDA codegen
* Implemented handling for conversions between float16 and float32 types, specifically for vectorized operations using __half22float2 and __float22half2_rn.
* Enhanced the existing code to support both directions of conversion based on the lane count.
* Improved overall type handling in the VisitExpr_ method for better compatibility with TileLang.
* [Feature] Add float32 to float8 conversion support in CUDA codegen
* Implemented handling for conversion from float32 to float8 (E4M3/E5M2) in the VisitExpr_ method.
* Added vectorized conversion support using __nv_cvt_float2_to_fp8x2 for float2 to fp8x2 transformations.
* Enhanced type handling for better compatibility with TileLang, particularly for float8 types.
* lint
* fix a bug
* [Enhancement] Support lanes=4 cases and add unit test for vectorized cast
* lint
* [Feature] Refactor bf16 conversion operations and remove legacy compile flags
* lint
1 parent 86c8bb4 commit a148d62
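The vectorized conversions named in the commit message can be illustrated with a minimal sketch. It is not the code from this commit: `Elem` and `EmitVectorizedCast` are hypothetical names, the pointer-cast form of the emitted text is an assumption, and so is the choice of `__NV_SATFINITE`. Only the intrinsics `__half22float2`, `__float22half2_rn`, and `__nv_cvt_float2_to_fp8x2` come from the message. The sketch shows how a codegen might pick an intrinsic from the source and target types and emit one call per pair of lanes, which covers both the lanes=2 and lanes=4 cases.

```cpp
#include <sstream>
#include <string>

// Hypothetical element-type tag for this sketch (not TileLang's DataType).
enum class Elem { F16, F32, F8E4M3, F8E5M2 };

// Emit CUDA source text that converts a `lanes`-wide vector `src` into `dst`,
// two lanes per intrinsic call. Returns "" when no paired intrinsic applies,
// so a caller could fall back to element-wise casts.
std::string EmitVectorizedCast(Elem from, Elem to, int lanes,
                               const std::string& src,
                               const std::string& dst) {
  if (lanes != 2 && lanes != 4) return "";

  std::string fn, src_vec, dst_vec, extra_args;
  if (from == Elem::F16 && to == Elem::F32) {
    fn = "__half22float2";
    src_vec = "half2";
    dst_vec = "float2";
  } else if (from == Elem::F32 && to == Elem::F16) {
    fn = "__float22half2_rn";
    src_vec = "float2";
    dst_vec = "half2";
  } else if (from == Elem::F32 &&
             (to == Elem::F8E4M3 || to == Elem::F8E5M2)) {
    fn = "__nv_cvt_float2_to_fp8x2";
    src_vec = "float2";
    dst_vec = "__nv_fp8x2_storage_t";
    // Assumption of this sketch: saturate to the largest finite value.
    extra_args = std::string(", __NV_SATFINITE, ") +
                 (to == Elem::F8E4M3 ? "__NV_E4M3" : "__NV_E5M2");
  } else {
    return "";
  }

  // One statement per pair of lanes: lanes=2 emits one call, lanes=4 two.
  std::ostringstream os;
  for (int i = 0; i < lanes / 2; ++i) {
    os << "((" << dst_vec << "*)(&" << dst << "))[" << i << "] = " << fn
       << "(((" << src_vec << "*)(&" << src << "))[" << i << "]"
       << extra_args << ");\n";
  }
  return os.str();
}
```

Under these assumptions, `EmitVectorizedCast(Elem::F16, Elem::F32, 4, "v_in", "v_out")` yields two statements, one per `half2`/`float2` pair, while an unsupported combination or lane count yields an empty string.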
File tree
7 files changed, +221 -98 lines changed
- examples/attention_sink
- src/target
- testing/python/language