@@ -12,91 +12,91 @@ Legend:
12 12   - 🟡 Partially supported by this backend
13 13   - ❌ Not supported by this backend
14 14
15- |  Operation |  BLAS |  CPU |  CUDA |  Metal |  SYCL |  Vulkan | 
16- | -----------| ------| ------| ------| ------| ------| ------| 
17- |                               ABS |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
18- |                               ACC |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
19- |                               ADD |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
20- |                              ADD1 |  ❌ |  ✅ |  ✅ |  ❌ |  ✅ |  ❌ | 
21- |                            ARANGE |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ | 
22- |                            ARGMAX |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
23- |                           ARGSORT |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
24- |                             CLAMP |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
25- |                            CONCAT |  ❌ |  ✅ |  🟡 |  ✅ |  🟡 |  ✅ | 
26- |                              CONT |  ❌ |  ✅ |  ✅ |  ✅ |  🟡 |  🟡 | 
27- |                           CONV_2D |  ❌ |  ✅ |  ❌ |  ❌ |  ❌ |  ✅ | 
28- |                        CONV_2D_DW |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
29- |                 CONV_TRANSPOSE_1D |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
30- |                 CONV_TRANSPOSE_2D |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ | 
31- |                               COS |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
32- |                       COUNT_EQUAL |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
33- |                               CPY |  ❌ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 | 
34- |                CROSS_ENTROPY_LOSS |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ | 
35- |           CROSS_ENTROPY_LOSS_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ | 
36- |                     DIAG_MASK_INF |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
37- |                               DIV |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
38- |                               DUP |  ❌ |  ✅ |  🟡 |  🟡 |  ✅ |  🟡 | 
39- |                               ELU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
40- |                               EXP |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
41- |                    FLASH_ATTN_EXT |  ❌ |  ✅ |  🟡 |  🟡 |  ❌ |  🟡 | 
42- |                 GATED_LINEAR_ATTN |  ❌ |  ✅ |  ✅ |  ❌ |  ✅ |  ❌ | 
43- |                             GEGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
44- |                         GEGLU_ERF |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
45- |                       GEGLU_QUICK |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
46- |                              GELU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
47- |                          GELU_ERF |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
48- |                        GELU_QUICK |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
49- |                          GET_ROWS |  ❌ |  ✅ |  🟡 |  ✅ |  🟡 |  🟡 | 
50- |                     GET_ROWS_BACK |  ❌ |  🟡 |  🟡 |  ❌ |  ❌ |  ❌ | 
51- |                        GROUP_NORM |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
52- |                       HARDSIGMOID |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
53- |                         HARDSWISH |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
54- |                            IM2COL |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
55- |                           L2_NORM |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
56- |                        LEAKY_RELU |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
57- |                               LOG |  ❌ |  ✅ |  ✅ |  ❌ |  ✅ |  ❌ | 
58- |                              MEAN |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ | 
59- |                               MUL |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
60- |                           MUL_MAT |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 | 
61- |                        MUL_MAT_ID |  ❌ |  ✅ |  ✅ |  ✅ |  🟡 |  ✅ | 
62- |                               NEG |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
63- |                              NORM |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
64- |                    OPT_STEP_ADAMW |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
65- |                          OUT_PROD |  🟡 |  🟡 |  🟡 |  ❌ |  🟡 |  ❌ | 
66- |                               PAD |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
67- |                    PAD_REFLECT_1D |  ❌ |  ✅ |  ❌ |  ✅ |  ❌ |  ❌ | 
68- |                           POOL_2D |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
69- |                             REGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
70- |                              RELU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
71- |                            REPEAT |  ❌ |  ✅ |  🟡 |  ✅ |  ✅ |  🟡 | 
72- |                       REPEAT_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
73- |                          RMS_NORM |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
74- |                     RMS_NORM_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
75- |                  RMS_NORM_MUL_ADD |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
76- |                              ROLL |  ❌ |  ✅ |  ❌ |  ❌ |  ❌ |  ✅ | 
77- |                              ROPE |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
78- |                         ROPE_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
79- |                         RWKV_WKV6 |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
80- |                         RWKV_WKV7 |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
81- |                             SCALE |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
82- |                               SET |  ❌ |  ✅ |  ❌ |  ✅ |  ❌ |  ❌ | 
83- |                          SET_ROWS |  ❌ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 | 
84- |                               SGN |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
85- |                           SIGMOID |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
86- |                              SILU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
87- |                         SILU_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ✅ | 
88- |                               SIN |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
89- |                          SOFT_MAX |  ❌ |  ✅ |  ✅ |  ✅ |  🟡 |  ✅ | 
90- |                     SOFT_MAX_BACK |  ❌ |  🟡 |  🟡 |  ❌ |  ❌ |  ✅ | 
91- |                               SQR |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
92- |                              SQRT |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ❌ | 
93- |                          SSM_CONV |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ | 
94- |                          SSM_SCAN |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ | 
95- |                              STEP |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  ❌ | 
96- |                               SUB |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ | 
97- |                               SUM |  ❌ |  ✅ |  ✅ |  ❌ |  ✅ |  ✅ | 
98- |                          SUM_ROWS |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
99- |                            SWIGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  🟡 | 
100- |                              TANH |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 | 
101- |                TIMESTEP_EMBEDDING |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ | 
102- |                           UPSCALE |  ❌ |  ✅ |  ✅ |  🟡 |  🟡 |  ✅ | 
15+ |  Operation |  BLAS |  CPU |  CUDA |  Metal |  OpenCL  |   SYCL |  Vulkan | 
16+ | -----------| ------| ------| ------| ------| ------| ------| ------ | 
17+ |                               ABS |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
18+ |                               ACC |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
19+ |                               ADD |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  ✅ | 
20+ |                              ADD1 |  ❌ |  ✅ |  ✅ |  ❌ |  ❌  |   ✅ |  ❌ | 
21+ |                            ARANGE |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  | 
22+ |                            ARGMAX |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
23+ |                           ARGSORT |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
24+ |                             CLAMP |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  🟡 | 
25+ |                            CONCAT |  ❌ |  ✅ |  🟡 |  ✅ |  🟡 |  🟡  |   ✅ | 
26+ |                              CONT |  ❌ |  ✅ |  ✅ |  ✅ |  🟡 |  🟡 |  🟡  | 
27+ |                           CONV_2D |  ❌ |  ✅ |  ❌ |  ❌ |  ✅  |   ❌ |  ✅ | 
28+ |                        CONV_2D_DW |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
29+ |                 CONV_TRANSPOSE_1D |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
30+ |                 CONV_TRANSPOSE_2D |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ |  ❌  | 
31+ |                               COS |  ❌ |  ✅ |  ✅ |  🟡 |  ❌  |   ✅ |  🟡 | 
32+ |                       COUNT_EQUAL |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
33+ |                               CPY |  ❌ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
34+ |                CROSS_ENTROPY_LOSS |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ |  ❌  | 
35+ |           CROSS_ENTROPY_LOSS_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌ |  ❌  | 
36+ |                     DIAG_MASK_INF |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  ✅ | 
37+ |                               DIV |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  ✅ | 
38+ |                               DUP |  ❌ |  ✅ |  🟡 |  🟡 |  🟡  |   ✅ |  🟡 | 
39+ |                               ELU |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
40+ |                               EXP |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
41+ |                    FLASH_ATTN_EXT |  ❌ |  ✅ |  🟡 |  🟡 |  ❌ |  ❌  |   🟡 | 
42+ |                 GATED_LINEAR_ATTN |  ❌ |  ✅ |  ✅ |  ❌ |  ❌  |   ✅ |  ❌ | 
43+ |                             GEGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
44+ |                         GEGLU_ERF |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
45+ |                       GEGLU_QUICK |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
46+ |                              GELU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
47+ |                          GELU_ERF |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
48+ |                        GELU_QUICK |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
49+ |                          GET_ROWS |  ❌ |  ✅ |  🟡 |  ✅ |  🟡 |  🟡 |  🟡  | 
50+ |                     GET_ROWS_BACK |  ❌ |  🟡 |  🟡 |  ❌ |  ❌ |  ❌ |  ❌  | 
51+ |                        GROUP_NORM |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
52+ |                       HARDSIGMOID |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
53+ |                         HARDSWISH |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
54+ |                            IM2COL |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ |  ✅  | 
55+ |                           L2_NORM |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
56+ |                        LEAKY_RELU |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
57+ |                               LOG |  ❌ |  ✅ |  ✅ |  ❌ |  ❌  |   ✅ |  ❌ | 
58+ |                              MEAN |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  | 
59+ |                               MUL |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  ✅ | 
60+ |                           MUL_MAT |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
61+ |                        MUL_MAT_ID |  ❌ |  ✅ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ | 
62+ |                               NEG |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
63+ |                              NORM |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
64+ |                    OPT_STEP_ADAMW |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
65+ |                          OUT_PROD |  🟡 |  🟡 |  🟡 |  ❌ |  ❌  |   🟡 |  ❌ | 
66+ |                               PAD |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
67+ |                    PAD_REFLECT_1D |  ❌ |  ✅ |  ❌ |  ✅ |  ❌ |  ❌ |  ❌  | 
68+ |                           POOL_2D |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
69+ |                             REGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
70+ |                              RELU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
71+ |                            REPEAT |  ❌ |  ✅ |  🟡 |  ✅ |  🟡  |   ✅ |  🟡 | 
72+ |                       REPEAT_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
73+ |                          RMS_NORM |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅ |  ✅  | 
74+ |                     RMS_NORM_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
75+ |                  RMS_NORM_MUL_ADD |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
76+ |                              ROLL |  ❌ |  ✅ |  ❌ |  ❌ |  ❌ |  ❌  |   ✅ | 
77+ |                              ROPE |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
78+ |                         ROPE_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
79+ |                         RWKV_WKV6 |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
80+ |                         RWKV_WKV7 |  ❌ |  ✅ |  ✅ |  ✅ |  ❌  |   ✅ |  ✅ | 
81+ |                             SCALE |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
82+ |                               SET |  ❌ |  ✅ |  ❌ |  ✅ |  ❌ |  ❌ |  ❌  | 
83+ |                          SET_ROWS |  ❌ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
84+ |                               SGN |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
85+ |                           SIGMOID |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
86+ |                              SILU |  ❌ |  ✅ |  🟡 |  🟡 |  🟡 |  🟡 |  🟡  | 
87+ |                         SILU_BACK |  ❌ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  |   ✅ | 
88+ |                               SIN |  ❌ |  ✅ |  ✅ |  🟡 |  ❌  |   ✅ |  🟡 | 
89+ |                          SOFT_MAX |  ❌ |  ✅ |  ✅ |  ✅ |  ✅  |   🟡 |  ✅ | 
90+ |                     SOFT_MAX_BACK |  ❌ |  🟡 |  🟡 |  ❌ |  ❌ |  ❌  |   ✅ | 
91+ |                               SQR |  ❌ |  ✅ |  ✅ |  🟡 |  ❌  |   ✅ |  🟡 | 
92+ |                              SQRT |  ❌ |  ✅ |  ✅ |  🟡 |  ❌  |   ✅ |  ❌ | 
93+ |                          SSM_CONV |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  | 
94+ |                          SSM_SCAN |  ❌ |  ✅ |  ✅ |  ✅ |  ❌ |  ❌ |  ❌  | 
95+ |                              STEP |  ❌ |  ✅ |  🟡 |  🟡 |  ❌  |   🟡 |  ❌ | 
96+ |                               SUB |  ❌ |  ✅ |  ✅ |  🟡 |  🟡  |   ✅ |  ✅ | 
97+ |                               SUM |  ❌ |  ✅ |  ✅ |  ❌ |  ❌  |   ✅ |  ✅ | 
98+ |                          SUM_ROWS |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
99+ |                            SWIGLU |  ❌ |  ✅ |  ✅ |  🟡 |  ✅ |  ✅  |   🟡 | 
100+ |                              TANH |  ❌ |  ✅ |  🟡 |  🟡 |  ✅  |   🟡 |  🟡 | 
101+ |                TIMESTEP_EMBEDDING |  ❌ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅ |  ✅  | 
102+ |                           UPSCALE |  ❌ |  ✅ |  ✅ |  🟡 |  ✅  |   🟡 |  ✅ | 