Commit a37f9cc
[Kernel] [Quantization] Add MXFP4 and bias support for marlin kernel (vllm-project#22428)
Signed-off-by: rongfu.leng <[email protected]>
Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: Huzaifa Sidhpurwala <[email protected]>
Signed-off-by: Varun Sundar Rabindranath <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Animesh Jain <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Xiongfei Wei <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: kf <[email protected]>
Signed-off-by: vllmellm <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Dipika Sikka <[email protected]>
Signed-off-by: Sage Moore <[email protected]>
Signed-off-by: tjtanaavllm <[email protected]>
Signed-off-by: Yong Hoon Shin <[email protected]>
Signed-off-by: Chih-Chieh-Yang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Vadim Gimpelson <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: zRzRzRzRzRzRzR <[email protected]>
Signed-off-by: Chih-Chieh Yang <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: yan <[email protected]>
Signed-off-by: Yan Ma <[email protected]>
Signed-off-by: Xiao Liu <[email protected]>
Signed-off-by: jiahanc <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: Ye (Charlotte) Qi <[email protected]>
Signed-off-by: LopezCastroRoberto <[email protected]>
Signed-off-by: Andy Xie <[email protected]>
Signed-off-by: Haibin Lin <[email protected]>
Signed-off-by: David Ben-David <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: jiang1.li <[email protected]>
Signed-off-by: Seiji Eicher <[email protected]>
Signed-off-by: zitian.zhao <[email protected]>
Signed-off-by: 22quinn <[email protected]>
Signed-off-by: Abirdcfly <[email protected]>
Signed-off-by: Giancarlo Delfin <[email protected]>
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: huangweixiao <[email protected]>
Signed-off-by: alyosha-swamy <[email protected]>
Signed-off-by: Eric Hanley <[email protected]>
Signed-off-by: Abatom <[email protected]>
Signed-off-by: CLFutureX <[email protected]>
Signed-off-by: Linkun Chen <[email protected]>
Signed-off-by: tjtanaa <[email protected]>
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: tlipoca9 <[email protected]>
Signed-off-by: elvischenv <[email protected]>
Signed-off-by: zitian zhao <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
Signed-off-by: Benji Beck <[email protected]>
Signed-off-by: Siyuan Liu <[email protected]>
Signed-off-by: Benjamin Chislett <[email protected]>
Signed-off-by: isotr0py <[email protected]>
Signed-off-by: Chen Zhang <[email protected]>
Signed-off-by: simon-mo <[email protected]>
Signed-off-by: LucasWilkinson <[email protected]>
Signed-off-by: Zhang Jason <[email protected]>
Signed-off-by: Yongye Zhu <[email protected]>
Signed-off-by: asafg <[email protected]>
Signed-off-by: Siyuan Fu <[email protected]>
Signed-off-by: Lain <[email protected]>
Signed-off-by: Max de Bayser <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Kunshang Ji <[email protected]>
Signed-off-by: Tao He <[email protected]>
Signed-off-by: Michael Goin <[email protected]>
Signed-off-by: QscQ <[email protected]>
Signed-off-by: qingjun <[email protected]>
Signed-off-by: Syed Muhammad Bin Asif <[email protected]>
Signed-off-by: Lionel Villard <[email protected]>
Signed-off-by: ycyaw66 <[email protected]>
Signed-off-by: David Chen <[email protected]>
Signed-off-by: Linkun <[email protected]>
Signed-off-by: Moritz Sanft <[email protected]>
Signed-off-by: Ming Yang <[email protected]>
Signed-off-by: Adrian Garcia <[email protected]>
Signed-off-by: shaojunqi <[email protected]>
Signed-off-by: Ricardo Decal <[email protected]>
Signed-off-by: Andrew Chan <[email protected]>
Signed-off-by: Felix Marty <[email protected]>
Signed-off-by: Andrew Sansom <[email protected]>
Signed-off-by: Zhiyu Cheng <[email protected]>
Signed-off-by: Shu Wang <[email protected]>
Signed-off-by: Po-Han Huang <[email protected]>
Signed-off-by: Shu Wang. <[email protected]>
Signed-off-by: XIn Li <[email protected]>
Signed-off-by: Junhao Li <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: iAmir97 <[email protected]>
Signed-off-by: iAmir97 <[email protected]>
Signed-off-by: <[email protected]>
Signed-off-by: Guy Stone <[email protected]>
Signed-off-by: <[email protected]>
Signed-off-by: yyw <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Pradyun Ramadorai <[email protected]>
Signed-off-by: Pradyun92 <[email protected]>
Signed-off-by: Jinzhen Lin <[email protected]>
Co-authored-by: rongfu.leng <[email protected]>
Co-authored-by: Huzaifa Sidhpurwala <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Russell Bryant <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Harry Mellor <[email protected]>
Co-authored-by: Jee Jee Li <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Animesh Jain <[email protected]>
Co-authored-by: Rui Qiao <[email protected]>
Co-authored-by: XiongfeiWei <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: Wentao Ye <[email protected]>
Co-authored-by: JartX <[email protected]>
Co-authored-by: fhl2000 <[email protected]>
Co-authored-by: vllmellm <[email protected]>
Co-authored-by: kf <[email protected]>
Co-authored-by: Nicolò Lucchesi <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Sage Moore <[email protected]>
Co-authored-by: tjtanaavllm <[email protected]>
Co-authored-by: Yong Hoon Shin <[email protected]>
Co-authored-by: Chih-Chieh Yang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Vadim Gimpelson <[email protected]>
Co-authored-by: Yuxuan Zhang <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Thomas Parnell <[email protected]>
Co-authored-by: Yan Ma <[email protected]>
Co-authored-by: Xiao <[email protected]>
Co-authored-by: jiahanc <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Ye (Charlotte) Qi <[email protected]>
Co-authored-by: Roberto L. Castro <[email protected]>
Co-authored-by: Ning Xie <[email protected]>
Co-authored-by: H <[email protected]>
Co-authored-by: David Ben-David <[email protected]>
Co-authored-by: David Ben-David <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Co-authored-by: Li, Jiang <[email protected]>
Co-authored-by: TankNee <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Seiji Eicher <[email protected]>
Co-authored-by: ZiTian.Zhao <[email protected]>
Co-authored-by: 22quinn <[email protected]>
Co-authored-by: Abirdcfly <[email protected]>
Co-authored-by: Giancarlo Delfin <[email protected]>
Co-authored-by: Chenxi Yang <[email protected]>
Co-authored-by: Chenxi Yang <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Weixiao Huang <[email protected]>
Co-authored-by: Raghav Ravishankar <[email protected]>
Co-authored-by: ericehanley <[email protected]>
Co-authored-by: Zhonghua Deng <[email protected]>
Co-authored-by: Po-Han Huang (NVIDIA) <[email protected]>
Co-authored-by: PiteXChen <[email protected]>
Co-authored-by: lkchen <[email protected]>
Co-authored-by: TJian <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: tlipoca9 <[email protected]>
Co-authored-by: elvischenv <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Benji Beck <[email protected]>
Co-authored-by: youkaichao <[email protected]>
Co-authored-by: Siyuan Liu <[email protected]>
Co-authored-by: Benjamin Chislett <[email protected]>
Co-authored-by: LiuXiaoxuanPKU <[email protected]>
Co-authored-by: simon-mo <[email protected]>
Co-authored-by: Chen Zhang <[email protected]>
Co-authored-by: Hongxia Yang <[email protected]>
Co-authored-by: Minseok Lee <[email protected]>
Co-authored-by: Yongye Zhu <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>
Co-authored-by: Zhang Jason <[email protected]>
Co-authored-by: Asaf Joseph Gardin <[email protected]>
Co-authored-by: asafg <[email protected]>
Co-authored-by: Lain <[email protected]>
Co-authored-by: tc-mb <[email protected]>
Co-authored-by: imning3 <[email protected]>
Co-authored-by: Maximilien de Bayser <[email protected]>
Co-authored-by: Kunshang Ji <[email protected]>
Co-authored-by: Tao He <[email protected]>
Co-authored-by: qscqesze <[email protected]>
Co-authored-by: Syed Muhammad Bin Asif <[email protected]>
Co-authored-by: Lionel Villard <[email protected]>
Co-authored-by: WeiQing Chen <[email protected]>
Co-authored-by: ycyaw66 <[email protected]>
Co-authored-by: Moritz Sanft <[email protected]>
Co-authored-by: Ming Yang <[email protected]>
Co-authored-by: Adrián García García <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: JaceyShao <[email protected]>
Co-authored-by: shaojunqi <[email protected]>
Co-authored-by: Ricardo Decal <[email protected]>
Co-authored-by: Andrew Chan <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Co-authored-by: Andrew Sansom <[email protected]>
Co-authored-by: Zhiyu <[email protected]>
Co-authored-by: Shu Wang <[email protected]>
Co-authored-by: XIn Li <[email protected]>
Co-authored-by: Junhao Li <[email protected]>
Co-authored-by: Chauncey <[email protected]>
Co-authored-by: iAmir97 <[email protected]>
Co-authored-by: iAmir97 <[email protected]>
Co-authored-by: Hong Hanh <[email protected]>
Co-authored-by: Daniel Serebrenik <[email protected]>
Co-authored-by: yewentao256 <[email protected]>
Co-authored-by: Guy Stone <[email protected]>
Co-authored-by: yyweiss <[email protected]>
Co-authored-by: Pradyun92 <[email protected]>
Co-authored-by: Pradyun Ramadorai <[email protected]>
Co-authored-by: Nicolò Lucchesi <[email protected]>
1 parent 7a995da commit a37f9cc
File tree
34 files changed, +1126 -322 lines changed
- benchmarks/kernels
- csrc
  - core
  - moe
    - marlin_moe_wna16
  - quantization/gptq_marlin
- tests/kernels
  - moe
  - quantization
- vllm
  - model_executor/layers
    - fused_moe
    - quantization
      - compressed_tensors
      - kernels/mixed_precision
      - utils