Commit bdc02ca
enable modular experts for compressed tensor marlin wn16 moe
Signed-off-by: Lu Fang <[email protected]>
1 parent 35d801f
File tree
2 files changed, +54 -6 lines changed
- vllm/model_executor/layers
  - fused_moe
  - quantization/compressed_tensors
Lines changed: 13 additions & 4 deletions
| Original file lines | Diff lines | Removed | Added |
|---|---|---|---|
| 501-507 | 501-509 | 1 (504) | 3 (504-506) |
| 616-622 | 618-628 | 1 (619) | 5 (621-625) |
| 720-727 | 726-736 | 2 (723-724) | 5 (729-733) |
Lines changed: 41 additions & 2 deletions
| Original file lines | Diff lines | Removed | Added |
|---|---|---|---|
| 35-41 | 35-45 | 1 (38) | 5 (38-42) |
| 1562-1568 | 1566-1607 | 1 (1565) | 36 (1569-1604) |