
Conversation

@ZhijunLStudio
Contributor

This document is the RFC for adding the MiniMax-M1 model; it lays out the technical plan from CUDA operator development through full model integration.

@paddle-bot

paddle-bot bot commented Sep 16, 2025

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

@luotao1
Collaborator

luotao1 commented Sep 16, 2025

@chang-wenbin


**Core technical path**:
1. **Reuse**: Maximize reuse of the Partial RoPE and standard GQA attention components already available from the GLM-4.5 PR.
2. **Translate and develop**: Translate vLLM's `lightning_attn.py` (Triton) into a high-performance CUDA C++ operator to support MiniMax-M1's linear attention layers (the underlying recurrence is sketched below).
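
For reference, the recurrence such an operator has to implement is simple to state. Below is a minimal NumPy sketch of decayed linear attention for a single head, assuming a scalar per-head decay; the exact decay parameterization in MiniMax-M1's lightning attention may differ, and the function name is ours, not from either codebase.

```python
import numpy as np

def linear_attn_reference(q, k, v, decay):
    # q, k, v: [T, D] for one head; decay: scalar in (0, 1).
    # State:  S_t = decay * S_{t-1} + outer(k_t, v_t)
    # Output: o_t = q_t @ S_t
    T, D = q.shape
    s = np.zeros((D, D))
    o = np.zeros((T, D))
    for t in range(T):
        s = decay * s + np.outer(k[t], v[t])
        o[t] = q[t] @ s
    return o
```

The state `s` is a fixed-size D x D matrix, which is what makes the layer linear in sequence length; the CUDA/Triton work is about computing this recurrence efficiently on GPU, not about changing the math.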


You can start with the Triton operator for quick validation while developing the high-performance CUDA kernel in parallel; FD currently supports running Triton operators.
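
For illustration, a quick-validation kernel along those lines might look like the following Triton sketch. It processes one (batch, head) pair per program and carries the D x D state sequentially over time. The kernel name, memory layout, and scalar DECAY constant are assumptions for this sketch, not FastDeploy's actual operator interface.

```python
import triton
import triton.language as tl

@triton.jit
def decayed_linear_attn_kernel(
    Q, K, V, O,            # pointers to [B*H, T, D] contiguous float32 buffers
    T,                     # sequence length (runtime value)
    DECAY: tl.constexpr,   # scalar decay; real models may use per-head decays
    D: tl.constexpr,       # head dim, must be a power of two for tl.arange
):
    pid = tl.program_id(0)                    # one (batch, head) pair per program
    offs = tl.arange(0, D)
    base = pid * T * D
    s = tl.zeros((D, D), dtype=tl.float32)    # running state S
    for t in range(T):
        ptr = base + t * D + offs
        q = tl.load(Q + ptr).to(tl.float32)
        k = tl.load(K + ptr).to(tl.float32)
        v = tl.load(V + ptr).to(tl.float32)
        s = DECAY * s + k[:, None] * v[None, :]   # S_t = d * S_{t-1} + k v^T
        o = tl.sum(q[:, None] * s, axis=0)        # o_t = q_t @ S_t
        tl.store(O + ptr, o)

# launch sketch, for tensors q/k/v/o of shape [B*H, T, D]:
# decayed_linear_attn_kernel[(B * H,)](q, k, v, o, T, DECAY=0.97, D=64)
```

The sequential loop over T keeps the sketch simple enough for correctness checks against a reference implementation; a production kernel would use the chunked formulation instead.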


---

### **Phase 1: [Core Development] Implement the Mamba/Linear-Attention CUDA Operator (2-4 weeks)**


From the current plan, the main development work is the linear attention. You can first try wiring in the Triton operator for quick validation; if attention performance turns out to be poor, then try implementing a CUDA kernel to optimize end-to-end performance.
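
As context for why a dedicated kernel can pay off: lightning attention gets its speed from a chunked decomposition, where attention is computed in parallel within a chunk via a masked, decayed QK^T, and history crosses chunk boundaries only through the carried state. A NumPy sketch of that decomposition, checked against the naive O(T^2) definition (chunk size and decay value are arbitrary choices for the example):

```python
import numpy as np

def chunked_linear_attn(q, k, v, decay, C=4):
    # Same recurrence as the sequential reference, computed chunk by chunk:
    # intra-chunk terms use a masked, decayed QK^T (fully parallel);
    # inter-chunk history enters through the carried D x D state `s`.
    T, D = q.shape
    o = np.zeros((T, D))
    s = np.zeros((D, D))
    for c0 in range(0, T, C):
        qc, kc, vc = q[c0:c0 + C], k[c0:c0 + C], v[c0:c0 + C]
        n = qc.shape[0]
        i = np.arange(n)[:, None]
        j = np.arange(n)[None, :]
        # intra-chunk: A[i, j] = decay^(i-j) * (q_i . k_j) for j <= i, else 0
        A = np.where(j <= i, decay ** np.maximum(i - j, 0), 0.0) * (qc @ kc.T)
        # inter-chunk: state contribution, decayed by 1-based position in chunk
        o[c0:c0 + n] = A @ vc + (decay ** (np.arange(n) + 1))[:, None] * (qc @ s)
        # carry state: S <- decay^n * S + sum_j decay^(n-1-j) * outer(k_j, v_j)
        w = (decay ** (n - 1 - np.arange(n)))[:, None]
        s = (decay ** n) * s + (kc * w).T @ vc
    return o

# sanity check against the naive definition o_t = sum_{u<=t} d^(t-u) (q_t.k_u) v_u
rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))
o_ref = np.stack([
    sum((0.9 ** (t - u)) * (q[t] @ k[u]) * v[u] for u in range(t + 1))
    for t in range(q.shape[0])
])
assert np.allclose(chunked_linear_attn(q, k, v, 0.9), o_ref)
```

The intra-chunk matmuls map well onto tensor cores, which is where a tuned CUDA kernel would recover the end-to-end performance the sequential Triton sketch leaves on the table.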

Contributor Author

ok

@luotao1 luotao1 merged commit cbd6610 into PaddlePaddle:master Sep 19, 2025
1 check passed