@WanRui37
Contributor

@WanRui37 WanRui37 commented Sep 6, 2025

Add a unit test for cutlass_fp8_fp8_fp8_dual_gemm_fused

  • The test currently supports sm90+; it is skipped on all other architectures.
  • setup_ops.py states that auto_gen_fp8_fp8_gemm_fused_kernels_sm90auto_gen_fp8_fp8_gemm_fused_kernels runs only when cc >= 89.
  • However, on a 4090 (sm89) with the package installed via python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-86_89/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple, adding a bias still fails with:
    terminate called after throwing an instance of 'std::runtime_error' what(): fp8 dual gemm_fused config is invalid. Aborted (core dumped)
  • On the 4090, removing the bias avoids the error, but the results are still incorrect. This suggests that the python -m pip install package was still built for sm90, and that fastdeploy-gpu-86_89 has a problem.
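
The skip behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the test added in this PR: the names `compute_capability`, `supports_dual_gemm_fused`, `query_device_cc`, and the class `TestDualGemmFusedGate` are hypothetical, and the sm90 threshold is taken from the first bullet. The device probe assumes `paddle.device.cuda.get_device_capability()` is available and falls back to 0 when it is not.

```python
import unittest

# Assumption: the op needs sm90+, encoded here as major * 10 + minor.
MIN_CC = 90


def compute_capability(major: int, minor: int) -> int:
    """Fold a (major, minor) pair into a single number, e.g. (8, 9) -> 89."""
    return major * 10 + minor


def supports_dual_gemm_fused(cc: int, min_cc: int = MIN_CC) -> bool:
    """Return True when the device meets the assumed minimum capability."""
    return cc >= min_cc


def query_device_cc() -> int:
    """Hypothetical probe: returns 0 if Paddle or a CUDA device is unavailable."""
    try:
        import paddle

        major, minor = paddle.device.cuda.get_device_capability()
        return compute_capability(major, minor)
    except Exception:
        return 0


class TestDualGemmFusedGate(unittest.TestCase):
    def setUp(self):
        # Skip at run time rather than import time, so collection stays cheap.
        if not supports_dual_gemm_fused(query_device_cc()):
            self.skipTest("cutlass_fp8_fp8_fp8_dual_gemm_fused requires sm90+")

    def test_placeholder(self):
        # The real test would call the fused op and compare against a reference.
        self.assertTrue(True)
```

With this gate, a 4090 reports (8, 9), which folds to 89 and falls below the threshold, so the test is skipped there instead of aborting the process.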

@paddle-bot

paddle-bot bot commented Sep 6, 2025

Thanks for your contribution!

@CLAassistant

CLAassistant commented Sep 6, 2025

CLA assistant check
All committers have signed the CLA.

@paddle-bot paddle-bot bot added the contributor External developers label Sep 6, 2025
@ckl117 ckl117 merged commit 276f73c into PaddlePaddle:develop Sep 10, 2025
25 of 28 checks passed
@luotao1
Collaborator

luotao1 commented Sep 11, 2025

hi, @WanRui37

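  • Thank you very much for your contribution to PaddlePaddle. We run an organization called PFCC, the PaddlePaddle open-source contributors' club, which is open only to developers who have had code merged into PaddlePaddle. The club holds a regular meeting every two weeks (attendance is optional) and occasionally organizes offline meetups. See the profile page at https://github.com/luotao1 for details.
  • If you are interested in PFCC, please send an email to [email protected] and we will invite you to join.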


4 participants