[Cherry-pick to release/v0.5.12] [Bug] Fix V4-Pro NaN on Blackwell by converting fp8_einsum input scale to ue8m0 (#25733) #26063

Merged
Kangyan-Zhou merged 1 commit into release/v0.5.12 from cherry-pick/79ea30d1f-to-v0.5.12-bdb01885
May 22, 2026

Conversation

@Kangyan-Zhou Kangyan-Zhou commented May 22, 2026

Collaborator

Cherry-pick of commit 79ea30d1f134b741ddc55f75db45c0299fa2a642 to release/v0.5.12.

Source PR: #25733
Source commit: 79ea30d1f134b741ddc55f75db45c0299fa2a642
Original title: [Bug] Fix V4-Pro NaN on Blackwell by converting fp8_einsum input scale to ue8m0


This PR was automatically created by the cherry-pick workflow.


CI States

Latest PR Test (Base): ❌ Missing run-ci label -- add it to run CI tests.
Latest PR Test (Extra): ❌ Blocked -- run-ci is required first.

@Kangyan-Zhou Kangyan-Zhou merged commit f938570 into release/v0.5.12 May 22, 2026
4 checks passed
@Kangyan-Zhou Kangyan-Zhou deleted the cherry-pick/79ea30d1f-to-v0.5.12-bdb01885 branch May 22, 2026 07:54

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the DeepSeek V4 model implementation by adding a call to deep_gemm.ceil_to_ue8m0 to adjust the scale factor o_s before performing an FP8 einsum operation. I have no feedback to provide as there were no review comments to evaluate.
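For readers unfamiliar with the format, here is a minimal sketch of what rounding a scale to UE8M0 involves. It assumes that UE8M0 (unsigned, 8 exponent bits, 0 mantissa bits) can only represent powers of two and that the conversion rounds each scale up, as the `ceil` in the function name suggests. The helper `ceil_to_pow2` and the sample `o_s` values are hypothetical and written in plain Python for illustration; this is not the `deep_gemm.ceil_to_ue8m0` implementation, which operates on tensors.

```python
import math


def ceil_to_pow2(scale: float) -> float:
    """Round a positive, finite scale up to the nearest power of two.

    Hypothetical stand-in for the rounding described above: a format with
    no mantissa bits can only store values of the form 2**k.
    """
    if not (math.isfinite(scale) and scale > 0.0):
        raise ValueError("scale must be positive and finite")
    # frexp returns (m, e) with scale == m * 2**e and 0.5 <= m < 1.
    mantissa, exponent = math.frexp(scale)
    if mantissa == 0.5:
        # Already an exact power of two (0.5 * 2**e == 2**(e - 1)).
        return scale
    # Otherwise the next power of two above scale is 2**e.
    return math.ldexp(1.0, exponent)


# Illustrative per-block scales (made-up values, not taken from the PR).
o_s = [0.3, 1.0, 3.0, 0.25]
rounded = [ceil_to_pow2(s) for s in o_s]
print(rounded)  # [0.5, 1.0, 4.0, 0.25]
```

Rounding up rather than to nearest keeps every rounded scale at least as large as the original, so values divided by the scale do not grow beyond the range they were quantized for.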


2 participants