Implement extension SPV_KHR_float_controls2 #3475

Open
jmmartinez wants to merge 10 commits into KhronosGroup:main from jmmartinez:users/jmmartinez/spv_khr_float_controls2

@jmmartinez
Contributor

@jmmartinez jmmartinez commented Dec 17, 2025

First attempt at implementing SPV_KHR_float_controls2.

  • When translating SPIR-V -> LLVM IR, we first read the ExecutionModeFPFastMathDefault for every kernel; if an instruction in that kernel does not specify its own FPFastMathMode, we use the kernel's default.
  • ExecutionModeFPFastMathDefault is not propagated down to the callees.
  • According to the issues section of SPV_KHR_float_controls2, SPIR-V has no equivalent of LLVM's afn flag. If we map `fadd fast float %a, %b` to SPIR-V and back, it becomes `fadd reassoc nnan ninf nsz arcp contract float %a, %b`, losing the afn flag.
  • ContractionOff and SignedZeroInfNanPreserve are translated to an ExecutionModeFPFastMathDefault with all flags set to 0.
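
For illustration, here is a minimal sketch of the afn round trip described in the third bullet. It is not the translator's code: the `Llvm*` bit positions, `FlagTable`, `toSpirv` and `toLlvm` are hypothetical names made up for this example, and only the `Spv*` values are taken from the FPFastMathMode table in the SPIR-V specification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical LLVM-side flag bits, chosen for this sketch only.
constexpr uint32_t LlvmReassoc = 1u << 0, LlvmNNaN = 1u << 1,
                   LlvmNInf = 1u << 2, LlvmNSZ = 1u << 3, LlvmArcp = 1u << 4,
                   LlvmContract = 1u << 5, LlvmAfn = 1u << 6;
constexpr uint32_t LlvmFast = LlvmReassoc | LlvmNNaN | LlvmNInf | LlvmNSZ |
                              LlvmArcp | LlvmContract | LlvmAfn;

// FPFastMathMode bits as listed in the SPIR-V specification.
constexpr uint32_t SpvNotNaN = 0x1, SpvNotInf = 0x2, SpvNSZ = 0x4,
                   SpvAllowRecip = 0x8, SpvAllowContract = 0x10000,
                   SpvAllowReassoc = 0x20000;

struct FlagPair {
  uint32_t Llvm;
  uint32_t Spv;
};

// One row per flag that exists on both sides. afn has no row because
// SPIR-V has no counterpart for it.
constexpr FlagPair FlagTable[] = {
    {LlvmReassoc, SpvAllowReassoc}, {LlvmNNaN, SpvNotNaN},
    {LlvmNInf, SpvNotInf},          {LlvmNSZ, SpvNSZ},
    {LlvmArcp, SpvAllowRecip},      {LlvmContract, SpvAllowContract},
};

// LLVM flags -> SPIR-V mode bits; unmapped flags are silently dropped.
inline uint32_t toSpirv(uint32_t Fmf) {
  uint32_t Mode = 0;
  for (const FlagPair &P : FlagTable)
    if (Fmf & P.Llvm)
      Mode |= P.Spv;
  return Mode;
}

// SPIR-V mode bits -> LLVM flags, using the same table in reverse.
inline uint32_t toLlvm(uint32_t Mode) {
  uint32_t Fmf = 0;
  for (const FlagPair &P : FlagTable)
    if (Mode & P.Spv)
      Fmf |= P.Llvm;
  return Fmf;
}
```

With this table, `toLlvm(toSpirv(LlvmFast))` yields every flag except `LlvmAfn`, which mirrors the `fadd fast` example above.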

@MrSidims MrSidims requested review from MrSidims, maarquitos14, svenvh and vmaksimo and removed request for vmaksimo December 18, 2025 11:20
@MrSidims
Contributor

> Should we propagate the attribute down to the callees?

We shouldn't. As you have quoted: "This rule implies that a function appearing in both call graphs of two distinct entry points may behave differently in each case." The runtime should be able to pass fast-math controls from a caller to a callee.

> In that case, should we emit a "zero" FPFastMathMode for instructions without any fast-math-flags?

I'm a bit worried about bloating the size of SPIR-V modules in this case. In general, I'd suggest aligning the behaviour of the translator and the SPIR-V backend wherever possible. So I'd expect llvm-spirv's implementation to produce the same SPIR-V as llvm/llvm-project#146941, i.e. there should be an FPFastMathDefault set.

@maarquitos14
Contributor

I'll go on vacation in a few hours, and I'm afraid I will not have time to review this before I leave. Feel free to merge this without my approval, and I'll make sure I review when I'm back, even if it's a post-merge review.

I did want to bring up a couple of related issues, though. Hopefully they can be resolved by this PR.

@jmmartinez
Contributor Author

> > Should we propagate the attribute down to the callees?
>
> We shouldn't. As you have quoted: "This rule implies that a function appearing in both call graphs of two distinct entry points may behave differently in each case." The runtime should be able to pass fast-math controls from a caller to a callee.

Then the current implementation should be good, since it doesn't propagate anything.

> > In that case, should we emit a "zero" FPFastMathMode for instructions without any fast-math-flags?
>
> I'm a bit worried about bloating the size of SPIR-V modules in this case. In general, I'd suggest aligning the behavior of the translator and the SPIR-V backend wherever possible. So I'd expect llvm-spirv's implementation to produce the same SPIR-V as llvm/llvm-project#146941, i.e. there should be an FPFastMathDefault set.

I see. Then I should fix this implementation to always emit an FPFastMathDefault with all flags set to 0 for every kernel. Right?

@jmmartinez
Contributor Author

> > > In that case, should we emit a "zero" FPFastMathMode for instructions without any fast-math-flags?
> >
> > I'm a bit worried about bloating the size of SPIR-V modules in this case. In general, I'd suggest aligning the behavior of the translator and the SPIR-V backend wherever possible. So I'd expect llvm-spirv's implementation to produce the same SPIR-V as llvm/llvm-project#146941, i.e. there should be an FPFastMathDefault set.
>
> I see. Then I should fix this implementation to always emit an FPFastMathDefault with all flags set to 0 for every kernel. Right?

I've addressed this in b691977. This commit emits an FPFastMathDefault with all flags set to 0 for every kernel.

@jmmartinez
Contributor Author

This one is tricky. reassoc maps to AllowTransform, but AllowTransform requires AllowReassoc and AllowContract to be set, so AllowTransform maps back to reassoc contract.

I've added a commit related to this, but I'll file a separate patch since this issue is not related to the float_controls2 extension.

@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch from cd6a10d to 57840f3 Compare December 22, 2025 15:42
@MrSidims
Contributor

> I've added a commit related to this, but I'll file a separate patch since this issue is not related to the float_controls2 extension.

Fine with me.

Most (if not all) of the folks working on the translator are currently on holidays (including myself), so I guess the review will be done a bit later :)

(unless it is really urgent, in which case I can take a look before New Year)

@jmmartinez
Contributor Author

> > I've added a commit related to this, but I'll file a separate patch since this issue is not related to the float_controls2 extension.
>
> Fine with me.
>
> Most (if not all) of the folks working on the translator are currently on holidays (including myself), so I guess the review will be done a bit later :)
>
> (unless it is really urgent, in which case I can take a look before New Year)

No problem! It's not urgent.

@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch from 14519f7 to 8fa049e Compare January 5, 2026 12:40
Contributor

@MrSidims MrSidims left a comment

LGTM

I'd like to hear from @maarquitos14 before merging.

@jmmartinez
Contributor Author

Just in case, I'd like to draw attention to one of my previous messages about issue #3125:

Currently, this PR maps LLVM's reassoc -> AllowReassoc (this is the behavior that was implemented before float_controls2).

The issue suggests that we should instead translate reassoc -> AllowTransform. The problem is that AllowTransform implies both AllowContract and AllowReassoc.

Then, if we map LLVM's reassoc to SPIR-V and back to LLVM, we end up with different semantics:

reassoc -> AllowTransform AllowContract AllowReassoc -> contract reassoc

To avoid this, we could translate

reassoc -> AllowReassoc -> no-flags
contract reassoc -> AllowTransform AllowContract AllowReassoc -> contract reassoc
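
A rough sketch of the two mappings above, under stated assumptions: the `Llvm*` bits and all three function names are hypothetical, the `Spv*` values come from the FPFastMathMode table in the SPIR-V specification, and the reader is assumed to turn AllowTransform into reassoc and to ignore a lone AllowReassoc. This is not the translator's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical LLVM-side bits, for this sketch only.
constexpr uint32_t LlvmReassoc = 1u << 0, LlvmContract = 1u << 1;

// FPFastMathMode bits as listed in the SPIR-V specification.
constexpr uint32_t SpvAllowContract = 0x10000, SpvAllowReassoc = 0x20000,
                   SpvAllowTransform = 0x40000;

// Mapping suggested in the issue: reassoc -> AllowTransform. Because
// AllowTransform needs AllowContract and AllowReassoc, both get set too.
inline uint32_t toSpirvViaTransform(uint32_t Fmf) {
  uint32_t Mode = 0;
  if (Fmf & LlvmContract)
    Mode |= SpvAllowContract;
  if (Fmf & LlvmReassoc)
    Mode |= SpvAllowTransform | SpvAllowContract | SpvAllowReassoc;
  return Mode;
}

// Alternative from the comment: emit AllowTransform only when the LLVM
// instruction already carries both reassoc and contract.
inline uint32_t toSpirvConservative(uint32_t Fmf) {
  uint32_t Mode = 0;
  if (Fmf & LlvmContract)
    Mode |= SpvAllowContract;
  if (Fmf & LlvmReassoc)
    Mode |= SpvAllowReassoc;
  if ((Fmf & LlvmContract) && (Fmf & LlvmReassoc))
    Mode |= SpvAllowTransform;
  return Mode;
}

// Assumed reader: AllowTransform -> reassoc, AllowContract -> contract,
// and AllowReassoc on its own maps to no LLVM flag.
inline uint32_t toLlvm(uint32_t Mode) {
  uint32_t Fmf = 0;
  if (Mode & SpvAllowContract)
    Fmf |= LlvmContract;
  if (Mode & SpvAllowTransform)
    Fmf |= LlvmReassoc;
  return Fmf;
}
```

Under these assumptions the first mapping turns a lone reassoc into contract reassoc after a round trip, while the second one drops it to no flags, which loses an optimization permission but never grants a new one.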

@MrSidims
Contributor

MrSidims commented Jan 7, 2026

> Currently, this PR maps LLVM's reassoc -> AllowReassoc (this is the behavior that was implemented before float_controls2).

Thanks for bringing the attention back. I believe we should do one thing at a time and fix this behaviour in a separate patch unrelated to this PR.

@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch from 8fa049e to dd4806c Compare January 8, 2026 16:11
@maarquitos14
Contributor

> LGTM
>
> I'd like to hear from @maarquitos14 before merging.

I plan to look at this today/tomorrow.

@maarquitos14
Contributor

> I've added a commit related to this, but I'll file a separate patch since this issue is not related to the float_controls2 extension.

That works for me, thanks. Just highlighted it here to make sure it worked well with the current implementation.

@maarquitos14
Contributor

> > Currently, this PR maps LLVM's reassoc -> AllowReassoc (this is the behavior that was implemented before float_controls2).
>
> Thanks for bringing the attention back. I believe we should do one thing at a time and fix this behaviour in a separate patch unrelated to this PR.

@jmmartinez ping me if you do create a separate patch for this.

Contributor

@maarquitos14 maarquitos14 left a comment

First pass. I'll do a second pass to check tests.

@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch 3 times, most recently from 2abce3e to 1108325 Compare January 19, 2026 15:14
// Get the scalar type to handle vector operands. And get the first operand
// type (instead of the result) due to fcmp instructions.
Type *FloatType = Inst->getOperand(0)->getType()->getScalarType();
auto Func2FMF = FuncToFastMathFlags.find({Inst->getFunction(), FloatType});
Contributor Author

I'm tempted to remove this FuncToFastMathFlags stuff.

It's used to apply the FPFastMathFlags attached to the execution mode to the individual instructions of a kernel.

But since we're preserving the FPFastMathFlags in the metadata, I'm thinking that this is not needed anymore.

@maarquitos14 should I remove this?
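
To make the lookup under discussion concrete, here is a simplified, self-contained sketch of resolving an instruction's flags from a per-kernel default keyed by (function, scalar float type). Everything in it is hypothetical: `ScalarType`, `DefaultKey`, `DefaultMap` and `resolveFlags` stand in for the real code, which works on `llvm::Function *` and `llvm::Type *` rather than strings and an enum.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the scalar floating-point type of an operand.
enum class ScalarType { Half, Float, Double };

// Key: (kernel name, scalar float type). Value: default fast-math bits
// taken from that kernel's FPFastMathDefault execution mode.
using DefaultKey = std::pair<std::string, ScalarType>;
using DefaultMap = std::map<DefaultKey, uint32_t>;

// Returns the flags an instruction ends up with:
//  - its own flags when it carries an explicit FPFastMathMode,
//  - otherwise the default registered for its function and scalar type,
//  - otherwise no flags. Callees get no entry, so nothing is propagated.
inline uint32_t resolveFlags(const DefaultMap &Defaults,
                             const std::string &Func, ScalarType Ty,
                             const uint32_t *Explicit) {
  if (Explicit)
    return *Explicit;
  DefaultMap::const_iterator It = Defaults.find(DefaultKey(Func, Ty));
  return It != Defaults.end() ? It->second : 0u;
}
```

Keying on the scalar type of the first operand, as the quoted snippet does, lets vector operations and fcmp (whose result is not a float) share the same entry as their scalar float type.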

Contributor

I believe this logic should still live somewhere, as the middle end and backend are unlikely to know about this metadata out of the box, and honestly it feels like it's easier for optimization passes to work with individual instruction flags. IMHO, resolving the ExecutionMode to FP flags right away in the SPIR-V consumer won't do any harm and actually makes the implementation friendlier to lower-level drivers.

@svenvh @vmaksimo WDYT?

Contributor

I see @MrSidims's point, but at the same time, I think if we do this, the translation wouldn't be 100% accurate. Maybe I'm too picky. Honestly, I don't have a strong opinion here, let's see what other folks think.

Contributor

I tend to agree with @MrSidims's take.

Contributor

@MrSidims MrSidims left a comment

Functionally LGTM, but let's get the tests passing :)

@jmmartinez
Contributor Author

> Functionally LGTM, but let's get the tests passing :)

My bad! In my defense, it passed on my machine. One matrix test was failing, though (but it also fails on main, so I didn't look into it much).

@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch 3 times, most recently from 2c6901c to f521131 Compare January 26, 2026 08:51
Contributor

@maarquitos14 maarquitos14 left a comment

I'll do one extra pass through the whole PR to make sure I didn't miss anything, but I'm quite happy with how it looks now. Great work @jmmartinez!

Contributor

@maarquitos14 maarquitos14 left a comment

Some more nits and questions, but overall LGTM.

@maarquitos14
Contributor

Also, in terms of deprecations, the spec says:

> Deprecation
> This extension deprecates the following features:
>
> 1. The execution modes ContractionOff and SignedZeroInfNanPreserve. Use FPFastMathDefault with the appropriate flags instead.
> 2. The decoration NoContraction. Use the FPFastMathMode decoration instead.
> 3. The FPFastMathMode mode bit Fast. Set all the other FPFastMathMode bits instead.
> 4. Enabling the FPFastMathMode decoration using the Kernel capability. All uses should declare the FloatControls2 capability.
> 5. The OpenCL.std instructions fmin_common, fmax_common. Use fmin, fmax with NInf and NNaN instead.

I think we still need to cover 2. and 5., the rest are covered/not necessary.
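
As a sketch of what covering item 5 could look like, the snippet below rewrites the deprecated OpenCL.std names to their replacements and adds the two flags the spec text asks for. `ExtInstCall` and `replaceDeprecatedMinMax` are hypothetical names for this example, not translator APIs, and I am assuming that the "NInf and NNaN" in the quoted text correspond to the FPFastMathMode bits NotInf and NotNaN.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FPFastMathMode bits as listed in the SPIR-V specification.
constexpr uint32_t SpvNotNaN = 0x1, SpvNotInf = 0x2;

// Hypothetical, simplified view of an OpenCL.std extended instruction:
// its name plus the FPFastMathMode bits decorating it.
struct ExtInstCall {
  std::string Name;
  uint32_t FastMathMode;
};

// Replaces fmin_common/fmax_common with fmin/fmax and ORs in
// NotNaN | NotInf. Any other instruction is returned unchanged.
inline ExtInstCall replaceDeprecatedMinMax(ExtInstCall Call) {
  if (Call.Name == "fmin_common" || Call.Name == "fmax_common") {
    Call.Name = (Call.Name == "fmin_common") ? "fmin" : "fmax";
    Call.FastMathMode |= SpvNotNaN | SpvNotInf;
  }
  return Call;
}
```

Existing bits on the call are preserved, so an instruction that already carried other flags keeps them after the rewrite.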

svenvh pushed a commit that referenced this pull request Feb 2, 2026
@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch from f521131 to 2ff9456 Compare February 10, 2026 13:48
@jmmartinez
Contributor Author

> 2. The decoration NoContraction. Use the FPFastMathMode decoration instead.

The NoContraction decoration requires the Shader capability, so I don't think it applies in our case.

@jmmartinez
Contributor Author

> Also, in terms of deprecations, the spec says:
>
> > Deprecation
> > This extension deprecates the following features:
> >
> > 1. The execution modes ContractionOff and SignedZeroInfNanPreserve. Use FPFastMathDefault with the appropriate flags instead.
> > 2. The decoration NoContraction. Use the FPFastMathMode decoration instead.
> > 3. The FPFastMathMode mode bit Fast. Set all the other FPFastMathMode bits instead.
> > 4. Enabling the FPFastMathMode decoration using the Kernel capability. All uses should declare the FloatControls2 capability.
> > 5. The OpenCL.std instructions fmin_common, fmax_common. Use fmin, fmax with NInf and NNaN instead.
>
> I think we still need to cover 2. and 5., the rest are covered/not necessary.

I've addressed item 5 in the last commit, but I probably have to revisit it.

Contributor

@maarquitos14 maarquitos14 left a comment

LGTM.

With this extension, the execution modes `ContractionOff` and
`SignedZeroInfNanPreserve` are deprecated and we should use
`FPFastMathDefault` instead.

Additionally, the `FPFastMathMode` bit `Fast` is also deprecated.
```
error: ‘V’ may be used uninitialized [-Werror=maybe-uninitialized]
```
@jmmartinez jmmartinez force-pushed the users/jmmartinez/spv_khr_float_controls2 branch from c6fce01 to 083a212 Compare February 16, 2026 15:28
@jmmartinez
Contributor Author

CI fully green. There was an error due to -Werror. (I cannot merge on my own).

Contributor

@vmaksimo vmaksimo left a comment

LGTM
