158 changes: 158 additions & 0 deletions llvm/docs/LangRef.rst
Contributor

I'd add a note to both intrinsics that the supported conversions are target dependent. (These aren't going to get generic legalization support, right?)

Contributor Author

Added a note

Contributor

These should get generic legalization support

Contributor

I think this note just makes things worse. It's stating poor QoI as a goal.

Many intrinsics are broken for different type combinations on different targets, but this isn't a desirable state. Nothing target-dependent is required to legalize these.

Contributor

My expectation for these intrinsics is that they indeed do not have generic legalization support. They're just a generic spelling for target-specific conversions.

These intrinsics, if they were legalized, should use libcall legalizations, but there are no libcalls for these and I don't expect that they are going to be introduced, so I don't think legalization support makes a lot of sense.

Having someone implement inline expansions for all the type combinations without actually having a use case for it sounds like a massive waste of time.

Contributor Author
@MrSidims, Jan 9, 2026

What I had in mind is that legalization of FP4 conversions is trivial with a look-up table: the FP4 value is just an index. After double-checking, the FP6 case also seems trivial, since it is just bit shuffling plus rounding.

The FP8 FN/FNUZ/E8M0FNU cases are different: generic lowering stops being "just shuffle + one rounding bit" and becomes "shuffle + full special-case semantics + careful NaN/zero rules". I lean towards agreeing that generic legalization is feasible, but I'm not 100% sure there will be interest in the intrinsics outside of use cases where the hardware supports the conversions in a performant way.
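
To illustrate the look-up-table idea for FP4, here is a minimal C++ sketch. It assumes the usual `Float4E2M1FN` layout (1 sign bit, 2 exponent bits, 1 mantissa bit, no Inf/NaN); the function name `decodeFloat4E2M1FN` is hypothetical and not part of this PR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the PR: decodes a Float4E2M1FN value held
// in the low four bits of `bits`. The 4-bit pattern is used directly as an
// index into a table of the 16 representable values.
inline float decodeFloat4E2M1FN(std::uint8_t bits) {
  assert(bits < 16 && "only the low four bits may be set");
  static constexpr std::array<float, 16> Table = {
      0.0f,  0.5f,  1.0f,  1.5f,  2.0f,  3.0f,  4.0f,  6.0f,
      -0.0f, -0.5f, -1.0f, -1.5f, -2.0f, -3.0f, -4.0f, -6.0f};
  return Table[bits];
}
```

An inline expansion in a backend could load from an equivalent constant table; the sketch only shows why no rounding or special-case logic is needed in this direction.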

Contributor Author
@MrSidims, Jan 9, 2026

Do you want me to remove the note and implement generic legalization in this PR? I'm asking as I'm not 100% sure if I'll be able to work on this very PR past January 14th due to a job switch.

Contributor Author

Removed the note. Let's actually land some implementation and then see how it goes.

Contributor

Leaving the legalization out of this PR is fine for now, but I do think it should be implemented. That significantly increases the utility of the intrinsics.

Contributor

For example, in single-source languages it is very useful to have common operations you can rely on in both host and device code. Restricting them to device-only is limiting.

@@ -21714,6 +21714,164 @@ environment <floatenv>` *except* for the rounding mode.
This intrinsic is not supported on all targets. Some targets may not support
all rounding modes.

'``llvm.convert.to.arbitrary.fp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <iNxM> @llvm.convert.to.arbitrary.fp.<iNxM>.<fNxM>(
          <fNxM> <value>, metadata <interpretation>,
          metadata <rounding mode>, i1 <saturation>)

Overview:
"""""""""

The ``llvm.convert.to.arbitrary.fp`` intrinsic converts a native LLVM
floating-point value to an arbitrary FP format, returning the result as an integer
containing the arbitrary FP bits. This intrinsic is overloaded on both its return
type and first argument.

Arguments:
""""""""""

``value``
The native LLVM floating-point value to convert (e.g., ``half``, ``float``, ``double``).

``interpretation``
A metadata string describing the target arbitrary FP format. Supported format names include:

- FP8 formats: ``"Float8E5M2"``, ``"Float8E5M2FNUZ"``, ``"Float8E4M3"``,
``"Float8E4M3FN"``, ``"Float8E4M3FNUZ"``, ``"Float8E4M3B11FNUZ"``, ``"Float8E3M4"``,
``"Float8E8M0FNU"``
- FP6 formats: ``"Float6E3M2FN"``, ``"Float6E2M3FN"``
- FP4 formats: ``"Float4E2M1FN"``

``rounding mode``
A metadata string specifying the rounding mode. The permitted strings match those
accepted by ``llvm.fptrunc.round`` (for example,
``"round.tonearest"`` or ``"round.towardzero"``).

The rounding mode is only consulted when ``value`` is not exactly representable in the target format.
If the value is exactly representable, all rounding modes produce the same result.

``saturation``
A compile-time constant boolean value (``i1``). This parameter controls how overflow is handled
when values exceed the representable finite range of the target format:

- When ``true``: overflowing values are clamped to the finite value of largest magnitude with the
same sign (the largest positive finite value for positive inputs, the most negative finite value for negative inputs).
- When ``false``: overflowing values are converted to infinity (preserving sign of the original value) if the
target format supports infinity, or return a poison value if infinity is not supported
by the target format.

This parameter must be an immediate constant.
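
The handling of the ``rounding mode`` string can be sketched in C++ as follows. This is a hedged illustration, not the implementation: the enum ``StaticRounding`` and the function ``parseStaticRounding`` are hypothetical names, and the set of strings is assumed to be the one used by the constrained floating-point intrinsics. The verifier code in this PR rejects the dynamic rounding mode, which the sketch mirrors by returning an empty optional for ``"round.dynamic"``.

```cpp
#include <optional>
#include <string_view>

// Hypothetical enum for illustration; LLVM itself uses llvm::RoundingMode.
enum class StaticRounding {
  ToNearestEven,
  ToNearestAway,
  Downward,
  Upward,
  TowardZero
};

// Maps a rounding-mode metadata string to a static rounding mode.
// Returns std::nullopt for unknown strings and for "round.dynamic",
// since a conversion with a compile-time mode cannot use a dynamic one.
inline std::optional<StaticRounding>
parseStaticRounding(std::string_view Name) {
  if (Name == "round.tonearest")
    return StaticRounding::ToNearestEven;
  if (Name == "round.tonearestaway")
    return StaticRounding::ToNearestAway;
  if (Name == "round.downward")
    return StaticRounding::Downward;
  if (Name == "round.upward")
    return StaticRounding::Upward;
  if (Name == "round.towardzero")
    return StaticRounding::TowardZero;
  return std::nullopt;
}
```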

Semantics:
""""""""""

The intrinsic converts the native LLVM floating-point value to the arbitrary FP
format specified by ``interpretation``, applying the requested rounding mode and
saturation behavior. The conversion is performed in two steps: first, the value is
rounded according to the specified rounding mode to fit the target format's precision;
then, if the rounded result exceeds the target format's representable range, saturation
is applied according to the ``saturation`` parameter. The result is returned as an
integer (e.g., ``i8`` for FP8, ``i6`` for FP6) containing the encoded arbitrary FP bits.

**Handling of special values:**

- **NaN**: NaN values follow LLVM's standard :ref:`NaN rules <floatnan>`. When the target
format supports NaN, the NaN representation is preserved (quiet NaNs remain quiet, signaling
NaNs remain signaling). The exact NaN payload may be truncated or extended to fit the target
format's payload size. If the target format does not support NaN, the intrinsic returns a
poison value.
- **Infinity and Overflow**: If the input is +/-Inf or a finite value that exceeds the representable range:

- When ``saturation`` is ``false`` and the target format supports infinity, the result is +/-Inf (preserving the sign).
- When ``saturation`` is ``false`` and the target format does not support infinity (e.g., formats
with "FN" suffix), the intrinsic returns a poison value.
- When ``saturation`` is ``true``, the value is clamped to the maximum/minimum representable finite value.

For FP6/FP4 interpretations, producers are expected to use ``saturation`` = ``true``. With ``saturation`` = ``false``, converting a NaN, an infinity, or an overflowing value yields a poison value.
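
The two-step rule (round first, then handle overflow) can be illustrated with a small reference model in C++. This is a sketch under stated assumptions, not the implementation: it covers only ``"Float8E5M2"`` (5 exponent bits with bias 15, 2 mantissa bits, with Inf and NaN) and only ``"round.tonearest"``; NaN inputs are mapped to a single quiet NaN encoding instead of preserving payloads; and the names ``decodeE5M2Magnitude`` and ``encodeE5M2`` are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Value of a non-negative Float8E5M2 code. Code 0x7C (the Inf encoding) is
// deliberately evaluated as 2^16 so that it can act as the "next value above
// the largest finite one" during rounding.
inline double decodeE5M2Magnitude(unsigned Code) {
  unsigned Exp = Code >> 2, Mant = Code & 3;
  if (Exp == 0)
    return std::ldexp(Mant / 4.0, -14); // subnormal
  return std::ldexp(1.0 + Mant / 4.0, static_cast<int>(Exp) - 15);
}

// Hypothetical reference model: float -> Float8E5M2 bits, round-to-nearest
// with ties to even, followed by the saturation step.
inline std::uint8_t encodeE5M2(float Value, bool Saturate) {
  constexpr unsigned MaxFinite = 0x7B, Inf = 0x7C, QuietNaN = 0x7E;
  const unsigned Sign = std::signbit(Value) ? 0x80u : 0x00u;
  if (std::isnan(Value))
    return static_cast<std::uint8_t>(Sign | QuietNaN); // payload ignored

  const double Mag = std::fabs(static_cast<double>(Value));

  // Step 1: round to the target precision. Find the largest code whose
  // value does not exceed Mag, then pick the nearer neighbour.
  unsigned Lo = 0;
  while (Lo < Inf && decodeE5M2Magnitude(Lo + 1) <= Mag)
    ++Lo;
  unsigned Code = Lo;
  if (Lo < Inf) {
    double Below = Mag - decodeE5M2Magnitude(Lo);
    double Above = decodeE5M2Magnitude(Lo + 1) - Mag;
    // On a tie, choose the code with an even mantissa (even code).
    if (Above < Below || (Above == Below && (Lo & 1)))
      Code = Lo + 1;
  }

  // Step 2: if the rounded result left the finite range, saturate or
  // produce infinity (Float8E5M2 has an infinity encoding).
  if (Code >= Inf)
    Code = Saturate ? MaxFinite : Inf;
  return static_cast<std::uint8_t>(Sign | Code);
}
```

In this model, 61439.0 rounds down to the largest finite value 57344 (``0x7B``) regardless of ``saturation``, while 61440.0 is a tie that rounds up out of range and therefore becomes ``0x7C`` (infinity) or ``0x7B``, depending on ``saturation``. For a format without an infinity encoding, the non-saturating branch would correspond to a poison result instead.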

Example:
""""""""

::

      ; Convert half to FP8 E4M3 format
      %fp8bits = call i8 @llvm.convert.to.arbitrary.fp.i8.f16(
          half %value, metadata !"Float8E4M3",
          metadata !"round.tonearest", i1 false)

      ; Convert vector of float to FP8 E5M2 with saturation
      %vec_fp8 = call <4 x i8> @llvm.convert.to.arbitrary.fp.v4i8.v4f32(
          <4 x float> %values, metadata !"Float8E5M2",
          metadata !"round.towardzero", i1 true)

'``llvm.convert.from.arbitrary.fp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <fNxM> @llvm.convert.from.arbitrary.fp.<fNxM>.<iNxM>(
          <iNxM> <value>, metadata <interpretation>)

Overview:
"""""""""

The ``llvm.convert.from.arbitrary.fp`` intrinsic converts an integer containing
arbitrary FP bits to a native LLVM floating-point value. This intrinsic is
overloaded on both its return type and first argument.

Arguments:
""""""""""

``value``
An integer value containing the arbitrary FP bits (e.g., ``i8`` for FP8, ``i6`` for FP6).

``interpretation``
A metadata string describing the source arbitrary FP format. Supported format names include:

- FP8 formats: ``"Float8E5M2"``, ``"Float8E5M2FNUZ"``, ``"Float8E4M3"``,
``"Float8E4M3FN"``, ``"Float8E4M3FNUZ"``, ``"Float8E4M3B11FNUZ"``, ``"Float8E3M4"``,
``"Float8E8M0FNU"``
- FP6 formats: ``"Float6E3M2FN"``, ``"Float6E2M3FN"``
- FP4 formats: ``"Float4E2M1FN"``

Semantics:
""""""""""

The intrinsic interprets the integer value as arbitrary FP bits according to
``interpretation``, then converts to the native LLVM floating-point result type.

Conversions from arbitrary FP formats to native LLVM floating-point types are
usually widening conversions (e.g., FP8 to FP16 or FP32). A widening conversion is exact and
requires no rounding: normal finite values are converted exactly. NaN values follow LLVM's standard :ref:`NaN rules
<floatnan>`; the NaN representation is preserved (quiet NaNs remain quiet, signaling NaNs
remain signaling), and the NaN payload may be truncated or extended to fit the result type's
payload size. Infinity values are preserved as infinity. If a value exceeds the representable
range of the result type (for example, converting ``Float8E8M0FNU`` with large exponents to
``half``), the result is infinity with the appropriate sign.
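
The ``Float8E8M0FNU`` case can be illustrated with a short C++ sketch. It assumes the exponent-only layout of that format (8 exponent bits with bias 127, no sign, no zero, no infinity, ``0xFF`` as NaN); the names ``decodeFloat8E8M0FNU`` and ``exceedsHalfRange`` are hypothetical and only serve the illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: decodes Float8E8M0FNU bits to float. Every non-NaN
// code is a power of two in [2^-127, 2^127], all of which float can hold
// exactly (the smallest ones as subnormals).
inline float decodeFloat8E8M0FNU(std::uint8_t Bits) {
  if (Bits == 0xFF)
    return std::numeric_limits<float>::quiet_NaN();
  return std::ldexp(1.0f, static_cast<int>(Bits) - 127);
}

// Hypothetical helper: true if the encoded value is larger than the largest
// finite half value (65504), i.e. if the unbiased exponent is at least 16.
inline bool exceedsHalfRange(std::uint8_t Bits) {
  return Bits != 0xFF && static_cast<int>(Bits) - 127 >= 16;
}
```

Under the rule above, a code for which ``exceedsHalfRange`` returns true would convert to infinity when the result type is ``half``, whereas the same code converts exactly when the result type is ``float``.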

Example:
""""""""

::

      ; Convert FP8 E4M3 bits to half
      %half_val = call half @llvm.convert.from.arbitrary.fp.f16.i8(
          i8 %fp8bits, metadata !"Float8E4M3")

      ; Convert vector of FP8 E5M2 bits to float
      %vec_float = call <4 x float> @llvm.convert.from.arbitrary.fp.v4f32.v4i8(
          <4 x i8> %fp8_values, metadata !"Float8E5M2")

Convergence Intrinsics
----------------------

5 changes: 5 additions & 0 deletions llvm/include/llvm/ADT/APFloat.h
@@ -407,6 +407,11 @@ class APFloatBase {
/// Returns the size of the floating point number (in bits) in the given
/// semantics.
LLVM_ABI static unsigned getSizeInBits(const fltSemantics &Sem);

/// Returns true if the given string is a valid arbitrary floating-point
/// format interpretation for llvm.convert.to.arbitrary.fp and
/// llvm.convert.from.arbitrary.fp intrinsics.
LLVM_ABI static bool isValidArbitraryFPFormat(StringRef Format);
};

namespace detail {
16 changes: 16 additions & 0 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -1133,6 +1133,22 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrNoCreateUndefOrPoison] in
def int_fptrunc_round : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
[ llvm_anyfloat_ty, llvm_metadata_ty ]>;

// Convert from native LLVM floating-point to arbitrary FP format
// Returns an integer containing the arbitrary FP bits
def int_convert_to_arbitrary_fp
    : DefaultAttrsIntrinsic<
          [ llvm_anyint_ty ],
          [ llvm_anyfloat_ty, llvm_metadata_ty, llvm_metadata_ty, llvm_i1_ty ],
          [ IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<3>> ]>;

// Convert from arbitrary FP format to native LLVM floating-point
// Takes an integer containing the arbitrary FP bits
def int_convert_from_arbitrary_fp
    : DefaultAttrsIntrinsic<
          [ llvm_anyfloat_ty ],
          [ llvm_anyint_ty, llvm_metadata_ty ],
          [ IntrNoMem, IntrSpeculatable ]>;

def int_canonicalize : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>],
[IntrNoMem]>;
// Arithmetic fence intrinsic.
76 changes: 76 additions & 0 deletions llvm/lib/IR/Verifier.cpp
@@ -79,6 +79,7 @@
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/EHPersonalities.h"
#include "llvm/IR/FPEnv.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GCStrategy.h"
#include "llvm/IR/GlobalAlias.h"
@@ -5972,6 +5973,81 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call) {
"unsupported rounding mode argument", Call);
break;
}
case Intrinsic::convert_to_arbitrary_fp: {
  // Check that vector element counts are consistent.
  Type *ValueTy = Call.getArgOperand(0)->getType();
  Type *IntTy = Call.getType();

  if (auto *ValueVecTy = dyn_cast<VectorType>(ValueTy)) {
    auto *IntVecTy = dyn_cast<VectorType>(IntTy);
    Check(IntVecTy,
          "if floating-point operand is a vector, integer operand must also "
          "be a vector",
          Call);
    Check(ValueVecTy->getElementCount() == IntVecTy->getElementCount(),
          "floating-point and integer vector operands must have the same "
          "element count",
          Call);
  }

  // Check interpretation metadata (argoperand 1).
  auto *InterpMAV = dyn_cast<MetadataAsValue>(Call.getArgOperand(1));
  Check(InterpMAV, "missing interpretation metadata operand", Call);
  auto *InterpStr = dyn_cast<MDString>(InterpMAV->getMetadata());
  Check(InterpStr, "interpretation metadata operand must be a string", Call);
  StringRef Interp = InterpStr->getString();

  Check(!Interp.empty(), "interpretation metadata string must not be empty",
        Call);

  // Valid interpretation strings: mini-float format names.
  Check(APFloatBase::isValidArbitraryFPFormat(Interp),
        "unsupported interpretation metadata string", Call);

  // Check rounding mode metadata (argoperand 2).
  auto *RoundingMAV = dyn_cast<MetadataAsValue>(Call.getArgOperand(2));
  Check(RoundingMAV, "missing rounding mode metadata operand", Call);
  auto *RoundingStr = dyn_cast<MDString>(RoundingMAV->getMetadata());
  Check(RoundingStr, "rounding mode metadata operand must be a string", Call);

  std::optional<RoundingMode> RM =
      convertStrToRoundingMode(RoundingStr->getString());
  Check(RM && *RM != RoundingMode::Dynamic,
        "unsupported rounding mode argument", Call);
  break;
}
case Intrinsic::convert_from_arbitrary_fp: {
  // Check that vector element counts are consistent.
  Type *IntTy = Call.getArgOperand(0)->getType();
  Type *ValueTy = Call.getType();

  if (auto *ValueVecTy = dyn_cast<VectorType>(ValueTy)) {
    auto *IntVecTy = dyn_cast<VectorType>(IntTy);
    Check(IntVecTy,
          "if floating-point operand is a vector, integer operand must also "
          "be a vector",
          Call);
    Check(ValueVecTy->getElementCount() == IntVecTy->getElementCount(),
          "floating-point and integer vector operands must have the same "
          "element count",
          Call);
  }

  // Check interpretation metadata (argoperand 1).
  auto *InterpMAV = dyn_cast<MetadataAsValue>(Call.getArgOperand(1));
  Check(InterpMAV, "missing interpretation metadata operand", Call);
  auto *InterpStr = dyn_cast<MDString>(InterpMAV->getMetadata());
  Check(InterpStr, "interpretation metadata operand must be a string", Call);
  StringRef Interp = InterpStr->getString();

  Check(!Interp.empty(), "interpretation metadata string must not be empty",
        Call);

  // Valid interpretation strings: mini-float format names.
  Check(APFloatBase::isValidArbitraryFPFormat(Interp),
        "unsupported interpretation metadata string", Call);
  break;
}
#define BEGIN_REGISTER_VP_INTRINSIC(VPID, ...) case Intrinsic::VPID:
#include "llvm/IR/VPIntrinsics.def"
#undef BEGIN_REGISTER_VP_INTRINSIC
8 changes: 8 additions & 0 deletions llvm/lib/Support/APFloat.cpp
@@ -6155,6 +6155,14 @@ float APFloat::convertToFloat() const {
return Temp.getIEEE().convertToFloat();
}

bool APFloatBase::isValidArbitraryFPFormat(StringRef Format) {
  static constexpr StringLiteral ValidFormats[] = {
      "Float8E5M2",     "Float8E5M2FNUZ",    "Float8E4M3", "Float8E4M3FN",
      "Float8E4M3FNUZ", "Float8E4M3B11FNUZ", "Float8E3M4", "Float8E8M0FNU",
      "Float6E3M2FN",   "Float6E2M3FN",      "Float4E2M1FN"};
  return llvm::is_contained(ValidFormats, Format);
}

APFloat::Storage::~Storage() {
if (usesLayout<IEEEFloat>(*semantics)) {
IEEE.~IEEEFloat();