From 65c0fcb6b9cc54e3b05e997c85d0c16bb36ea59a Mon Sep 17 00:00:00 2001
From: Momchil Velikov <momchil.velikov@arm.com>
Date: Fri, 13 Sep 2024 10:47:25 +0100
Subject: [PATCH] [fixup] Describe FP8 types and add alpha markers

Change-Id: Idd90f08498c2e8b207218bf97e66747e6b61037a
---
 aapcs64/aapcs64.rst | 179 +++++++++++++++++++++++---------------------
 1 file changed, 95 insertions(+), 84 deletions(-)

diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst
index fb0f1d4..ec6e288 100644
--- a/aapcs64/aapcs64.rst
+++ b/aapcs64/aapcs64.rst
@@ -892,6 +892,8 @@ thread-local storage on platforms where multi-threaded code is
 supported.  The exact location of such information is platform
 specific.
 
+**(Alpha)**
+
 The FPMR is a system register that controls behaviors of the FP8 instructions.
 It is a temporary register.
 
@@ -2573,6 +2575,9 @@ The mapping of C arithmetic types to Fundamental Data Types is shown in `Table 3
   |                                |                                         | significant bits of the type in a big-endian view.  Non-significant    |
   |                                |                                         | bits within the last quad-word are unspecified.                        |
   +--------------------------------+-----------------------------------------+------------------------------------------------------------------------+
+  |  **(Alpha)** ``__mfp8``        | unsigned byte                           | Arm extension. Values are interpreted as either E5M2 or E4M3,          |
+  |                                |                                         | depending on processor mode.                                           |
+  +--------------------------------+-----------------------------------------+------------------------------------------------------------------------+
 
 A platform ABI may specify a different combination of primitive variants but we discourage this.
 
@@ -2968,61 +2973,65 @@ The header file ``arm_neon.h`` also defines a number of intrinsic functions that
 
 .. table:: Table 7: Short vector extended types
 
-  +-----------------+-------------------+--------------------------+-----------+
-  | Internal type   | arm\_neon.h type  | Base Type                | Elements  |
-  +=================+===================+==========================+===========+
-  | __Int8x8\_t     | int8x8\_t         | signed byte              | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int16x4\_t    | int16x4\_t        | signed half-word         | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int32x2\_t    | int32x2\_t        | signed word              | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint8x8\_t    | uint8x8\_t        | unsigned byte            | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint16x4\_t   | uint16x4\_t       | unsigned half-word       | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint32x2\_t   | uint32x2\_t       | unsigned word            | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Float16x4\_t  | float16x4\_t      | half-precision float     | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Float32x2\_t  | float32x2\_t      | single-precision float   | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Poly8x8\_t    | poly8x8\_t        | unsigned byte            | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Poly16x4\_t   | poly16x4\_t       | unsigned half-word       | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int8x16\_t    | int8x16\_t        | signed byte              | 16        |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int16x8\_t    | int16x8\_t        | signed half-word         | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int32x4\_t    | int32x4\_t        | signed word              | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Int64x2\_t    | int64x2\_t        | signed double-word       | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint8x16\_t   | uint8x16\_t       | unsigned byte            | 16        |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint16x8\_t   | uint16x8\_t       | unsigned half-word       | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint32x4\_t   | uint32x4\_t       | unsigned word            | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Uint64x2\_t   | uint64x2\_t       | unsigned double-word     | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Float16x8\_t  | float16x8\_t      | half-precision float     | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Float32x4\_t  | float32x4\_t      | single-precision float   | 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Float64x2\_t  | float64x2\_t      | double-precision float   | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Poly8x16\_t   | poly8x16\_t       | unsigned byte            | 16        |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Poly16x8\_t   | poly16x8\_t       | unsigned half-word       | 8         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Poly64x2\_t   | poly64x2\_t       | unsigned double-word     | 2         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Bfloat16x4\_t | bfloat16x4\_t     | half-precison Brain float| 4         |
-  +-----------------+-------------------+--------------------------+-----------+
-  | __Bfloat16x8\_t | bfloat16x8\_t     | half-precison Brain float| 8         |
-  +-----------------+-------------------+--------------------------+-----------+
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | Internal type               | arm\_neon.h type  | Base Type                | Elements  |
+  +=============================+===================+==========================+===========+
+  | __Int8x8\_t                 | int8x8\_t         | signed byte              | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int16x4\_t                | int16x4\_t        | signed half-word         | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int32x2\_t                | int32x2\_t        | signed word              | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint8x8\_t                | uint8x8\_t        | unsigned byte            | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint16x4\_t               | uint16x4\_t       | unsigned half-word       | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint32x2\_t               | uint32x2\_t       | unsigned word            | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float16x4\_t              | float16x4\_t      | half-precision float     | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float32x2\_t              | float32x2\_t      | single-precision float   | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly8x8\_t                | poly8x8\_t        | unsigned byte            | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly16x4\_t               | poly16x4\_t       | unsigned half-word       | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int8x16\_t                | int8x16\_t        | signed byte              | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int16x8\_t                | int16x8\_t        | signed half-word         | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int32x4\_t                | int32x4\_t        | signed word              | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int64x2\_t                | int64x2\_t        | signed double-word       | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint8x16\_t               | uint8x16\_t       | unsigned byte            | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint16x8\_t               | uint16x8\_t       | unsigned half-word       | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint32x4\_t               | uint32x4\_t       | unsigned word            | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint64x2\_t               | uint64x2\_t       | unsigned double-word     | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float16x8\_t              | float16x8\_t      | half-precision float     | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float32x4\_t              | float32x4\_t      | single-precision float   | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float64x2\_t              | float64x2\_t      | double-precision float   | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly8x16\_t               | poly8x16\_t       | unsigned byte            | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly16x8\_t               | poly16x8\_t       | unsigned half-word       | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly64x2\_t               | poly64x2\_t       | unsigned double-word     | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Bfloat16x4\_t             | bfloat16x4\_t     | half-precison Brain float| 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Bfloat16x8\_t             | bfloat16x8\_t     | half-precison Brain float| 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | **(Alpha)** __Mfloat8x8\_t  | mfloat8x8\_t      | modal 8-bit float        | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | **(Alpha)** __Mfloat8x16\_t | mfloat8x16\_t     | modal 8-bit float        | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
 
 APPENDIX Support for Scalable vectors
 =====================================
@@ -3057,35 +3066,37 @@ document.
 
 .. table:: Table 8: Scalable Vector Types and Scalable Predicate Types
 
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | Internal type       | ``arm_sve.h`` type    | Base type                                 | Elements       |
-  +=====================+=======================+===========================================+================+
-  | ``__SVInt8_t``      | ``svint8_t``          | signed byte                               | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint8_t``     | ``svuint8_t``         | unsigned byte                             | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt16_t``     | ``svint16_t``         | signed half-word                          | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint16_t``    | ``svuint16_t``        | unsigned half-word                        | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat16_t``   | ``svfloat16_t``       | half-precision float                      | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVBfloat16_t``  | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt32_t``     | ``svint32_t``         | signed word                               | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint32_t``    | ``svuint32_t``        | unsigned word                             | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat32_t``   | ``svfloat32_t``       | single-precision float                    | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt64_t``     | ``svint64_t``         | signed double-word                        | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint64_t``    | ``svuint64_t``        | unsigned double-word                      | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat64_t``   | ``svfloat64_t``       | double-precision float                    | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVBool_t``      | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | Internal type                  | ``arm_sve.h`` type    | Base type                                 | Elements       |
+  +================================+=======================+===========================================+================+
+  | ``__SVInt8_t``                 | ``svint8_t``          | signed byte                               | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint8_t``                | ``svuint8_t``         | unsigned byte                             | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt16_t``                | ``svint16_t``         | signed half-word                          | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint16_t``               | ``svuint16_t``        | unsigned half-word                        | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat16_t``              | ``svfloat16_t``       | half-precision float                      | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVBfloat16_t``             | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt32_t``                | ``svint32_t``         | signed word                               | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint32_t``               | ``svuint32_t``        | unsigned word                             | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat32_t``              | ``svfloat32_t``       | single-precision float                    | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt64_t``                | ``svint64_t``         | signed double-word                        | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint64_t``               | ``svuint64_t``        | unsigned double-word                      | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat64_t``              | ``svfloat64_t``       | double-precision float                    | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVBool_t``                 | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | **(Alpha)** ``__SVMfloat8_t``  | ``svmfloat8_t``       | modal 8-bit float                         | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
 
 
 APPENDIX C++ mangling