
Enable AVX NE CONVERT for FP16 to FP32 cast #21183

Open · wants to merge 1 commit into main
Conversation

eralmual (Contributor)

Description

Implementation of a new cast assembly kernel that uses AVX_NE_CONVERT instructions to accelerate casting from FP16 to FP32. Added CPUID checks to determine whether the ISA is supported.
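
For reference, a minimal sketch of how AVX-NE-CONVERT support can be detected at runtime (not the PR's actual MLAS plumbing): the feature is reported in CPUID leaf 7, sub-leaf 1, EDX bit 5. The `HasAvxNeConvert` name is an illustrative stand-in.

```cpp
// Sketch only: runtime detection of AVX-NE-CONVERT.
// The feature flag is CPUID.(EAX=07H, ECX=1):EDX[bit 5].
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <cpuid.h>
#endif

bool HasAvxNeConvert()   // hypothetical helper name
{
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 0);
    if (regs[0] < 7) {
        return false;                            // leaf 7 not supported
    }
    __cpuidex(regs, 7, 1);                       // leaf 7, sub-leaf 1
    return (static_cast<uint32_t>(regs[3]) & (1u << 5)) != 0;   // EDX[5]
#else
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;                            // sub-leaf not available
    }
    return (edx & (1u << 5)) != 0;               // EDX[5]
#endif
}
```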

Motivation and Context

Currently, FP16 models executed on systems that lack complete FP16 operator support fall back to single precision on every node, so the original FP16 weights have to be cast to FP32 before the model can run. This change accelerates that cast by using the new upconvert instructions, improving overall performance (a scalar reference of the conversion is sketched below for comparison).

* Enable AVX_NE_CONVERT detection via CPUID.
* Developed assembly kernel using the new ISA.
* Integrated kernel.
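
For comparison, here is a plain scalar reference of the FP16 to FP32 widening that the new kernel accelerates. This is a sketch, not the MLAS implementation, and the function name is illustrative; it handles normals, subnormals, infinities and NaNs.

```cpp
#include <cstdint>
#include <cstring>

// Scalar FP16 (IEEE binary16 bit pattern) -> FP32 conversion, for reference only.
float HalfBitsToFloat(uint16_t h)
{
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exponent = (h >> 10) & 0x1Fu;      // 5-bit biased exponent
    uint32_t mantissa = h & 0x3FFu;             // 10-bit fraction
    uint32_t bits;

    if (exponent == 0x1Fu) {
        // Infinity or NaN: widen the exponent field, keep the payload.
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exponent + (127 - 15)) << 23) | (mantissa << 13);
    } else if (mantissa == 0) {
        bits = sign;                            // signed zero
    } else {
        // Subnormal half: renormalize; every subnormal half is a normal float.
        int shift = 0;
        while ((mantissa & 0x400u) == 0) {      // shift until the implicit bit appears
            mantissa <<= 1;
            ++shift;
        }
        bits = sign | ((127 - 15 + 1 - shift) << 23) | ((mantissa & 0x3FFu) << 13);
    }

    float f;
    std::memcpy(&f, &bits, sizeof(f));          // bit-cast to float
    return f;
}
```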
@eralmual requested a review from a team as a code owner on June 26, 2024 at 18:46
@tianleiwu (Contributor)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@tianleiwu (Contributor)

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

@tianleiwu (Contributor)

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

Azure Pipelines successfully started running 3 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).


@@ -1037,6 +1037,14 @@ MlasConvertHalfToFloatBuffer(
size_t Count
);

extern "C" void
MLASCALL
MlasConvertHalfToFloatBufferAVX2(
Member
As we discussed, could you please combine the MlasConvertHalfToFloatBufferAVX2 and MlasConvertHalfToFloatBuffer functions?
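
A hedged sketch of what that combination could look like: a single MlasConvertHalfToFloatBuffer entry point that selects the accelerated path at runtime. HasAvxNeConvert() and HalfBitsToFloat() are the illustrative helpers sketched earlier, and the parameter types and dispatch mechanism here are assumptions rather than the actual MLAS code.

```cpp
// Sketch only: assumes the AVX kernel keeps the existing buffer-kernel signature
// and that a one-time runtime feature check selects between the two paths.
void
MLASCALL
MlasConvertHalfToFloatBuffer(
    const unsigned short* Source,    // raw FP16 bit patterns (assumed type)
    float* Destination,
    size_t Count
    )
{
    // Resolved once; initialization of local statics is thread-safe since C++11.
    static const bool UseAvxNeConvert = HasAvxNeConvert();   // hypothetical helper

    if (UseAvxNeConvert) {
        // Accelerated path backed by the new AVX-NE-CONVERT assembly kernel.
        MlasConvertHalfToFloatBufferAVX2(Source, Destination, Count);
        return;
    }

    // Portable scalar fallback (see the reference conversion sketched above).
    for (size_t i = 0; i < Count; ++i) {
        Destination[i] = HalfBitsToFloat(Source[i]);
    }
}
```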

@@ -0,0 +1,148 @@
;++
Member
Could you please also add a version for Linux?

@yufenglee (Member)

I think the build failure in the QNN CI pipeline is because it uses MSVC 14.36, which doesn't support the vcvtneeph2ps instruction yet. The other Windows CI pipelines use 14.40.

@snnn, any idea why the QNN CI pipeline doesn't use the same MSVC version?

3 participants