workaround for a VC++ bug in VS 17.14 #24878
We experienced the same problem with The differences are:
Based on our investigation there is no premature loop termination. All elements are processed, but their processing is split into two blocks:
The problem is that the register holding the destination address is not updated after the first block. Hence the remaining elements overwrite the first ones in the output, and the last elements of the result are left default-initialized (zero).
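The failure mode described in this comment can be modeled in plain C++. This is a hypothetical sketch, not the code MSVC generates: the function name `SubtractMeanStaleDst` and the `first_block` parameter are illustrative assumptions, and the 16/8 split mentioned below is inferred from the 24-element example in the PR description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the miscompiled loop (illustration only; this is
// not the code MSVC actually emits). Elements are processed in two blocks,
// but the destination pointer is not advanced after the first block.
std::vector<float> SubtractMeanStaleDst(const std::vector<float>& x,
                                        float mean,
                                        std::size_t first_block) {
  assert(first_block <= x.size());
  std::vector<float> out(x.size());  // value-initialized to 0.0f
  const float* src = x.data();
  float* dst = out.data();

  // Block 1: the first `first_block` elements are written correctly.
  for (std::size_t i = 0; i < first_block; ++i) {
    dst[i] = src[i] - mean;
  }
  src += first_block;
  // Modeled bug: `dst += first_block;` is missing here.

  // Block 2: the remaining elements land at the start of the output.
  const std::size_t rest = x.size() - first_block;
  for (std::size_t i = 0; i < rest; ++i) {
    dst[i] = src[i] - mean;
  }
  return out;
}
```

With 24 inputs and `first_block` set to 16, the first 8 outputs hold the results for inputs 16 to 23, outputs 8 to 15 are correct, and the last 8 outputs stay zero. That matches the pattern reported in the PR description, where the first value corresponds to X[16] - mean and the trailing 8 values are 0.0f.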
Another problem is that The simplest solution is to replace Another solution is to upgrade the protobuf version used from outdated
@snnn, @a-akoval, I had a similar issue in my project and reported it here:
Symptom
The following unit tests failed when building ONNX Runtime with Visual Studio 17.14 in Release or RelWithDebInfo configuration:
- SparseTensorConversionTests.TestDenseToSparseConversion
- MeanVarianceNormalizationTest.AllAxes
- MVNContribOpTest.MeanVarianceNormalizationCPUTest_Version1_TO_8
This PR provides a workaround for the two MVN tests.
Minimal Reproducible Example
Observed Behavior (with gsl::span):
When compiling the MeanStdev_gsl function (which uses gsl::span) in Release mode, the std::transform operation does not process all elements of the input span.
For the provided example X (24 elements), the output is:
Only the first 16 elements of the input vector X appear to be processed by std::transform. The remaining 8 elements of the diff vector retain their default-initialized value (0.0f). The value of diff[0] is also incorrect if it is based on X[0]: the first value, 4.125, would correspond to X[16] - mean if the output were shifted, but that seems less likely than premature termination. More precisely, the 16th value computed (-5.875) correctly corresponds to X[15] - mean.
The correct output (achieved when using std::span) would be:
Detailed Analysis of the Issue:
Vector Sizing: The std::vector<float> diff inside the MeanStdev_gsl function is correctly sized based on v.size(). If v is a span of 24 floats, diff is allocated to hold 24 floats. This indicates that v.size() is correctly read at this point.
Premature Loop Termination: The std::transform operation, when gsl::span is used, appears to terminate its loop prematurely. In the example with 24 elements, only 16 elements are processed.
Assembly Code Discrepancy:
gsl::span version (Incorrect): The provided assembly shows that the compiler generates a scalar loop (using movss, subss) for std::transform. The loop control logic involves:
std::span version (Correct): The assembly for the std::span version typically shows the compiler successfully auto-vectorizing the loop (using movups, subps), and it processes all elements correctly.
Suspected Cause: The issue seems to be a subtle code generation bug in the MSVC optimizer when handling std::transform with gsl::span (a non-standard library type, though widely used) in conjunction with std::bind. The compiler correctly determines v.size() for memory allocation but appears to use an effectively different (and incorrect) size for controlling the iteration of the std::transform loop. This leads to processing fewer elements than expected. I believe the problem can be reproduced without using GSL, though I haven't found an example.
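For reference, the std::transform plus std::bind pattern discussed above can be sketched without any GSL dependency. This is not the original MeanStdev_gsl code: the name `MeanStdevSketch`, the `MeanStdevResult` struct, the pointer-plus-count signature, and the population standard deviation are all assumptions made to keep the example self-contained. A correctly compiled build should fill every element of `diff`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch (not from the PR).
struct MeanStdevResult {
  float mean;
  float stdev;
  std::vector<float> diff;  // element-wise (x[i] - mean)
};

// Hypothetical stand-in for the function under discussion. It takes a
// pointer and a count instead of gsl::span so it builds with the standard
// library alone.
MeanStdevResult MeanStdevSketch(const float* first, std::size_t count) {
  assert(first != nullptr && count > 0);
  const float* last = first + count;

  const float sum = std::accumulate(first, last, 0.0f);
  const float mean = sum / static_cast<float>(count);

  // diff is sized from the same count that bounds the transform below.
  std::vector<float> diff(count);
  std::transform(first, last, diff.begin(),
                 std::bind(std::minus<float>(), std::placeholders::_1, mean));

  // Population standard deviation (an assumption of this sketch).
  const float sq_sum =
      std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0f);
  const float stdev = std::sqrt(sq_sum / static_cast<float>(count));

  return {mean, stdev, std::move(diff)};
}
```

Calling `MeanStdevSketch(x.data(), x.size())` on a 24-element vector and checking that the last 8 entries of `diff` are populated gives a quick check for the symptom described in this PR.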
asm1.txt
asm2.txt
Resolution
This PR changes the MeanStdev function to not use gsl::span, as a workaround for the original issue. After this change, the two MVN-related tests pass.