Fix up more SSE implementations for nontrapping-fp #22931
system/include/compat/xmmintrin.h
-  if (x != 0 || fabsf(((__f32x4)__a)[0]) < 2.f)
+  float e = ((__f32x4)__a)[0];
+  int x = lrint(e);
+  if ((x != 0 || fabsf(e)) < 2.f && !isnan(e) && e <= (float)INT_MAX && e >= INT_MIN)
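A note on grouping in the new condition as quoted: the parenthesis closes after `fabsf(e)`, so the result of the `||` (0 or 1) is what gets compared with `2.f`, and that comparison is always true. The replaced line compared only `fabsf(...)` with `2.f`. The sketch below contrasts the two groupings; the function names are hypothetical, and which grouping was intended is an assumption.

```c
#include <math.h>
#include <stdbool.h>

/* Grouping as quoted in the new line: (0 or 1) < 2.f, which always holds. */
static bool grouped_as_quoted(int x, float e) {
  return (x != 0 || fabsf(e)) < 2.f;
}

/* Grouping as in the replaced line: only fabsf(e) is compared with 2.f. */
static bool grouped_as_before(int x, float e) {
  return x != 0 || fabsf(e) < 2.f;
}
```

For `x == 0` and `e == 5.0f`, the first form is true and the second is false, so the two are not interchangeable.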
Can these comparisons be made against `x` to avoid the cast here? Is the cast needed?
Without the explicit cast, there's a warning about the implicit conversion changing the value of INT_MAX.
That said, the other implementations in this file have similar implicit conversions that produce the same warning, so maybe we should be consistent.
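To make the warning concrete: a float has a 24-bit significand, so INT_MAX (2^31 - 1 for a 32-bit int) is not representable and the conversion rounds up to 2^31. A minimal sketch, assuming a 32-bit `int` and IEEE 754 single precision with the default rounding mode; the helper names are hypothetical and not part of xmmintrin.h.

```c
#include <limits.h>
#include <stdbool.h>

/* The explicit cast silences the implicit-conversion warning,
   but the value still changes: 2147483647 becomes 2147483648.0f. */
static float int_max_as_float(void) {
  return (float)INT_MAX;
}

/* True when the converted bound is exactly 2^31, one past INT_MAX. */
static bool int_max_rounds_up(void) {
  return int_max_as_float() == 2147483648.0f;
}
```

One consequence is that `e <= (float)INT_MAX` also accepts `e == 2147483648.0f`, which is one past the largest `int`.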
And we can't compare against `x` (at least as it's written here) because an int is always <= INT_MAX.
I found that the comparison style (e.g. whether the NaN check came first or further down) was inconsistent, so I made all the checks the same. If we want to optimize this further, we should look into using the nontrapping instructions directly, but it's not clear whether that's important.
Can you add the test you are fixing to `test-core3` in the CircleCI config?
Fixes lto2.test_sse1 and test_sse2 with checks similar to #22911 and #22893