Add SIMD functionality to the transform submodule (Attempt 2) #2421
It looks like I can't seem to share the body of Anyway, I feel like we should avoid sharing anything that is AVX2 dependent in the |
Could a separate file, in which we compile the has_avx2 function with the AVX2 compile flag, solve the issue?
I'm trying that now, putting the 'simd_shared.c' file in both submodules (transform and surface).
So the problem with this approach of sharing a .c file across two submodules is that if one is linking while the other wants to start linking you get this:
I have no idea if there is any way round this other than going back to having the definition duplicated in a separate file for each submodule. Unless anyone has any other ideas? EDIT: Macros won't work either, as both of these functions are already full of #defines.
The main things I've learned:
Anyway, that's how you end up with the code as it is in this PR. I think I discovered most of this the first time I worked on this, but that was a while ago and the process wasn't well reasoned out or documented. I'm open to any suggestions to reduce some of this (fairly minor) code duplication, but it seems like the compiler really wants it to stay duplicated.
Thanks for being so incremental with these PRs Myre, it makes it nice and easy to review.
Went through this and it all seems reasonable. I ran the test suite locally and put in some printfs to make sure SIMD was still being dispatched as before. All looks good.
          !defined(SDL_DISABLE_IMMINTRIN_H) */
    }
    return 0;
}
If I understand correctly, these functions are the duplicated code?
pg_has_avx2
pg_avx2_at_runtime_but_uncompiled
Might it help to define a simd_avx2_shared.h and put all the AVX2-related header stuff in there instead of in simd_shared.h? Or would you have the same problems? (I'm no expert in header includes.)
That's what I suggested. Myre tried to do that, and the 32-bit builds failed for unknown reasons.
Well, he tried to put the functions in their own *.c files. If you suggested what I suggested, then it was not clear to me.
My suggestion is to put all the AVX-related stuff into its own *.h file. Not sure what benefit would be gained, and it's merged already.
Thanks for the PR. I'm no expert in header includes, but reading your changes they seem to be reasonable.
It wouldn't work, as you need the function declarations in non-AVX2 code files, and you can't build the function body with AVX2 support and include it in non-AVX2 code files.
If that makes sense.
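To illustrate the point about declarations versus bodies, here is a minimal single-file sketch. The `example_` function names, the file names in the comments, and the `fake_runtime_avx2` stand-in (used in place of `SDL_HasAVX2()` so the sketch builds without SDL) are all assumptions for illustration, not the code in this PR.

```c
#include <assert.h>

/* ---- Part 1: would live in a shared header (hypothetical simd_shared.h).
 * Declarations only, so it is safe to include from files that are
 * built WITHOUT the AVX2 compile flag. */
int
example_has_avx2(void);
int
example_avx2_at_runtime_but_uncompiled(void);

/* ---- Part 2: would live in a .c file built WITH the AVX2 flag
 * (hypothetical simd_transform_avx2.c). The bodies depend on __AVX2__,
 * so they cannot simply be pasted into non-AVX2 translation units. */

/* Hypothetical stand-in for SDL_HasAVX2(): 1 = CPU reports AVX2. */
static int fake_runtime_avx2 = 0;

static int
runtime_has_avx2(void)
{
    assert(fake_runtime_avx2 == 0 || fake_runtime_avx2 == 1);
    return fake_runtime_avx2;
}

/* 1 only when AVX2 is both compiled in and reported at runtime. */
int
example_has_avx2(void)
{
#if defined(__AVX2__)
    return runtime_has_avx2();
#else
    return 0;
#endif
}

/* 1 when the CPU reports AVX2 but this file was built without it. */
int
example_avx2_at_runtime_but_uncompiled(void)
{
    if (runtime_has_avx2()) {
#if defined(__AVX2__)
        return 0;
#else
        return 1;
#endif
    }
    return 0;
}
```

With this split, callers in non-AVX2 files only ever see the declarations; which branch of each body is compiled depends solely on the flags used for the one file that holds the definitions.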
On Fri, 8 Sept 2023, 18:17 dr0id wrote:
In src_c/simd_transform_avx2.c:
> +/* This returns 1 when avx2 is available at runtime but support for it isn't
+ * compiled in, 0 in all other cases */
+int
+pg_avx2_at_runtime_but_uncompiled()
+{
+ if (SDL_HasAVX2()) {
+#if defined(__AVX2__) && defined(HAVE_IMMINTRIN_H) && \
+ !defined(SDL_DISABLE_IMMINTRIN_H)
+ return 0;
+#else
+ return 1;
+#endif /* defined(__AVX2__) && defined(HAVE_IMMINTRIN_H) && \
+ !defined(SDL_DISABLE_IMMINTRIN_H) */
+ }
+ return 0;
+}
Hopefully this will be a cleaner PR than the mess of the last one.
The basic idea is to make it fairly easy to add SIMD versions of transform functions to the transform submodule the same way we've been adding them to the blitters in the surface submodule.
The original messy PR is here: #2042
I've quickly tried to address some of the issues highlighted in the reviews of that PR here, so we shall see if it still passes CI.
I have a follow-up PR to this that builds off it by adding a SIMD version of the current greyscale transform. I'll create that again if we can get this one working.
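As a rough sketch of the basic idea described above (a SIMD version of a transform function sitting next to a scalar fallback, chosen at runtime), something like the following could work. Everything here is hypothetical: the `invert_*` functions are an invented example transform rather than anything in pygame, and `fake_runtime_avx2` stands in for a real runtime check such as `SDL_HasAVX2()` so the sketch builds without SDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical stand-in for a runtime CPU check such as SDL_HasAVX2(). */
static int fake_runtime_avx2 = 0;

static int
runtime_has_avx2(void)
{
    return fake_runtime_avx2;
}

/* Scalar fallback: always compiled, works on any CPU. */
static void
invert_scalar(uint8_t *px, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        px[i] = (uint8_t)(255 - px[i]);
    }
}

#if defined(__AVX2__)
/* AVX2 version: only compiled when the compiler targets AVX2. */
static void
invert_avx2(uint8_t *px, size_t n)
{
    const __m256i all_ones = _mm256_set1_epi8(-1);
    size_t i = 0;
    /* 32 bytes per iteration; x XOR 0xFF equals 255 - x for a byte. */
    for (; i + 32 <= n; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(px + i));
        _mm256_storeu_si256((__m256i *)(px + i),
                            _mm256_xor_si256(v, all_ones));
    }
    invert_scalar(px + i, n - i); /* leftover bytes */
}
#endif

/* Returns 1 if the AVX2 path ran, 0 if the scalar path ran. */
int
invert_dispatch(uint8_t *px, size_t n)
{
    assert(px != NULL);
    int use_avx2 = runtime_has_avx2();
#if defined(__AVX2__)
    if (use_avx2) {
        invert_avx2(px, n);
        return 1;
    }
#else
    (void)use_avx2; /* AVX2 path not compiled in */
#endif
    invert_scalar(px, n);
    return 0;
}
```

The return value of `invert_dispatch` plays the same role as the printfs mentioned in the review: it lets a caller or test confirm which path was actually taken.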