Optimized all Rect/FRect methods via pgRect_FromObject #2908
Optimization king 👑
Nothing here looks wonky to me, so I'm approving it
src_c/rect_impl.h
Outdated
@@ -611,16 +614,16 @@ static RectObject
 int RectOptional_Freelist_Num = -1;
 #endif

-static InnerRect *
+static PG_FORCEINLINE InnerRect *
How much of a performance effect does force-inlining have here?
I'm skeptical of this change because it's a large-ish function that's used in a lot of places. We should check whether the performance improvement justifies the increase in code size in this case.
Removing the inlining costs performance (5-30%, generally about 10%), but I'd be more interested in knowing how much inlining costs here in terms of code size. The function gets inlined only inside rect_impl.h; outside of it we use function pointers, which inherently prevent the function body from being inlined.
I got these results:
- without inlining: 12759704 bytes
- with inlining: 12773283 bytes
So inlining increases code size by about 0.1%, a factor of 1.00106. I believe this is a fair price for up to a 30% performance improvement.
After a long time banging my head into a wall and needing special support from Ankith, I have my own comparison.
It seems that on my computer the new buildconfig chooses MinGW, and for some reason this PR doesn't change the size at all (not a single byte) when building with MinGW.
Now that I have the special flags to actually build pygame-ce with the correct compiler (pip install . -Csetup-args="--vsenv"), I can see the difference.
This brings rect.cp39-win_amd64.pyd from 70,656 bytes to 108,544 bytes. This is an increase of 37,888 bytes, or about 54%.
In fact, I got these results (just for rect.pyd this time, Python 3.12):
- NO INLINE: 72,192 bytes
- FORCE INLINE: 111,616 bytes
- INLINE: 72,192 bytes
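The size figures quoted in this thread are easier to compare once they are put on the same footing. The following is a minimal sketch of that arithmetic: the byte counts are the ones reported in the comments above, while the helper name relative_increase and the labels are made up for illustration (the thread does not say which artifact the first pair of numbers refers to).

```python
def relative_increase(before: int, after: int) -> float:
    """Return the growth from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Byte counts as reported in the comments above; the labels are illustrative only.
measurements = {
    "first measurement (no inline -> inline)": (12_759_704, 12_773_283),
    "rect.cp39-win_amd64.pyd (--vsenv build)": (70_656, 108_544),
    "rect.pyd, Python 3.12 (no inline -> force inline)": (72_192, 111_616),
    "rect.pyd, Python 3.12 (no inline -> inline)": (72_192, 72_192),
}

for label, (before, after) in measurements.items():
    delta = after - before
    pct = relative_increase(before, after)
    print(f"{label}: +{delta:,} bytes ({pct:.2f}%)")
```

Seen this way, the plain inline hint leaves rect.pyd unchanged in these measurements, while forcing the inline grows it by more than half.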
…mmunity#2908)
* Optimized RectExport_RectFromObject
* swapped PG_FORCEINLINE for PG_INLINE
Co-authored-by: Dan Lawrence
This PR speeds up virtually all Rect/FRect functions, but most importantly it fixes the severe performance degradation when mixing calls between Rects and FRects, such as calling Rect.colliderect(FRect) and the like.
This is achieved through the following modifications to the pgRect_FromObject function:
- … (PyObject_IsInstance)
- pgRect_FromObject is now PG_INLINE
I've seen variable benefits, but they're generally quite good:
Results
[Results; OLD and NEW data used to construct them; test program used to generate the data]
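The original test program is not included here. As a stand-in, here is a minimal sketch of how the same-type and mixed-type calls mentioned in the description could be timed. It is not the author's script: the helper time_call, the rectangle values, and the iteration counts are assumptions chosen for illustration, and the script skips the benchmark if pygame-ce (with FRect) is not available.

```python
import timeit


def time_call(func, repeat=5, number=100_000):
    """Return the best per-call time in seconds over `repeat` timing runs."""
    best = min(timeit.repeat(func, repeat=repeat, number=number))
    return best / number


def main():
    try:
        import pygame  # pygame-ce exposes both Rect and FRect
    except ImportError:
        print("pygame-ce is not installed; skipping the benchmark")
        return
    if not hasattr(pygame, "FRect"):
        print("this pygame build has no FRect; skipping the benchmark")
        return

    # Arbitrary overlapping rectangles (illustrative values).
    r = pygame.Rect(0, 0, 10, 10)
    fr = pygame.FRect(5, 5, 10, 10)

    # Same-type calls next to the mixed-type calls the PR description highlights.
    cases = {
        "Rect.colliderect(Rect)": lambda: r.colliderect(r),
        "FRect.colliderect(FRect)": lambda: fr.colliderect(fr),
        "Rect.colliderect(FRect)": lambda: r.colliderect(fr),
        "FRect.colliderect(Rect)": lambda: fr.colliderect(r),
    }
    for name, func in cases.items():
        print(f"{name}: {time_call(func) * 1e9:.1f} ns per call")


if __name__ == "__main__":
    main()
```

Running the script once on a build before the change and once on a build after it gives per-call timings that can be compared directly. Taking the minimum over several repeats reduces noise from other processes.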