Optimized all Rect/FRect methods via pgRect_FromObject #2908

Merged
Changes from 2 commits
2 changes: 2 additions & 0 deletions src_c/rect.c
@@ -138,6 +138,7 @@ four_floats_from_obj(PyObject *obj, float *val1, float *val2, float *val3,
 #define RectImport_primitiveType int
 #define RectImport_RectCheck pgRect_Check
 #define RectImport_OtherRectCheck pgFRect_Check
+#define RectImport_OtherRectCheckExact pgFRect_CheckExact
 #define RectImport_RectCheckExact pgRect_CheckExact
 #define RectImport_innerRectStruct SDL_Rect
 #define RectImport_otherInnerRectStruct SDL_FRect
@@ -253,6 +254,7 @@ four_floats_from_obj(PyObject *obj, float *val1, float *val2, float *val3,
 #define RectImport_primitiveType float
 #define RectImport_RectCheck pgFRect_Check
 #define RectImport_OtherRectCheck pgRect_Check
+#define RectImport_OtherRectCheckExact pgRect_CheckExact
 #define RectImport_RectCheckExact pgFRect_CheckExact
 #define RectImport_innerRectStruct SDL_FRect
 #define RectImport_otherInnerRectStruct SDL_Rect
30 changes: 24 additions & 6 deletions src_c/rect_impl.h
@@ -320,6 +320,9 @@
 #ifndef RectImport_RectCheckExact
 #error RectImport_RectCheckExact needs to be Defined
 #endif
+#ifndef RectImport_OtherRectCheckExact
+#error RectImport_OtherRectCheckExact needs to be Defined
+#endif
 #ifndef RectImport_primitiveType
 #error RectImport_primitiveType needs to be defined
 #endif
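
The `RectImport_*` defines in `rect.c` and the `#error` guards here suggest that `rect_impl.h` is used as a template: the parameters are set once per rect type, and the header refuses to compile if one is missing. A minimal single-file sketch of that idea follows. The `DemoImport_*` and `DEMO_TEMPLATE_BODY` names are hypothetical, and a body macro stands in for re-including a header:

```c
#include <assert.h>

/* Hypothetical stand-in for the template header's body. A macro's
 * replacement list is rescanned where it is expanded, so the
 * DemoImport_* names resolve to whatever is defined at that point. */
#define DEMO_TEMPLATE_BODY                                          \
    static DemoImport_primitiveType DemoImport_funcName(            \
        DemoImport_primitiveType w, DemoImport_primitiveType h)     \
    {                                                               \
        assert(w >= 0 && h >= 0);                                   \
        return w * h;                                               \
    }

/* First instantiation: integer version. */
#define DemoImport_primitiveType int
#define DemoImport_funcName demo_int_area
#ifndef DemoImport_primitiveType
#error DemoImport_primitiveType needs to be defined
#endif
DEMO_TEMPLATE_BODY
/* Undefine the parameters so the body can be instantiated again. */
#undef DemoImport_primitiveType
#undef DemoImport_funcName

/* Second instantiation: float version of the same body. */
#define DemoImport_primitiveType float
#define DemoImport_funcName demo_float_area
#ifndef DemoImport_primitiveType
#error DemoImport_primitiveType needs to be defined
#endif
DEMO_TEMPLATE_BODY
#undef DemoImport_primitiveType
#undef DemoImport_funcName
```

Forgetting one of the defines before `DEMO_TEMPLATE_BODY` would stop the build at the `#error` line instead of producing a confusing error deeper in the body.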
@@ -424,7 +427,7 @@ RectExport_do_rects_intresect(InnerRect *A, InnerRect *B)

 #define _pg_do_rects_intersect RectExport_do_rects_intresect

-static InnerRect *
+static PG_FORCEINLINE InnerRect *
 RectExport_RectFromObject(PyObject *obj, InnerRect *temp);
 static InnerRect *
 RectExport_RectFromFastcallArgs(PyObject *const *args, Py_ssize_t nargs,
@@ -611,16 +614,16 @@ static RectObject
 int RectOptional_Freelist_Num = -1;
 #endif

-static InnerRect *
+static PG_FORCEINLINE InnerRect *
Member

How much of a performance effect does force-inlining have here?

I'm skeptical of this change because it's a large-ish function that's used in a lot of places. We should check whether the performance improvement justifies the increase in code size in this case.

Member Author

@itzpr3d4t0r itzpr3d4t0r Jun 28, 2024

Removing the inlining costs 5-30% in performance (generally about 10%), but I'd be more interested in knowing how much inlining costs here in terms of code size. The function is only inlined inside rect_impl.h; outside of it we call it through function pointers, which inherently don't inline the function body.
[image: benchmark comparison]
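
The point about function pointers can be illustrated with a small sketch. All names here are hypothetical, and whether a particular call is actually inlined depends on the compiler and optimization level: a direct call to a `static inline` function exposes its body at the call site, while a call through a pointer usually does not unless the compiler can prove the target.

```c
#include <assert.h>
#include <stddef.h>

/* Small helper that is a reasonable inlining candidate. */
static inline int demo_clamp_byte(int v)
{
    if (v < 0) {
        return 0;
    }
    if (v > 255) {
        return 255;
    }
    return v;
}

typedef int (*demo_unary_fn)(int);

/* Sample input used by both variants. */
static const int demo_vals[4] = {-5, 10, 300, 42};

/* Direct call: the callee's body is visible here, so the compiler is
 * free to inline it into the loop. */
static int demo_sum_direct(const int *vals, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += demo_clamp_byte(vals[i]);
    }
    return total;
}

/* Indirect call: the target is only known at run time in general, so
 * the call typically stays a real call. */
static int demo_sum_indirect(const int *vals, size_t n, demo_unary_fn fn)
{
    assert(fn != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += fn(vals[i]);
    }
    return total;
}
```

Both variants compute the same result; only the code the compiler can generate for the call differs.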

Member Author

@itzpr3d4t0r itzpr3d4t0r Jun 28, 2024

I got these results:

  • without inlining 12759704 bytes
  • with inlining 12773283 bytes

So inlining increases code size by about 0.1% (a factor of 1.00106). I believe this is a fair price for up to a 30% performance improvement.

Member

After a long time banging my head into a wall and needing special support from Ankith, I have my own comparison.

It seems like on my computer the new buildconfig chooses mingw, and for some reason this PR doesn't change the size at all (not a single byte) when using mingw.

So now that I have the special flags to actually build pygame-ce with the correct compiler (pip install . -Csetup-args="--vsenv"), I can see the difference.

This brings rect.cp39-win_amd64.pyd from 70,656 bytes to 108,544 bytes. This is an increase of 37,888 bytes, or roughly 54%.

Member Author

OK, so I've swapped PG_FORCEINLINE for a plain inline and checked the code size. It stayed very close to 70 KB, so almost the same as before, and the performance loss was mitigated a bit:
[image: benchmark comparison]
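
The diff does not show how `PG_FORCEINLINE` is defined. As a sketch of how such a macro is commonly written (the `DEMO_FORCEINLINE` name and the fallback branch are assumptions, not the project's definition), a plain `inline` leaves the decision to the compiler, while the forced variants override its size heuristics:

```c
#include <assert.h>

/* Hypothetical force-inline macro, selected per compiler. */
#if defined(_MSC_VER)
#define DEMO_FORCEINLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define DEMO_FORCEINLINE inline __attribute__((always_inline))
#else
#define DEMO_FORCEINLINE inline /* fallback: only a hint */
#endif

/* The compiler is told to inline this at every direct call site. */
static DEMO_FORCEINLINE int demo_area_forced(int w, int h)
{
    assert(w >= 0 && h >= 0);
    return w * h;
}

/* The compiler may inline this or emit a normal call, as it sees fit. */
static inline int demo_area_hinted(int w, int h)
{
    assert(w >= 0 && h >= 0);
    return w * h;
}
```

For a large function with many call sites, forcing the copy everywhere is what would be expected to grow the binary, while the hinted version lets the compiler skip call sites where it judges the copy not worthwhile.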

Member Author

@itzpr3d4t0r itzpr3d4t0r Jun 30, 2024

In fact, I got these results (just for rect.pyd this time, Python 3.12):

  • NO INLINE: 72,192 bytes
  • FORCE INLINE: 111,616 bytes
  • INLINE: 72,192 bytes
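
The size comparisons in this thread all use the same arithmetic, relative growth in percent. A small helper (the function name is hypothetical) makes the figures easy to re-check:

```c
#include <assert.h>

/* Relative size increase from `before` to `after`, in percent. */
static double demo_percent_increase(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(after - before) / (double)before;
}
```

With the byte counts quoted above, this gives about 0.11% for 12759704 to 12773283, about 53.6% for 70,656 to 108,544, and about 54.6% for 72,192 to 111,616.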

 RectExport_RectFromObject(PyObject *obj, InnerRect *temp)
 {
     Py_ssize_t length;

-    if (RectCheck(obj)) {
+    /* fast path for exact Rect / FRect class */
+    if (RectImport_RectCheckExact(obj)) {
         return &((RectObject *)obj)->r;
     }

-    if (OtherRectCheck(obj)) {
+    if (RectImport_OtherRectCheckExact(obj)) {
         OtherInnerRect rect = ((OtherRectObject *)obj)->r;
         temp->x = (PrimitiveType)rect.x;
         temp->y = (PrimitiveType)rect.y;
@@ -629,6 +632,7 @@ RectExport_RectFromObject(PyObject *obj, InnerRect *temp)
         return temp;
     }

+    /* fast check for sequences */
     if (pgSequenceFast_Check(obj)) {
         length = PySequence_Fast_GET_SIZE(obj);
         PyObject **items = PySequence_Fast_ITEMS(obj);
@@ -721,7 +725,20 @@ RectExport_RectFromObject(PyObject *obj, InnerRect *temp)
         }
     }

-    /* Try to get the rect attribute */
+    /* path for possible subclasses (these are very slow checks) */
+    if (RectImport_RectCheck(obj)) {
+        return &((RectObject *)obj)->r;
+    }
+    if (RectImport_OtherRectCheck(obj)) {
+        OtherInnerRect rect = ((OtherRectObject *)obj)->r;
+        temp->x = (PrimitiveType)rect.x;
+        temp->y = (PrimitiveType)rect.y;
+        temp->w = (PrimitiveType)rect.w;
+        temp->h = (PrimitiveType)rect.h;
+        return temp;
+    }
+
+    /* path to get the 'rect' attribute if present */
     PyObject *rectattr;
     if (!(rectattr = PyObject_GetAttrString(obj, "rect"))) {
         PyErr_Clear();
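
The reordering in `RectExport_RectFromObject` puts a cheap exact-type comparison first and defers the subclass-aware checks, which the new comment calls very slow, until after the sequence fast path. A toy model of that ordering follows. The `Demo*` types are hypothetical stand-ins for Python type objects; the sketch models single inheritance only and does not use the CPython API:

```c
#include <assert.h>
#include <stddef.h>

/* Toy type object: just a link to its base type (NULL for a root). */
typedef struct DemoType {
    const struct DemoType *base;
} DemoType;

typedef struct {
    const DemoType *type;
    int x, y, w, h;
} DemoObject;

static const DemoType demo_rect_type = {NULL};
static const DemoType demo_subrect_type = {&demo_rect_type};
static const DemoType demo_other_type = {NULL};

/* Sample objects: an exact rect, a subclass instance, something else. */
static const DemoObject demo_exact_obj = {&demo_rect_type, 0, 0, 1, 1};
static const DemoObject demo_sub_obj = {&demo_subrect_type, 0, 0, 1, 1};
static const DemoObject demo_unrelated_obj = {&demo_other_type, 0, 0, 1, 1};

enum { DEMO_PATH_NONE = 0, DEMO_PATH_EXACT = 1, DEMO_PATH_SUBCLASS = 2 };

/* Exact check: a single pointer comparison. */
static int demo_check_exact(const DemoObject *obj)
{
    return obj->type == &demo_rect_type;
}

/* Subclass-aware check: walks the chain of base types. */
static int demo_check(const DemoObject *obj)
{
    for (const DemoType *t = obj->type; t != NULL; t = t->base) {
        if (t == &demo_rect_type) {
            return 1;
        }
    }
    return 0;
}

/* Reports which path accepts the object, cheapest check first. */
static int demo_classify(const DemoObject *obj)
{
    assert(obj != NULL);
    if (demo_check_exact(obj)) {
        return DEMO_PATH_EXACT;
    }
    /* In the real function the sequence fast path sits here. */
    if (demo_check(obj)) {
        return DEMO_PATH_SUBCLASS;
    }
    return DEMO_PATH_NONE;
}
```

Subclass instances are still accepted, only later, so the common case of plain rect objects pays for nothing more than one comparison.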
@@ -2933,6 +2950,7 @@ RectExport_iterator(RectObject *self)
 #undef RectImport_RectCheck
 #undef RectImport_OtherRectCheck
 #undef RectImport_RectCheckExact
+#undef RectImport_OtherRectCheckExact
 #undef RectImport_innerRectStruct
 #undef RectImport_otherInnerRectStruct
 #undef RectImport_innerPointStruct