-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-118761: Reduce import time of gettext.py by delaying re import #128898
Conversation
44b48bd
to
2a4bca6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 51 still runs import re
! I can't add a suggestion to remove it.
The only other comment I had is maybe you could preserve the string pattern in a module global and move the compilation to the function, to reduce the diff size?
A
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
tfw you forget to |
I wonder if that really helps? The diff size isn't bad right now (IMHO), and having an effectively useless module global as a side effect of the pull request process looks a tiny bit odd to me. If you really want it, I could do it though. |
2a4bca6
to
16cb697
Compare
Out of curiosity, how are the |
It should be fairly good -- this not only avoids a re import, it avoids compiling a regex during import. Unfortunately I can't answer the question... because I just tried testing it and hyperfine reports that "statistical outliers were detected" and wildly different timings across repeated hyperfine runs (including, some hyperfine runs for this PR that were much slower than some runs without this PR). Closing most programs other than my system's Window Manager didn't help reduce the bias. I'm not really experienced with running benchmarks, so maybe there's a trick I'm simply not aware of. If someone else wants to run their own timings I'll happily accept them at face value and include them in the news file though. |
The easiest way is to: $ ./python -X importtime -c 'import gettext' and report the numbers only for $ hyperfine --warmup 8 "./python -c 'import gettext'" You should compare the numbers against |
What I mean is that the timings vary between saying the old code is faster and the new code is faster. But here's an optimistic run I suppose. Maybe we can just assume the most favorable results are true?
Results produced by reasoning that the changes are only to a single .py file, and copying the file twice so I can try importing either the old or the new one alternately. No need to checkout different branches. |
I'm getting similar results as @eli-schwartz (on a release build) # Old
$ hyperfine --warmup 8 "./python -c 'import gettext'"
Benchmark 1: ./python -c 'import gettext'
Time (mean ± σ): 23.7 ms ± 4.5 ms [User: 17.9 ms, System: 5.8 ms]
Range (min … max): 18.1 ms … 31.9 ms 100 runs # New
$ hyperfine --warmup 8 "./python -c 'import gettext'"
Benchmark 1: ./python -c 'import gettext'
Time (mean ± σ): 17.5 ms ± 4.7 ms [User: 12.8 ms, System: 4.6 ms]
Range (min … max): 12.1 ms … 29.5 ms 129 runs |
Can we have the |
Here you go ;)
|
|
That's a definite improvement indeed. It will also help other modules such as |
gettext is often imported in programs that may not end up translating anything. In fact, the `struct` module already has a delayed import when parsing GNUTranslations to speed up the no .mo files case. The re module is also used in the same situation, but behind a function chain only called by GNUTranslations. cache the compiled regex globally the first time it is used. The finditer function can be converted to a method call on the compiled object (it always could) which is slightly more efficient and necessary for the conditional re import.
16cb697
to
fd471b8
Compare
Yup, that was my original motivation in fact. :)
Done. |
``gettext`` is often imported in programs that may not end up translating anything. In fact, the ``struct`` module already has a delayed import when parsing ``GNUTranslations`` to speed up the no ``.mo`` files case. The re module is also used in the same situation, but behind a function chain only called by ``GNUTranslations``. Cache the compiled regex globally the first time it is used. The finditer function is converted to a method call on the compiled object which is slightly more efficient, and necessary for the delayed re import.
gettext is often imported in programs that may not end up translating anything. In fact, the
struct
module already has a delayed import when parsing GNUTranslations to speed up the no .mo files case. The re module is also used in the same situation, but behind a function chain only called by GNUTranslations.cache the compiled regex globally the first time it is used. The finditer function can be converted to a method call on the compiled object (it always could) which is slightly more efficient and necessary for the conditional re import.