-
Notifications
You must be signed in to change notification settings - Fork 52
Open
Description
3.11 and 3.12 local PGO builds with MSC v.1933 are 6%~ slower than the PGO with v.1929 (vs2019) on my pyperformance macro benchmark.
3.11rc2+ (2022-9-18 8e2bda8)
> pyperf compare_to v1929 v1933 -G
Slower (51):
- unpack_sequence: 81.7 ns +- 0.8 ns -> 103 ns +- 5 ns: 1.26x slower
- scimark_monte_carlo: 100 ms +- 1 ms -> 123 ms +- 1 ms: 1.23x slower
- nbody: 163 ms +- 2 ms -> 193 ms +- 2 ms: 1.19x slower
- sqlalchemy_imperative: 46.6 ms +- 1.3 ms -> 53.3 ms +- 8.9 ms: 1.14x slower
- scimark_sparse_mat_mult: 6.34 ms +- 0.06 ms -> 7.22 ms +- 0.09 ms: 1.14x slower
- logging_simple: 22.7 us +- 0.5 us -> 25.7 us +- 1.8 us: 1.13x slower
- scimark_lu: 157 ms +- 2 ms -> 177 ms +- 2 ms: 1.13x slower
- sympy_integrate: 36.4 ms +- 0.4 ms -> 41.0 ms +- 4.2 ms: 1.13x slower
- sympy_expand: 877 ms +- 14 ms -> 987 ms +- 65 ms: 1.13x slower
- scimark_sor: 187 ms +- 3 ms -> 211 ms +- 1 ms: 1.12x slower
- float: 138 ms +- 1 ms -> 153 ms +- 2 ms: 1.11x slower
- pickle_pure_python: 574 us +- 7 us -> 632 us +- 10 us: 1.10x slower
- go: 246 ms +- 3 ms -> 271 ms +- 3 ms: 1.10x slower
- nqueens: 162 ms +- 2 ms -> 177 ms +- 1 ms: 1.10x slower
- scimark_fft: 492 ms +- 6 ms -> 539 ms +- 6 ms: 1.10x slower
- sympy_sum: 318 ms +- 10 ms -> 348 ms +- 28 ms: 1.09x slower
- fannkuch: 615 ms +- 9 ms -> 672 ms +- 11 ms: 1.09x slower
- unpickle_pure_python: 392 us +- 4 us -> 427 us +- 8 us: 1.09x slower
- logging_silent: 177 ns +- 3 ns -> 193 ns +- 2 ns: 1.09x slower
- regex_compile: 236 ms +- 2 ms -> 256 ms +- 4 ms: 1.09x slower
- html5lib: 117 ms +- 12 ms -> 127 ms +- 12 ms: 1.08x slower
- richards: 83.9 ms +- 0.9 ms -> 90.8 ms +- 0.7 ms: 1.08x slower
- deltablue: 7.05 ms +- 0.13 ms -> 7.59 ms +- 0.11 ms: 1.08x slower
- sqlalchemy_declarative: 264 ms +- 8 ms -> 283 ms +- 14 ms: 1.07x slower
- pyflate: 807 ms +- 12 ms -> 863 ms +- 8 ms: 1.07x slower
- sympy_str: 543 ms +- 13 ms -> 578 ms +- 14 ms: 1.06x slower
- chaos: 137 ms +- 1 ms -> 146 ms +- 1 ms: 1.06x slower
- xml_etree_process: 107 ms +- 2 ms -> 114 ms +- 2 ms: 1.06x slower
- meteor_contest: 156 ms +- 3 ms -> 164 ms +- 5 ms: 1.05x slower
- django_template: 87.2 ms +- 1.6 ms -> 91.5 ms +- 1.6 ms: 1.05x slower
- crypto_pyaes: 123 ms +- 2 ms -> 129 ms +- 1 ms: 1.05x slower
- chameleon: 15.9 ms +- 0.3 ms -> 16.6 ms +- 0.3 ms: 1.05x slower
- logging_format: 24.7 us +- 0.8 us -> 25.8 us +- 0.4 us: 1.05x slower
- telco: 11.7 ms +- 0.1 ms -> 12.2 ms +- 0.1 ms: 1.04x slower
- spectral_norm: 173 ms +- 1 ms -> 180 ms +- 1 ms: 1.04x slower
- regex_effbot: 4.13 ms +- 0.04 ms -> 4.29 ms +- 0.05 ms: 1.04x slower
- raytrace: 566 ms +- 5 ms -> 589 ms +- 6 ms: 1.04x slower
- pickle: 21.0 us +- 0.2 us -> 21.8 us +- 0.6 us: 1.04x slower
- hexiom: 11.0 ms +- 0.1 ms -> 11.4 ms +- 0.1 ms: 1.04x slower
- regex_v8: 31.4 ms +- 0.2 ms -> 32.5 ms +- 0.3 ms: 1.04x slower
- 2to3: 593 ms +- 2 ms -> 612 ms +- 9 ms: 1.03x slower
- xml_etree_generate: 152 ms +- 5 ms -> 156 ms +- 5 ms: 1.03x slower
- mako: 20.5 ms +- 0.3 ms -> 21.0 ms +- 0.4 ms: 1.03x slower
- tornado_http: 345 ms +- 21 ms -> 353 ms +- 23 ms: 1.02x slower
- pickle_list: 7.40 us +- 0.11 us -> 7.56 us +- 0.05 us: 1.02x slower
- unpickle: 23.1 us +- 0.5 us -> 23.6 us +- 0.4 us: 1.02x slower
- xml_etree_iterparse: 197 ms +- 10 ms -> 200 ms +- 6 ms: 1.02x slower
- regex_dna: 255 ms +- 2 ms -> 259 ms +- 2 ms: 1.01x slower
- dulwich_log: 244 ms +- 8 ms -> 248 ms +- 3 ms: 1.01x slower
- json_dumps: 22.7 ms +- 0.3 ms -> 22.9 ms +- 0.4 ms: 1.01x slower
- pickle_dict: 46.3 us +- 0.5 us -> 46.6 us +- 0.3 us: 1.01x slower
Faster (4):
- unpickle_list: 7.94 us +- 0.33 us -> 7.05 us +- 0.09 us: 1.13x faster
- pathlib: 341 ms +- 42 ms -> 314 ms +- 37 ms: 1.09x faster
- sqlite_synth: 5.67 us +- 0.06 us -> 5.61 us +- 0.12 us: 1.01x faster
- pidigits: 269 ms +- 2 ms -> 268 ms +- 1 ms: 1.01x faster
Benchmark hidden because not significant (2): json_loads, xml_etree_parse
Geometric mean: 1.06x slower
And python.org official binaries have 8% gap between 3.11rc2(v.1933) and 3.11rc1 (v.1932). (A few percent of the slowdown is necessary from bugfix commits)
> pyperf compare_to 311rc1 311rc2 -G
Slower (52):
- unpack_sequence: 70.4 ns +- 1.6 ns -> 103 ns +- 1 ns: 1.46x slower
- nbody: 147 ms +- 2 ms -> 179 ms +- 3 ms: 1.22x slower
- sqlalchemy_imperative: 47.7 ms +- 3.1 ms -> 56.3 ms +- 12.6 ms: 1.18x slower
- hexiom: 9.84 ms +- 0.05 ms -> 11.5 ms +- 0.0 ms: 1.17x slower
- logging_silent: 172 ns +- 1 ns -> 199 ns +- 2 ns: 1.16x slower
- chameleon: 14.8 ms +- 0.3 ms -> 17.0 ms +- 0.6 ms: 1.15x slower
- pyflate: 754 ms +- 7 ms -> 869 ms +- 20 ms: 1.15x slower
- sympy_integrate: 35.4 ms +- 0.8 ms -> 40.6 ms +- 4.2 ms: 1.15x slower
- unpickle_pure_python: 380 us +- 3 us -> 434 us +- 7 us: 1.14x slower
- regex_compile: 230 ms +- 4 ms -> 261 ms +- 20 ms: 1.14x slower
- scimark_lu: 145 ms +- 1 ms -> 165 ms +- 4 ms: 1.13x slower
- scimark_sparse_mat_mult: 5.99 ms +- 0.11 ms -> 6.73 ms +- 0.15 ms: 1.12x slower
- fannkuch: 618 ms +- 10 ms -> 691 ms +- 13 ms: 1.12x slower
- scimark_sor: 192 ms +- 2 ms -> 214 ms +- 2 ms: 1.11x slower
- sqlalchemy_declarative: 262 ms +- 7 ms -> 291 ms +- 31 ms: 1.11x slower
- raytrace: 549 ms +- 3 ms -> 608 ms +- 6 ms: 1.11x slower
- mako: 19.0 ms +- 0.2 ms -> 21.1 ms +- 0.3 ms: 1.11x slower
- go: 248 ms +- 2 ms -> 274 ms +- 11 ms: 1.10x slower
- float: 140 ms +- 2 ms -> 155 ms +- 6 ms: 1.10x slower
- sympy_expand: 883 ms +- 33 ms -> 974 ms +- 79 ms: 1.10x slower
- sympy_str: 538 ms +- 13 ms -> 592 ms +- 50 ms: 1.10x slower
- scimark_monte_carlo: 104 ms +- 2 ms -> 114 ms +- 1 ms: 1.10x slower
- 2to3: 587 ms +- 2 ms -> 643 ms +- 42 ms: 1.10x slower
- richards: 83.7 ms +- 1.1 ms -> 91.3 ms +- 0.8 ms: 1.09x slower
- sympy_sum: 311 ms +- 8 ms -> 339 ms +- 27 ms: 1.09x slower
- django_template: 84.8 ms +- 2.1 ms -> 92.2 ms +- 2.5 ms: 1.09x slower
- spectral_norm: 167 ms +- 3 ms -> 181 ms +- 3 ms: 1.08x slower
- deltablue: 6.92 ms +- 0.09 ms -> 7.48 ms +- 0.15 ms: 1.08x slower
- logging_simple: 21.7 us +- 0.5 us -> 23.5 us +- 1.0 us: 1.08x slower
- scimark_fft: 468 ms +- 13 ms -> 505 ms +- 6 ms: 1.08x slower
- chaos: 135 ms +- 2 ms -> 145 ms +- 1 ms: 1.08x slower
- pickle_pure_python: 567 us +- 10 us -> 608 us +- 6 us: 1.07x slower
- crypto_pyaes: 123 ms +- 2 ms -> 131 ms +- 2 ms: 1.07x slower
- unpickle_list: 6.85 us +- 0.07 us -> 7.26 us +- 0.12 us: 1.06x slower
- html5lib: 117 ms +- 11 ms -> 123 ms +- 11 ms: 1.06x slower
- meteor_contest: 153 ms +- 3 ms -> 161 ms +- 3 ms: 1.05x slower
- nqueens: 163 ms +- 2 ms -> 171 ms +- 2 ms: 1.05x slower
- json_dumps: 21.8 ms +- 0.3 ms -> 22.9 ms +- 0.4 ms: 1.05x slower
- xml_etree_process: 107 ms +- 1 ms -> 112 ms +- 4 ms: 1.04x slower
- logging_format: 23.7 us +- 0.5 us -> 24.8 us +- 0.4 us: 1.04x slower
- telco: 12.0 ms +- 0.1 ms -> 12.5 ms +- 0.2 ms: 1.04x slower
- dulwich_log: 238 ms +- 2 ms -> 246 ms +- 10 ms: 1.04x slower
- xml_etree_generate: 145 ms +- 4 ms -> 149 ms +- 3 ms: 1.03x slower
- pickle: 21.4 us +- 0.2 us -> 22.0 us +- 0.2 us: 1.03x slower
- xml_etree_iterparse: 194 ms +- 9 ms -> 198 ms +- 8 ms: 1.02x slower
- regex_v8: 31.3 ms +- 0.1 ms -> 31.9 ms +- 0.2 ms: 1.02x slower
- tornado_http: 338 ms +- 13 ms -> 345 ms +- 4 ms: 1.02x slower
- sqlite_synth: 5.43 us +- 0.03 us -> 5.53 us +- 0.03 us: 1.02x slower
- json_loads: 39.4 us +- 0.5 us -> 40.0 us +- 0.4 us: 1.01x slower
- unpickle: 23.1 us +- 0.2 us -> 23.3 us +- 0.3 us: 1.01x slower
- regex_dna: 257 ms +- 2 ms -> 259 ms +- 6 ms: 1.01x slower
- pidigits: 267 ms +- 2 ms -> 269 ms +- 2 ms: 1.01x slower
Faster (3):
- pickle_list: 7.82 us +- 0.07 us -> 7.52 us +- 0.04 us: 1.04x faster
- regex_effbot: 4.35 ms +- 0.22 ms -> 4.22 ms +- 0.06 ms: 1.03x faster
- pickle_dict: 48.0 us +- 0.6 us -> 46.9 us +- 0.2 us: 1.02x faster
Benchmark hidden because not significant (2): pathlib, xml_etree_parse
Geometric mean: 1.08x slower
At one glance, function inlinining in _PyEval_EvalFrameDefault() looks OK. Build logs showed only the following difference:
- v1929:
8816 of 79666 inline instances were from dead/cold paths - v1933:
6591 of 77031 inline instances were from dead/cold paths
3.10 seems fine with v.1933. I'm not sure if this applies to everyone.
cc: @zooba
Metadata
Metadata
Assignees
Labels
No labels