refactor(profiling): change how GenInfo::is_running works #15508
Bootstrap import analysis
Comparison of import times between this PR and base.

Summary
The average import time from this PR is: 249 ± 3 ms.
The average import time from base is: 254 ± 4 ms.
The import time difference between this PR and base is: -5.0 ± 0.2 ms.

Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate kowalski/refactor-profiling-change-how-geninfo-is_running-works (8357206) with baseline main (a5d1649)

📈 Performance Regressions (3 suites)

📈 iastaspects - 118/118
✅ add_aspect
  Time: ✅ 0.400µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.3%
  Memory: ✅ 40.187MB (SLO: <41.500MB -3.2%) vs baseline: +4.6%
✅ add_inplace_aspect
  Time: ✅ 0.407µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.4%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.3%
✅ add_inplace_noaspect
  Time: ✅ 0.321µs (SLO: <10.000µs 📉 -96.8%) vs baseline: +1.1%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +4.8%
✅ add_noaspect
  Time: ✅ 0.280µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.7%
  Memory: ✅ 40.088MB (SLO: <41.500MB -3.4%) vs baseline: +4.4%
✅ bytearray_aspect
  Time: ✅ 1.349µs (SLO: <10.000µs 📉 -86.5%) vs baseline: -0.6%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.9%
✅ bytearray_extend_aspect
  Time: ✅ 1.492µs (SLO: <10.000µs 📉 -85.1%) vs baseline: -0.6%
  Memory: ✅ 40.088MB (SLO: <41.500MB -3.4%) vs baseline: +4.0%
✅ bytearray_extend_noaspect
  Time: ✅ 0.607µs (SLO: <10.000µs 📉 -93.9%) vs baseline: -1.1%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +5.0%
✅ bytearray_noaspect
  Time: ✅ 0.480µs (SLO: <10.000µs 📉 -95.2%) vs baseline: +0.7%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.9%
✅ bytes_aspect
  Time: ✅ 1.295µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +0.7%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.8%
✅ bytes_noaspect
  Time: ✅ 0.493µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.3%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.4%
✅ bytesio_aspect
  Time: ✅ 1.356µs (SLO: <10.000µs 📉 -86.4%) vs baseline: +2.2%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +4.7%
✅ bytesio_noaspect
  Time: ✅ 0.498µs (SLO: <10.000µs 📉 -95.0%) vs baseline: -0.8%
  Memory: ✅ 40.305MB (SLO: <41.500MB -2.9%) vs baseline: +4.8%
✅ capitalize_aspect
  Time: ✅ 0.738µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +0.8%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.5%
✅ capitalize_noaspect
  Time: ✅ 0.434µs (SLO: <10.000µs 📉 -95.7%) vs baseline: -0.2%
  Memory: ✅ 40.187MB (SLO: <41.500MB -3.2%) vs baseline: +5.0%
✅ casefold_aspect
  Time: ✅ 0.744µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +1.2%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +5.1%
✅ casefold_noaspect
  Time: ✅ 0.374µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +1.8%
  Memory: ✅ 40.344MB (SLO: <41.500MB -2.8%) vs baseline: +5.2%
✅ decode_aspect
  Time: ✅ 0.727µs (SLO: <10.000µs 📉 -92.7%) vs baseline: +0.7%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.9%
✅ decode_noaspect
  Time: ✅ 0.420µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.6%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +5.1%
✅ encode_aspect
  Time: ✅ 0.709µs (SLO: <10.000µs 📉 -92.9%) vs baseline: -0.2%
  Memory: ✅ 40.187MB (SLO: <41.500MB -3.2%) vs baseline: +4.3%
✅ encode_noaspect
  Time: ✅ 0.401µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.3%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ format_aspect
  Time: ✅ 3.429µs (SLO: <10.000µs 📉 -65.7%) vs baseline: -0.5%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ format_map_aspect
  Time: ✅ 3.574µs (SLO: <10.000µs 📉 -64.3%) vs baseline: -0.6%
  Memory: ✅ 40.403MB (SLO: <41.500MB -2.6%) vs baseline: +5.3%
✅ format_map_noaspect
  Time: ✅ 0.783µs (SLO: <10.000µs 📉 -92.2%) vs baseline: +1.2%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ format_noaspect
  Time: ✅ 0.601µs (SLO: <10.000µs 📉 -94.0%) vs baseline: +0.9%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ index_aspect
  Time: ✅ 0.352µs (SLO: <10.000µs 📉 -96.5%) vs baseline: -3.0%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +4.9%
✅ index_noaspect
  Time: ✅ 0.275µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -2.2%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +4.7%
✅ join_aspect
  Time: ✅ 1.344µs (SLO: <10.000µs 📉 -86.6%) vs baseline: +0.7%
  Memory: ✅ 40.108MB (SLO: <41.500MB -3.4%) vs baseline: +4.7%
✅ join_noaspect
  Time: ✅ 0.492µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.1%
  Memory: ✅ 40.049MB (SLO: <41.500MB -3.5%) vs baseline: +4.1%
✅ ljust_aspect
  Time: ✅ 2.597µs (SLO: <20.000µs 📉 -87.0%) vs baseline: +1.0%
  Memory: ✅ 40.187MB (SLO: <41.500MB -3.2%) vs baseline: +4.4%
✅ ljust_noaspect
  Time: ✅ 0.404µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.6%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ lower_aspect
  Time: ✅ 2.301µs (SLO: <10.000µs 📉 -77.0%) vs baseline: +2.7%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +4.8%
✅ lower_noaspect
  Time: ✅ 0.364µs (SLO: <10.000µs 📉 -96.4%) vs baseline: -0.5%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ lstrip_aspect
  Time: ✅ 2.265µs (SLO: <20.000µs 📉 -88.7%) vs baseline: +0.5%
  Memory: ✅ 40.403MB (SLO: <41.500MB -2.6%) vs baseline: +5.1%
✅ lstrip_noaspect
  Time: ✅ 0.382µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.3%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ modulo_aspect
  Time: ✅ 1.043µs (SLO: <10.000µs 📉 -89.6%) vs baseline: +3.5%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +5.0%
✅ modulo_aspect_for_bytearray_bytearray
  Time: ✅ 1.548µs (SLO: <10.000µs 📉 -84.5%) vs baseline: -0.4%
  Memory: ✅ 40.324MB (SLO: <41.500MB -2.8%) vs baseline: +5.0%
✅ modulo_aspect_for_bytes
  Time: ✅ 0.986µs (SLO: <10.000µs 📉 -90.1%) vs baseline: +1.4%
  Memory: ✅ 40.344MB (SLO: <41.500MB -2.8%) vs baseline: +4.9%
✅ modulo_aspect_for_bytes_bytearray
  Time: ✅ 1.234µs (SLO: <10.000µs 📉 -87.7%) vs baseline: +1.4%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ modulo_noaspect
  Time: ✅ 0.630µs (SLO: <10.000µs 📉 -93.7%) vs baseline: +0.9%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +5.0%
✅ replace_aspect
  Time: ✅ 4.892µs (SLO: <10.000µs 📉 -51.1%) vs baseline: -0.8%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.3%
✅ replace_noaspect
  Time: ✅ 0.462µs (SLO: <10.000µs 📉 -95.4%) vs baseline: ~same
  Memory: ✅ 40.128MB (SLO: <41.500MB -3.3%) vs baseline: +4.2%
✅ repr_aspect
  Time: ✅ 0.906µs (SLO: <10.000µs 📉 -90.9%) vs baseline: -0.7%
  Memory: ✅ 40.088MB (SLO: <41.500MB -3.4%) vs baseline: +4.6%
✅ repr_noaspect
  Time: ✅ 0.416µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.2%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +4.9%
✅ rstrip_aspect
  Time: ✅ 1.932µs (SLO: <20.000µs 📉 -90.3%) vs baseline: +0.5%
  Memory: ✅ 40.324MB (SLO: <41.500MB -2.8%) vs baseline: +5.3%
✅ rstrip_noaspect
  Time: ✅ 0.378µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.8%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +4.9%
✅ slice_aspect
  Time: ✅ 0.491µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.4%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.7%
✅ slice_noaspect
  Time: ✅ 0.447µs (SLO: <10.000µs 📉 -95.5%) vs baseline: -0.4%
  Memory: ✅ 40.423MB (SLO: <41.500MB -2.6%) vs baseline: +5.4%
✅ stringio_aspect
  Time: ✅ 1.531µs (SLO: <10.000µs 📉 -84.7%) vs baseline: +0.9%
  Memory: ✅ 40.305MB (SLO: <41.500MB -2.9%) vs baseline: +4.8%
✅ stringio_noaspect
  Time: ✅ 0.717µs (SLO: <10.000µs 📉 -92.8%) vs baseline: +0.6%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ strip_aspect
  Time: ✅ 2.224µs (SLO: <20.000µs 📉 -88.9%) vs baseline: ~same
  Memory: ✅ 40.305MB (SLO: <41.500MB -2.9%) vs baseline: +4.8%
✅ strip_noaspect
  Time: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +0.7%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ swapcase_aspect
  Time: ✅ 2.782µs (SLO: <10.000µs 📉 -72.2%) vs baseline: 📈 +13.4%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +4.3%
✅ swapcase_noaspect
  Time: ✅ 0.547µs (SLO: <10.000µs 📉 -94.5%) vs baseline: +1.5%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.7%
✅ title_aspect
  Time: ✅ 2.460µs (SLO: <10.000µs 📉 -75.4%) vs baseline: +2.8%
  Memory: ✅ 40.128MB (SLO: <41.500MB -3.3%) vs baseline: +4.8%
✅ title_noaspect
  Time: ✅ 0.503µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.4%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +5.3%
✅ translate_aspect
  Time: ✅ 3.312µs (SLO: <10.000µs 📉 -66.9%) vs baseline: +0.7%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.7%
✅ translate_noaspect
  Time: ✅ 1.042µs (SLO: <10.000µs 📉 -89.6%) vs baseline: +0.2%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.8%
✅ upper_aspect
  Time: ✅ 2.319µs (SLO: <10.000µs 📉 -76.8%) vs baseline: +3.3%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.4%
✅ upper_noaspect
  Time: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.3%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.4%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect
  Time: ✅ 5.232µs (SLO: <10.000µs 📉 -47.7%) vs baseline: 📈 +28.0%
  Memory: ✅ 40.226MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.7%
✅ ospathbasename_noaspect
  Time: ✅ 1.086µs (SLO: <10.000µs 📉 -89.1%) vs baseline: +0.4%
  Memory: ✅ 40.246MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +4.9%
✅ ospathjoin_aspect
  Time: ✅ 6.156µs (SLO: <10.000µs 📉 -38.4%) vs baseline: -0.4%
  Memory: ✅ 40.187MB (SLO: <41.000MB 🟡 -2.0%) vs baseline: +4.9%
✅ ospathjoin_noaspect
  Time: ✅ 2.300µs (SLO: <10.000µs 📉 -77.0%) vs baseline: +0.7%
  Memory: ✅ 40.167MB (SLO: <41.000MB -2.0%) vs baseline: +4.3%
✅ ospathnormcase_aspect
  Time: ✅ 3.442µs (SLO: <10.000µs 📉 -65.6%) vs baseline: +0.1%
  Memory: ✅ 40.226MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.7%
✅ ospathnormcase_noaspect
  Time: ✅ 0.574µs (SLO: <10.000µs 📉 -94.3%) vs baseline: +1.3%
  Memory: ✅ 40.167MB (SLO: <41.000MB -2.0%) vs baseline: +4.9%
✅ ospathsplit_aspect
  Time: ✅ 4.737µs (SLO: <10.000µs 📉 -52.6%) vs baseline: ~same
  Memory: ✅ 40.167MB (SLO: <41.000MB -2.0%) vs baseline: +4.3%
✅ ospathsplit_noaspect
  Time: ✅ 1.595µs (SLO: <10.000µs 📉 -84.0%) vs baseline: +0.2%
  Memory: ✅ 40.167MB (SLO: <41.000MB -2.0%) vs baseline: +5.0%
✅ ospathsplitdrive_aspect
  Time: ✅ 3.672µs (SLO: <10.000µs 📉 -63.3%) vs baseline: +0.1%
  Memory: ✅ 40.147MB (SLO: <41.000MB -2.1%) vs baseline: +4.8%
✅ ospathsplitdrive_noaspect
  Time: ✅ 0.700µs (SLO: <10.000µs 📉 -93.0%) vs baseline: -0.2%
  Memory: ✅ 40.187MB (SLO: <41.000MB 🟡 -2.0%) vs baseline: +4.3%
✅ ospathsplitext_aspect
  Time: ✅ 4.508µs (SLO: <10.000µs 📉 -54.9%) vs baseline: ~same
  Memory: ✅ 40.167MB (SLO: <41.000MB -2.0%) vs baseline: +4.6%
✅ ospathsplitext_noaspect
  Time: ✅ 1.381µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.1%
  Memory: ✅ 40.364MB (SLO: <41.000MB 🟡 -1.6%) vs baseline: +5.0%

📈 telemetryaddmetric - 30/30
✅ 1-count-metric-1-times
  Time: ✅ 3.413µs (SLO: <20.000µs 📉 -82.9%) vs baseline: 📈 +16.2%
  Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +4.9%
✅ 1-count-metrics-100-times
  Time: ✅ 204.191µs (SLO: <220.000µs -7.2%) vs baseline: +1.1%
  Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.7%
✅ 1-distribution-metric-1-times
  Time: ✅ 3.364µs (SLO: <20.000µs 📉 -83.2%) vs baseline: +1.8%
  Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.7%
✅ 1-distribution-metrics-100-times
  Time: ✅ 221.707µs (SLO: <230.000µs -3.6%) vs baseline: +1.2%
  Memory: ✅ 34.878MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +4.2%
✅ 1-gauge-metric-1-times
  Time: ✅ 2.190µs (SLO: <20.000µs 📉 -89.0%) vs baseline: ~same
  Memory: ✅ 34.839MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +5.1%
✅ 1-gauge-metrics-100-times
  Time: ✅ 138.224µs (SLO: <150.000µs -7.9%) vs baseline: ~same
  Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.7%
✅ 1-rate-metric-1-times
  Time: ✅ 3.118µs (SLO: <20.000µs 📉 -84.4%) vs baseline: +1.2%
  Memory: ✅ 35.016MB (SLO: <35.500MB 🟡 -1.4%) vs baseline: +5.6%
✅ 1-rate-metrics-100-times
  Time: ✅ 217.744µs (SLO: <250.000µs 📉 -12.9%) vs baseline: +0.5%
  Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +5.0%
✅ 100-count-metrics-100-times
  Time: ✅ 20.389ms (SLO: <22.000ms -7.3%) vs baseline: -0.4%
  Memory: ✅ 34.701MB (SLO: <35.500MB -2.2%) vs baseline: +4.7%
✅ 100-distribution-metrics-100-times
  Time: ✅ 2.286ms (SLO: <2.300ms 🟡 -0.6%) vs baseline: -1.6%
  Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.0%
✅ 100-gauge-metrics-100-times
  Time: ✅ 1.431ms (SLO: <1.550ms -7.6%) vs baseline: +0.8%
  Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +5.0%
✅ 100-rate-metrics-100-times
  Time: ✅ 2.237ms (SLO: <2.550ms 📉 -12.3%) vs baseline: +1.1%
  Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +5.2%
✅ flush-1-metric
  Time: ✅ 4.622µs (SLO: <20.000µs 📉 -76.9%) vs baseline: -0.2%
  Memory: ✅ 35.075MB (SLO: <35.500MB 🟡 -1.2%) vs baseline: +4.7%
✅ flush-100-metrics
  Time: ✅ 175.431µs (SLO: <250.000µs 📉 -29.8%) vs baseline: +0.2%
  Memory: ✅ 35.193MB (SLO: <35.500MB 🟡 -0.9%) vs baseline: +5.2%
✅ flush-1000-metrics
  Time: ✅ 2.186ms (SLO: <2.500ms 📉 -12.5%) vs baseline: +0.2%
  Memory: ✅ 35.881MB (SLO: <36.500MB 🟡 -1.7%) vs baseline: +4.9%

🟡 Near SLO Breach (15 suites)
🟡 coreapiscenario - 10/10 (1 unstable)
Description
This change replicates P403n1x87/echion#202.
This PR changes the way `GenInfo::is_running` works. Previously, it indicated whether the current coroutine was on CPU; now, it indicates whether the current coroutine, or the coroutine it (recursively) awaits, is on CPU.

This change also lets us do less work when checking whether the current coroutine is on CPU. Because a Coroutine / Generator / `GenInfo` can only be running if it is not awaiting another Generator, we do not need to compute `is_running` when `await != nullptr`; in that case we take `await->is_running` as the value.

It also makes it easier to check whether a Task is currently on CPU, and lets us do less work when deciding how to unwind `asyncio` Tasks (cf. the changes in `TaskInfo`, which no longer needs the `is_on_cpu` method that iterated over the `await` chain).

Note that I checked whether `GenInfo::is_running` was used in any way other than the one I describe and simplify; it is not, so I think this change is safe to make as-is.

Note: This PR makes sense on its own, but it is made in the context of simplifying P403n1x87/echion#198.

Note: This doesn't need a changelog entry as it makes no difference to the user; it's purely a refactor.
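For reviewers, here is a minimal sketch of the propagation rule described above. It is not the code from this PR: `GenInfoSketch`, the `frame_on_cpu` parameter, and `task_is_on_cpu` are hypothetical names used only for illustration, and the real `GenInfo` reads its state from the profiled process rather than taking it as a constructor argument.

```cpp
#include <memory>
#include <utility>

// Hypothetical, simplified stand-in for GenInfo (illustration only).
struct GenInfoSketch {
    // The generator this one is awaiting, or nullptr if it awaits nothing.
    std::unique_ptr<GenInfoSketch> await;
    // True if this generator, or anything it (recursively) awaits, is on CPU.
    bool is_running = false;

    // frame_on_cpu (assumed input): whether this generator's own frame is
    // currently executing. It is only consulted at the end of the await chain.
    GenInfoSketch(std::unique_ptr<GenInfoSketch> awaited, bool frame_on_cpu)
        : await(std::move(awaited)) {
        if (await != nullptr) {
            // A generator that awaits another one cannot be running itself,
            // so reuse the value already computed for the awaited generator.
            is_running = await->is_running;
        } else {
            is_running = frame_on_cpu;
        }
    }
};

// With the flag propagated at construction time, checking a task's top-level
// coroutine is a single read instead of a walk over the await chain.
inline bool task_is_on_cpu(const GenInfoSketch& coro) {
    return coro.is_running;
}
```

In this sketch, a chain of three generators whose innermost one is executing reports `is_running == true` at every level, which is the property that makes a separate chain-walking helper unnecessary.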