Hi, I did a short profiling session of Dokka in `kotlinx.coroutines`.
The methodology was pretty straightforward: warm up the Gradle daemon, run `./gradlew cleanDokkaHtmlMultiModule cleanDokkaHtmlPartial dokkaHtmlMultiModule`, and profile it with async-profiler in CPU and alloc modes. The corresponding flame graphs are attached.
The root cause is quite straightforward: for text-based HTML blocks, Dokka invokes `parseWithNormalisedSpaces`. That is harmless in itself, but under the hood it invokes the Jsoup parser unconditionally, which is not only slow but also allocates a 64K temporary buffer for each text element.
PR #2730 invokes Jsoup conditionally: it optimistically looks for `&` in the text first and applies Jsoup only when necessary (it's probably worth doing that processing manually anyway, but that's beyond the scope of my change).
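To illustrate the idea, here is a minimal Java sketch of the fast path. It is not the code from the PR: the class `EntityFastPath`, the method `decodeIfNeeded`, and the `slowDecoder` parameter are hypothetical names, and `slowDecoder` merely stands in for the Jsoup-based parsing.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not Dokka's actual code.
final class EntityFastPath {
    private EntityFastPath() {}

    /**
     * Returns the text unchanged when it contains no '&' (so it cannot
     * contain an HTML entity); otherwise delegates to the expensive decoder.
     * slowDecoder is an assumed stand-in for the Jsoup-based parsing.
     */
    static String decodeIfNeeded(String text, UnaryOperator<String> slowDecoder) {
        if (text.indexOf('&') < 0) {
            return text; // fast path: no parser call, no temporary buffer
        }
        return slowDecoder.apply(text); // slow path: only for text that may hold entities
    }
}
```

With this shape, most plain-text elements never reach the parser, so the per-element buffer is allocated only for text that actually contains `&`.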
It's quite hard to measure the impact of the change on coroutines because there are a lot of other tasks happening (ktor is probably a better candidate to test against; GC there takes 90% of the CPU time).
My numbers are the following:
- No outstanding `char[]` allocations in the profile (new hotspots are identified)
- 10-15% less GC in the profile (35% -> ~20-25% of all execution time)
- In `no-daemon` mode, it saves ~15% of CPU time and reduces peak CPU consumption by ~200% (on a 16-core OS X machine)
The load average dropped from about 6 to 5.
It should probably be tested on a less performant machine or with a restricted number of cores (OS X has no way to do that, but Linux does via taskset).