Make derivative 5-8 times faster #3322

paulftw · 2024-11-19T19:25:56Z

derivative seems to spend most of its time inside _toString, because every constNodes[foo] lookup converts the node key to a string.
I'm not sure Set is really needed as a separate type, but because _derivative is a typed function I had to introduce it.
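
For illustration, here is a minimal sketch of why an object-keyed lookup table pays for a string conversion on every access while a Set does not. FakeNode, lookupsWithObject and lookupsWithSet are hypothetical names invented for this example; they are not part of the mathjs code base.

```javascript
// Hypothetical stand-in for an expression node; not the real mathjs Node class.
class FakeNode {
  constructor (text) {
    this.text = text
    this.toStringCalls = 0
  }

  toString () {
    this.toStringCalls++ // count every string conversion
    return this.text
  }
}

// Old approach: a plain object used as a lookup table.
function lookupsWithObject (node, times) {
  const constNodes = {}
  constNodes[node] = true // the key is coerced to a string, so toString() runs
  for (let i = 0; i < times; i++) {
    if (!constNodes[node]) throw new Error('unreachable') // coerced again on every read
  }
  return node.toStringCalls
}

// New approach: a Set keyed by object identity.
function lookupsWithSet (node, times) {
  const constNodes = new Set()
  constNodes.add(node) // stored by reference, no string conversion
  for (let i = 0; i < times; i++) {
    if (!constNodes.has(node)) throw new Error('unreachable')
  }
  return node.toStringCalls
}

console.log(lookupsWithObject(new FakeNode('2 * x'), 100)) // 101
console.log(lookupsWithSet(new FakeNode('2 * x'), 100)) // 0
```

In a real expression tree the string conversion walks the whole subtree, so the cost per lookup grows with the size of the node.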

Benchmark before the change (2017 Core i5 MacBook):

ddf x 23.77 ops/sec ±0.54% (43 runs sampled)
df x 79.58 ops/sec ±0.57% (69 runs sampled)
ddf x 23.62 ops/sec ±0.54% (43 runs sampled)
df x 78.86 ops/sec ±0.50% (74 runs sampled)
ddf x 23.73 ops/sec ±0.47% (43 runs sampled)
df x 79.73 ops/sec ±0.57% (69 runs sampled)
ddf x 23.85 ops/sec ±0.64% (44 runs sampled)
df x 80.17 ops/sec ±0.43% (69 runs sampled)

Benchmark with this PR (same old macbook):

ddf x 1,781 ops/sec ±14.63% (80 runs sampled)
df x 5,161 ops/sec ±30.53% (62 runs sampled)
ddf x 1,769 ops/sec ±17.68% (79 runs sampled)
df x 5,286 ops/sec ±30.00% (72 runs sampled)
ddf x 1,910 ops/sec ±4.58% (85 runs sampled)
df x 4,275 ops/sec ±41.16% (58 runs sampled)
ddf x 1,959 ops/sec ±4.91% (88 runs sampled)
df x 4,474 ops/sec ±37.39% (62 runs sampled)

josdejong · 2024-11-20T09:54:26Z

Wow, this is a huge performance improvement! That is a smart idea, thanks Paul.

Your PR looks good to go. I have only one thought: the old solution with Object would deduplicate nodes that have the same string representation, whereas the new solution with Set doesn't deduplicate, since it keeps nodes by reference. Can that somehow lead to issues or different behavior? (I can't come up with anything myself, just checking.)

paulftw · 2024-11-21T00:43:17Z

@josdejong Good question! When _derivative checks a node against constNodes, it relies on the fact that someone (usually plainDerivative) has already checked that node and, if necessary, marked it const. With Object, adding a node to constNodes also "adds" all current and future instances with the same string representation. With Set, only the original node is added.

Things could behave differently if _derivative (1) constructs a constant node and (2) recursively passes it back to itself without calling constTag first.

The old code's result would depend on whether the constructed node appears anywhere in the input as a sub-expression. The new code will always take the derivative of it.
I'm not sure if that's bad or even possible. I assumed that the tests would pick it up if it really matters, but maybe the test suite isn't that robust. WDYT?
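
To make the difference concrete, here is a small sketch of the two lookups applied to a node that is rebuilt after tagging. makeNode, isConstOld and isConstNew are hypothetical helpers written for this example only; the real _derivative code is not reproduced here.

```javascript
// Hypothetical node stand-in with a string representation.
function makeNode (text) {
  return { text, toString () { return this.text } }
}

const tagged = makeNode('2') // node from the input, already marked constant
const rebuilt = makeNode('2') // equal-looking node constructed later, never tagged

const constObject = {}
constObject[tagged] = true // keyed by the string '2'
const constSet = new Set([tagged]) // keyed by object identity

// Old lookup: matches any node with the same string representation.
function isConstOld (node) {
  return constObject[node] === true
}

// New lookup: matches only the exact object that was tagged.
function isConstNew (node) {
  return constSet.has(node)
}

console.log(isConstOld(rebuilt)) // true
console.log(isConstNew(rebuilt)) // false
```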

paulftw · 2024-11-21T00:55:14Z

A bulletproof way would be to add memoization to constTag and never read constNodes directly, i.e. make it an internal cache.
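
A rough sketch of that idea, assuming a simplified node shape ({ type: 'constant' }, { type: 'symbol', name } and { type: 'op', args }) that is made up for this example and is not the mathjs node API. createIsConst is a hypothetical name; the point is only that the cache lives inside the closure and callers can never read it directly.

```javascript
// Returns a memoized predicate for one variable name.
// The cache is private to the closure, so it cannot get out of sync with callers.
function createIsConst (varName) {
  const cache = new Map() // node object -> boolean
  let computeCount = 0 // only here to show the effect of caching

  function isConst (node) {
    if (cache.has(node)) return cache.get(node)
    computeCount++
    let result
    if (node.type === 'constant') {
      result = true
    } else if (node.type === 'symbol') {
      result = node.name !== varName
    } else {
      result = node.args.every(arg => isConst(arg)) // const only if all children are
    }
    cache.set(node, result)
    return result
  }

  isConst.computeCount = () => computeCount
  return isConst
}

// Expression 2 * y + x, differentiated with respect to x.
const x = { type: 'symbol', name: 'x' }
const y = { type: 'symbol', name: 'y' }
const two = { type: 'constant', value: 2 }
const twoY = { type: 'op', op: '*', args: [two, y] }
const expr = { type: 'op', op: '+', args: [twoY, x] }

const isConst = createIsConst('x')
console.log(isConst(expr)) // false
console.log(isConst(twoY)) // true, answered from the cache
console.log(isConst.computeCount()) // 5
```

Because the predicate computes the answer on demand, a node constructed in the middle of the recursion gets a correct result even if nobody tagged it beforehand.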

paulftw · 2024-11-21T17:40:21Z

@josdejong PTAL
The idea of a potential bug didn't sit too well with me, so I've refactored constTag to be a pure (and cached) function. This allowed me to undo the Set & isSet boilerplate.
Surprisingly enough, it changed the test output for nthRoot((6x), (2x)). It's the same expression, but an extra zero is now discarded. That seems like a good thing, but it'd be nice for someone knowledgeable to double-check.

Will post a new benchmark shortly, can't promise it will be as good as the previous one though.

paulftw · 2024-11-21T18:45:22Z

I've updated the benchmark, and that made the measurement error smaller (or whatever that ±30.53% meant). I think it's because GC fires less often / finishes faster.

Adding isConstCached made everything somewhat slower (on one specific, randomly chosen expression). It's still faster than the original. The code is sort of more readable and reliable (to me personally, in large part because I wrote it).

Let me know if you have any suggestions for improving this PR.

Numbers for the new benchmark:

Set()

ddf x 2,083 ops/sec ±0.84% (189 runs sampled)
df  x 6,720 ops/sec ±0.52% (191 runs sampled)
ddf x 2,042 ops/sec ±1.73% (188 runs sampled)
df  x 6,684 ops/sec ±0.48% (191 runs sampled)


isConstCached

ddf x 1,709 ops/sec ±1.19% (188 runs sampled)
df  x 5,589 ops/sec ±0.48% (189 runs sampled)
ddf x 1,668 ops/sec ±0.68% (186 runs sampled)
df  x 5,405 ops/sec ±0.53% (189 runs sampled)

Paul K and others added 2 commits November 19, 2024 21:16

Make derivative 5-8 times faster

0feda84

Merge branch 'develop' into develop

8731683

Refactor: replace constNodes with a memoized function

2eef5fc

Paul K added 2 commits November 21, 2024 20:23

Reduce memory pressure in benchmark/derivative

56d9b73

eslint

fc3bd6c
