
add more benchmarks #447

Merged
merged 7 commits into dev from more-benchmarks on Dec 21, 2017

Conversation

lahma
Collaborator

@lahma lahma commented Dec 18, 2017

This already shows that some of the array usage problems might be performance-related; I'm going to investigate whether there are any easy wins.

BenchmarkDotNet=v0.10.11, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.125)
Processor=Intel Core i7-6820HQ CPU 2.70GHz (Skylake), ProcessorCount=8
Frequency=2648437 Hz, Resolution=377.5812 ns, Timer=TSC
.NET Core SDK=2.1.2
  [Host]     : .NET Core 2.0.3 (Framework 4.6.25815.02), 64bit RyuJIT
  Job-POIOJA : .NET Core 2.0.3 (Framework 4.6.25815.02), 64bit RyuJIT

InvocationCount=4  LaunchCount=1  TargetCount=3  
UnrollFactor=4  WarmupCount=3  
| Method   | N | ReuseEngine | Mean        | Error        | StdDev      | Gen 0        | Gen 1      | Gen 2      | Allocated  |
|----------|---|-------------|-------------|--------------|-------------|--------------|------------|------------|------------|
| Jint     | 5 | False       | 11,118.2 ms | 1,832.937 ms | 103.5643 ms | 2205666.6667 | 42666.6667 | 11333.3333 | 8922.57 MB |
| Jurassic | 5 | False       | 626.2 ms    | 293.042 ms   | 16.5574 ms  | 25750.0000   | 5000.0000  | 2250.0000  | 126.63 MB  |
| NilJS    | 5 | False       | 521.8 ms    | 51.011 ms    | 2.8822 ms   | 15500.0000   | 5000.0000  | 2250.0000  | 86.15 MB   |
| Jint     | 5 | True        | 10,877.9 ms | 563.051 ms   | 31.8134 ms  | 2207083.3333 | 44833.3333 | 10500.0000 | 8921.86 MB |
| Jurassic | 5 | True        | 572.2 ms    | 4.661 ms     | 0.2633 ms   | 25000.0000   | 5000.0000  | 1500.0000  | 123.85 MB  |
| NilJS    | 5 | True        | 504.1 ms    | 20.257 ms    | 1.1445 ms   | 15750.0000   | 5250.0000  | 2500.0000  | 86.14 MB   |

@sebastienros
Owner

There is an updated version of the article here https://rushfrisby.com/net-javascript-engine-performance-results-updated-2016/

And it looks like the author is using Jint even though it's the slowest on these scripts. That's not unexpected as these are compute-intensive, and compiled scripts will shine, so V8 will always win there. But it's a good set of benchmarks to find the big bottlenecks. Obviously Array is one of the main issues, which is why you will find a branch here where I tried to optimize things, without much success.

If you want to improve it, the main idea would be to have different implementations of the prototype methods (push, iterate, sort, ...) based on the type of array, like sparse, range, ... This is what V8 does too. The idea is that the specification says an array can be sparse and uses string indices, but that makes the algorithms and implementations slow. To optimize it we can store some flags, detect the best cases, and then not follow the exact specification but use optimized algorithms. I think V8 supports three different array types.
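
To make the idea concrete, here is a minimal sketch (not Jint's actual implementation) of an array that keeps a flag for the dense, hole-free best case and only falls back to spec-style string-keyed storage when that assumption breaks; all names are illustrative:

```csharp
using System.Collections.Generic;

// Illustrative only: a dense fast path with a sparse, string-keyed fallback,
// roughly the kind of specialization described above.
public sealed class ArraySketch
{
    private readonly List<object> _dense = new List<object>();   // fast path: contiguous 0..n-1
    private Dictionary<string, object> _sparse;                   // fallback: spec-like string keys
    private bool _isDense = true;

    public void Set(uint index, object value)
    {
        if (_isDense && index == (uint)_dense.Count)
        {
            _dense.Add(value);               // contiguous append, no hole/descriptor handling
            return;
        }

        if (_isDense)
        {
            // Writing past the end creates a hole: demote to sparse storage.
            _sparse = new Dictionary<string, object>();
            for (var i = 0; i < _dense.Count; i++)
            {
                _sparse[i.ToString()] = _dense[i];
            }
            _isDense = false;
        }

        _sparse[index.ToString()] = value;   // spec semantics: indices behave like string property keys
    }

    public object Get(uint index)
    {
        if (_isDense)
        {
            return index < (uint)_dense.Count ? _dense[(int)index] : null;
        }

        return _sparse.TryGetValue(index.ToString(), out var value) ? value : null;
    }
}
```

Prototype methods like push or sort could then check the same flag and pick either the fast dense loop or the spec-exact algorithm.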

@sebastienros
Owner

A few comments:

  • Seems weird that the allocated memory is the same whether the engine is reused or not.
  • I understand it's more work, but I think it would be much more valuable to see more granular modifications to understand how each change impacts the perf. There are some changes I'd prefer not to make if they don't really provide much value. Or just to understand what has the most impact, so we can reproduce it in other parts of the code.
  • Last but not least, thanks a lot!

@lahma
Collaborator Author

lahma commented Dec 18, 2017

Thank you for reviewing. In this pull request I'm just trying to put some baseline numbers in place. I did see the updated benchmark which brings in the SunSpider etc. tests, but thought this would be easier as a first step.

As the SunSpider tests already seem to be part of the test suite, they should not be hard to reference.

As I understood it, I should split the performance pull requests (Esprima, Jint array performance) into smaller parts, which I fully understand, but is this PR good as-is or should I split it into smaller parts too?

I probably need to check the reused-engine case. It might not have an effect since the loop is now N=1, and the engine should probably be a static field in this case.

@lahma
Collaborator Author

lahma commented Dec 18, 2017

I've updated the benchmark and the top comment with run information from 5 repetitions per new engine instance / shared engine. The array handling is a bit slow at the moment and it therefore takes some time to get results.

With a friendlier benchmark (one not targeting array handling) we could use a larger N and see a bigger difference from reusing the same instance.

@ayende
Contributor

ayende commented Dec 18, 2017

Just some words about these kinds of benchmarks. We did a whole bunch of work to see what it would look like with Jurassic on our end. The kind of benchmark you are seeing here is misleading, because you are doing a LOT of work inside the JS engine. If you are mostly using it to do things outside, such as directing the operation of other code, then the cost of going in and out of the engine can kill your perf.

See: https://ayende.com/blog/179553/with-performance-test-benchmark-and-be-ready-to-back-out
And: https://ayende.com/blog/179617/js-execution-performance-and-a-whole-lot-of-effort
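
As a rough illustration of that boundary cost (class and function names here are made up, not RavenDB's code), a script that mostly directs work done on the .NET side crosses the engine boundary on every call:

```csharp
using System;
using Jint;

// Illustrative sketch: the script does very little itself and instead calls a host
// function in a loop, so each iteration pays the cost of entering and leaving the engine.
public static class BoundaryCrossingExample
{
    public static void Run()
    {
        var engine = new Engine();

        // Hypothetical host function exposed to the script.
        engine.SetValue("loadDocument", new Func<int, string>(id => "doc-" + id));

        engine.Execute(@"
            var names = [];
            for (var i = 0; i < 1000; i++) {
                names.push(loadDocument(i)); // in and out of the engine on every iteration
            }
        ");
    }
}
```

A compute-heavy benchmark never pays that per-call cost, which is why it can paint a different picture than this kind of usage.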

@sebastienros
Owner

I am OK with tracking perf comparisons with Jurassic and NilJS, at least to know the differences, as long as we can see different scripts: some SunSpider ones, but also small ones like the one which was used initially. It can also show the scenarios that @ayende is referring to. This should be one benchmark table.

Then we can have other specific benchmarks that don't include Jurassic or NilJS and would only be used to track improvements and regressions, arrays being one of them. In this case we really don't care about the reuse flag.

But before doing any change I want to hear what everyone has to say. You two have the most weight on the matter as you are directly impacted.

@lahma
Collaborator Author

lahma commented Dec 18, 2017

I had the different engines there because the original benchmark project had the Jurassic one, and the perf comparison seemed like a way to see "how far are we". I do understand the fundamental differences between the engines.

For me this is more of a fun optimization exercise that hopefully could benefit both Jint and RavenDB. My journey began with some JSON projection trouble that seemed to use a lot of memory. I'm open to dropping the comparison between engines or changing to method-level benchmarks. I just need some torture baseline like the array case, which is a worst-case scenario but nicely showed an area to improve on and gives a number for a PR to be measured against.

@lahma
Collaborator Author

lahma commented Dec 20, 2017

I've now put the other engine benchmarks behind conditional compilation and they won't be present by default. I've also added the SunSpider benchmarks as a separate benchmark that gives results for each file; see the results below. Is there anything more to do/change before this can be merged?

| Method | FileName                  | Mean       | Error        | StdDev      | Gen 0        | Gen 1        | Gen 2        | Allocated  |
|--------|---------------------------|------------|--------------|-------------|--------------|--------------|--------------|------------|
| Run    | 3d-cube                   | 1,270.4 ms | 693.672 ms   | 39.1937 ms  | 210000.0000  | 250.0000     | -            | 842.35 MB  |
| Run    | 3d-morph                  | 1,331.6 ms | 118.247 ms   | 6.6812 ms   | 191333.3333  | 51833.3333   | 6750.0000    | 758.14 MB  |
| Run    | 3d-raytrace               | 1,054.4 ms | 58.666 ms    | 3.3148 ms   | 203500.0000  | 2250.0000    | 750.0000     | 818.78 MB  |
| Run    | access-binary-trees       | 469.8 ms   | 123.501 ms   | 6.9780 ms   | 92750.0000   | 1000.0000    | -            | 373.94 MB  |
| Run    | access-fannkuch           | 3,690.9 ms | 797.337 ms   | 45.0510 ms  | 794583.3333  | 250.0000     | -            | 3178.96 MB |
| Run    | access-nbody              | 1,203.7 ms | 387.911 ms   | 21.9177 ms  | 179000.0000  | -            | -            | 716.74 MB  |
| Run    | access-nsieve             | 1,932.4 ms | 1,011.797 ms | 57.1684 ms  | 256583.3333  | 74583.3333   | 9666.6667    | 1058.6 MB  |
| Run    | bitops-3bit-bits-in-byte  | 928.6 ms   | 421.755 ms   | 23.8300 ms  | 167500.0000  | -            | -            | 670.44 MB  |
| Run    | bitops-bits-in-byte       | 1,393.3 ms | 292.475 ms   | 16.5254 ms  | 240500.0000  | -            | -            | 962.87 MB  |
| Run    | bitops-bitwise-and        | 883.0 ms   | 363.535 ms   | 20.5404 ms  | 101750.0000  | -            | -            | 407.43 MB  |
| Run    | bitops-nsieve-bits        | 1,717.5 ms | 1,442.951 ms | 81.5294 ms  | 296000.0000  | 31750.0000   | 1750.0000    | 1188 MB    |
| Run    | controlflow-recursive     | 790.6 ms   | 39.963 ms    | 2.2580 ms   | 147750.0000  | 2750.0000    | -            | 599.25 MB  |
| Run    | crypto-aes                | 1,324.2 ms | 3,698.428 ms | 208.9680 ms | 259750.0000  | 250.0000     | -            | 1042.38 MB |
| Run    | crypto-md5                | 674.4 ms   | 311.339 ms   | 17.5912 ms  | 123000.0000  | 2500.0000    | 500.0000     | 494.57 MB  |
| Run    | crypto-sha1               | 694.7 ms   | 25.975 ms    | 1.4676 ms   | 126250.0000  | 1250.0000    | 250.0000     | 506.9 MB   |
| Run    | date-format-tofte         | 867.6 ms   | 1,393.672 ms | 78.7450 ms  | 153000.0000  | 250.0000     | -            | 614.79 MB  |
| Run    | date-format-xparb         | 575.9 ms   | 364.477 ms   | 20.5936 ms  | 51000.0000   | 250.0000     | -            | 205.97 MB  |
| Run    | math-cordic               | 1,840.5 ms | 957.835 ms   | 54.1195 ms  | 297750.0000  | -            | -            | 1191.14 MB |
| Run    | math-partial-sums         | 581.2 ms   | 103.820 ms   | 5.8660 ms   | 73000.0000   | -            | -            | 292.71 MB  |
| Run    | math-spectral-norm        | 810.1 ms   | 77.500 ms    | 4.3789 ms   | 144000.0000  | -            | -            | 576.42 MB  |
| Run    | regexp-dna                | 339.1 ms   | 4.092 ms     | 0.2312 ms   | 2500.0000    | 2000.0000    | 1500.0000    | 21.48 MB   |
| Run    | string-base64             | 822.8 ms   | 150.389 ms   | 8.4972 ms   | 350000.0000  | 250.0000     | -            | 1403.56 MB |
| Run    | string-fasta              | 1,071.1 ms | 41.007 ms    | 2.3169 ms   | 204750.0000  | -            | -            | 819.47 MB  |
| Run    | string-tagcloud           | 875.0 ms   | 210.431 ms   | 11.8898 ms  | 199666.6667  | 123500.0000  | 116583.3333  | 1080.07 MB |
| Run    | string-unpack-code        | 333.0 ms   | 3.888 ms     | 0.2197 ms   | 61500.0000   | 4000.0000    | 1250.0000    | 261.57 MB  |
| Run    | string-validate-input     | 2,893.0 ms | 839.161 ms   | 47.4142 ms  | 1645583.3333 | 1562333.3333 | 1561083.3333 | 6460.62 MB |

@lahma
Collaborator Author

lahma commented Dec 20, 2017

  • I've removed the permutation where the engine is not reused; the same engine is now always used, to reflect sane real-world usage
  • I've added a benchmark for the projection case that led me to this endeavour

@sebastienros
Owner

Currently reviewing it. I might refactor it to my taste if you don't mind; I don't want to ask you to make more changes and get you frustrated, you have already done a lot ;) Only details, don't worry.

}

[Params(500)]
public int N { get; set; }
Owner

I don't understand this. Why is it necessary when BenchmarkDotNet can already do that by itself?

Collaborator Author

Using Params adds it to the benchmark report and clarifies the benchmark case's scenario. BenchmarkDotNet runs the target method however many times it finds best to get good confidence intervals etc. So here I want the report to state "the target Jint function was called 500 times inside the test case". We could also add more values to show how it behaves depending on iteration count (for example if it were costly to call once due to caches but really fast after that).
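
For reference, a minimal sketch of the pattern (the class name and script are illustrative, not the exact benchmark in this PR): the [Params] value appears as a column in the BenchmarkDotNet report, and the benchmark body loops N times so the report documents how many engine calls each measurement covers.

```csharp
using BenchmarkDotNet.Attributes;
using Jint;

[MemoryDiagnoser]
public class EngineCallBenchmark
{
    private readonly Engine _engine = new Engine();

    // Reported as a column; add more values to see how cost scales with iteration count.
    [Params(500)]
    public int N { get; set; }

    [Benchmark]
    public void CallFunction()
    {
        for (var i = 0; i < N; i++)
        {
            _engine.Execute("var x = [1, 2, 3].map(function (v) { return v * v; });");
        }
    }
}
```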

@lahma
Collaborator Author

lahma commented Dec 20, 2017

@sebastienros please make any necessary changes, I'm more than OK with that, thanks.


private static void InitializeEngine(Options options)
{
options
Owner

@sebastienros sebastienros Dec 20, 2017

Why these options?

Owner

Ignore, didn't see it was specific to this benchmark

Collaborator Author

Yes, and ones that RavenDB uses. MaxStatements especially trips an error here with the smaller default that RavenDB uses; something like 50 in there is far from enough.
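
A sketch of the kind of option setup being discussed, assuming Jint's fluent Options API; the concrete values and the class name are illustrative, not the PR's actual configuration:

```csharp
using Jint;

public static class EngineSetupSketch
{
    private static void InitializeEngine(Options options)
    {
        options
            // A small limit like RavenDB's default (~50 statements) would trip an error
            // in this benchmark, so the limit is raised far above it.
            .MaxStatements(int.MaxValue)
            .LimitRecursion(64);
    }
}
```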

/// Test case for situation where object is projected via filter and map, Jint deems code as uncacheable.
/// </summary>
[Config(typeof(Config))]
public class UncacheableExpressionsBenchmark
Owner

Looks like a special-case benchmark. If there is a bug (like something should be cached and is not) we should have a unit test. If this is just to make a specific case fast enough, then it shouldn't be in the repo, or it should at least be on a custom branch where the improvement work will happen, and be removed once it's done.

Collaborator Author

In a way it's a special case, but it should reflect function invocation with arrays in a fairly generic way. Arrays struggle with these kinds of data, where index access/traversal goes via a dictionary instead of pure array indexing and produces more arrays.

There isn't a bug per se, but there is a performance problem when you try to filter/project large nested arrays. But feel free to remove it if you want.
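
For illustration (this is not the PR's UncacheableExpressionsBenchmark, just a made-up example of the shape of script being discussed), a filter/map projection over a large nested structure keeps the engine's array layer busy and allocates new arrays on every pass:

```csharp
using Jint;

// Illustrative sketch of a filter/map projection over nested arrays.
public static class ProjectionExample
{
    public static void Run()
    {
        var engine = new Engine();
        engine.Execute(@"
            var docs = [];
            for (var i = 0; i < 10000; i++) {
                docs.push({ id: i, tags: ['a', 'b', 'c'], values: [i, i * 2, i * 3] });
            }
            // filter and map each produce new arrays; index access goes through the
            // engine's array implementation for every element
            var projected = docs
                .filter(function (d) { return d.id % 2 === 0; })
                .map(function (d) { return { id: d.id, total: d.values.length + d.tags.length }; });
        ");
    }
}
```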

Contributor

map and filter are very common things to do on arrays.
And they can usually be lifted quite easily; that is not just a generic scenario, it is something that I think should be optimized specifically.

Owner

Then these benchmarks should be in the same table as the SunSpider ones, unless they are already included there, or within one dedicated to Array operations.

Owner

In an Array benchmark class that would look like this: http://jsben.ch/k3QoV. It doesn't have to be in this PR. We can do it later or as part of a perf PR that focuses on arrays.

@lahma
Collaborator Author

lahma commented Dec 21, 2017

I've renamed the old ArrayBenchmark to ArrayStressBenchmark and created a new ArrayBenchmark for the latest link that you gave. I included the operations that are supported.

There might be problems combining benchmark suites, and I personally don't follow the same mindset that you would with unit/integration tests. Ideally a benchmark case would have two methods, old algorithm and new algorithm, which would produce the comparison tables nicely. This, however, is not easy to achieve when you are making larger changes/refactorings; then I usually run the suite on master (or a base branch containing only the new benchmark against master), copy the results directory, check out my new branch, run the same suite again, and create the comparison by showing the before and after reports.

It's also problematic when you have too many tests in a benchmark: it becomes slow to run when you just want to optimize some very specific thing (that maybe one test method shows). Unlike tests and the actual shipped code, benchmarks play more of a supporting role and might not be reused that much, depending on the case.

I hope we find some middle ground soon. I need the two optimizations to advance other analysis, and it's a bit painful to cherry-pick between branches while trying to mentally keep track of which problems are unfixed and which are fixed somewhere else.

@sebastienros sebastienros merged commit 4a7fed1 into sebastienros:dev Dec 21, 2017
@lahma
Collaborator Author

lahma commented Dec 21, 2017

Thank you for handling the PR quickly; my eagerness to improve may show up as impatience, but I hope I can help as best I can.

@lahma
Collaborator Author

lahma commented Dec 22, 2017

I've created a new issue #451 to track overall progress; I'll update results there between pull requests.

@lahma lahma deleted the more-benchmarks branch January 5, 2019 23:52