[js] rework enum implementation #5109
Comments
The benchmarks seem to confirm my gut feelings. Objects are generally PS I really need to find better characters for the table generation that On Tue, Apr 12, 2016 at 9:30 AM, Dan Korostelev [email protected]
|
Yeah, it's not only better for debugging, but for general interop too. In fact, I was working on something similar for the C#/Java targets: https://gist.github.com/nadako/5dcc9ba86343ed3795d3 |
I hacked in a simple anon-object based implementation (no reflection/toString for now) and measured like this: https://gist.github.com/nadako/793960bff7c496786174589499fd3d8d Creation is a lot faster, matching is on par, it seems. |
You may want to measure it with multiple constructors, with different numbers of arguments. Another plus is that this representation also maps much better to TypeScript (at least if we have the constructor name?): microsoft/TypeScript#1003 You're probably going to need inheritance anyway, to know the enum. Unless of course you wish to store it in a field instead. Unless the latter is critically faster, I think the former is preferable, since it maps to native semantics a bit better. But I agree that this is the way forward ;) |
I don't see much of a point in using inheritance for this. Sure it looks nicer if you imagine the code being in Haxe, but what sense does it make in Javascript? Having an extra field that "points" to all the reflection details (which are the same for all instances of a given constructor anyway) looks like the better approach to me. |
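A minimal sketch of the "extra field pointing to shared reflection details" idea from the comment above. The `Color` enum, the `index` and `meta` field names, and the helper functions are all hypothetical and chosen only for illustration; they are not the layout the Haxe compiler generates.

```javascript
// Hypothetical metadata object: created once per enum and shared by every value.
const ColorMeta = {
  name: "Color",
  constructors: ["Rgb", "Named"],       // index -> constructor name
  params: [["r", "g", "b"], ["name"]],  // index -> argument names
};

// Hypothetical constructor helpers. Each returns a plain object whose
// `meta` field points at the shared metadata instead of using inheritance.
function Rgb(r, g, b) { return { index: 0, r: r, g: g, b: b, meta: ColorMeta }; }
function Named(name) { return { index: 1, name: name, meta: ColorMeta }; }

// Reflection-style lookup goes through the shared metadata.
function constructorName(value) {
  return value.meta.constructors[value.index];
}

// Pattern matching only needs the integer tag and the argument fields.
function describe(value) {
  switch (value.index) {
    case 0: return "rgb(" + value.r + "," + value.g + "," + value.b + ")";
    case 1: return "named(" + value.name + ")";
    default: throw new Error("unknown constructor index " + value.index);
  }
}

console.log(constructorName(Rgb(1, 2, 3)), describe(Named("red")));
```

Every value carries one extra reference, while names and argument lists live in a single place per enum.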
I agree with Simon on the inheritance, in the original benchmark I actually used inheritance (see object enum construction/matching) and it also turned out to be slower (I assume that's because of extra ctor/super calls). So simple anon object with an As for the constructor name - I don't see the point in putting enum string tag in every instance and I think it hurts performance without real gain. "normal" haxe code works with |
Ok, after giving it some thought, let me argue that we should use names instead of indices: using the string instead of an index yields no performance degradation on JS: http://try.haxe.org/#BB98D. It does however make things more
And as mentioned before, it will give Haxe enums a straightforward representation in TypeScript, which is a nice plus. For compatibility reasons a name->index mapping can be used to determine the index (if needed). As for inheritance: Given that we can put the name in the
Only reflection is invoked through the inheritance chain, which tends to be slow anyway. For other access, you can see the difference is negligible here: http://try.haxe.org/#3241e This is not particularly important, but I don't see a reason to not get it right. The most important question though: How will we represent arguments? Consider this:
enum Example {
  Foo(s:String, d:Double);
  Bar(s:Something, d:Different);
}
What is the representation of The difference needs to be measured, but we should also take readability and robustness of each approach into account. For example, if you rename a constructor argument, it should not break anything. So maybe |
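To make the argument-representation question concrete, here is a rough sketch of two candidate layouts for a value built with `Foo` from the `Example` enum, plus the name->index mapping mentioned earlier. The `tag` and `args` field names and all helper names are assumptions made for this illustration; the thread does not settle on any of them.

```javascript
// Layout 1 (hypothetical): arguments stored under their declared names.
const fooNamed = { tag: "Foo", s: "hello", d: 1.5 };

// Layout 2 (hypothetical): arguments stored positionally, in declaration order.
const fooPositional = { tag: "Foo", args: ["hello", 1.5] };

// Name -> index mapping for code that still needs the integer index.
const ExampleIndex = { Foo: 0, Bar: 1 };
function indexOf(value) {
  return ExampleIndex[value.tag];
}

// Declared argument names per constructor. The named layout needs this table
// to answer "give me the arguments in order".
const ExampleParams = { Foo: ["s", "d"], Bar: ["s", "d"] };
function paramsOf(value) {
  if (value.args) return value.args.slice();  // positional layout
  return ExampleParams[value.tag].map(function (name) {
    return value[name];                        // named layout
  });
}

console.log(indexOf(fooNamed), paramsOf(fooNamed), paramsOf(fooPositional));
```

In the named layout, renaming a constructor argument changes the field names of every value and the `ExampleParams` table together. The positional layout is unaffected by renames but loses the readable field names.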
You can already use name-matching by adding Your try haxe does indicate a difference for me:
Also I don't think this is a valid benchmark for switches. An int-switch is not necessarily a sequential equality check, it can be implemented more efficiently (though I don't know what V8 does exactly). |
Right you are. So here's a benchmark that sort of does the real thing: http://try.haxe.org/#932A8 The output (ints, strings, good old enums) on chrome is sad:
Conceptually, there is absolutely no reason why comparing ints should be faster than comparing strings that stem from equal literals, as the references should point to the same thing (somewhere in the constant pool), which should lead to a decent fast path. If I run the same benchmark on Firefox, the output coincides with my claim, for whatever that's worth:
Yep, that's right, on Firefox the optimized version with ints actually is slower than enums ... oO I mostly use nodejs, so the V8 is what I care about most. Still, I think it's wiser to make this "right" and wait for the V8 to catch up. The alternative would be to change the implementation again once the numbers look more like they do on Firefox (TBH there's no sane reason why strings should be faster than ints, but I hope you know what I'm getting at), which seems a little silly, considering how many things can break with such a change. With such discrepancies between JS runtimes, maybe it's simply not a good time to optimize performance here. And maybe better interoperability with TypeScript and JavaScript alone is not worth the hassle either, although I believe the contrary. I'm sorry for hijacking a thread motivated by performance to discuss a completely different aspect, but changes to the representation affect both of them alike. To me this kind of seems like a big deal, and making the decisions based on some numbers that may look exactly the opposite way a few months from now seems rather short-sighted. Before letting any numbers govern optimization decisions, we must first decide what cases we want to optimize for and then optimize for them. Do we want small enums like Option to be fast? Do we want big enums like I for one would be very interested to see how this affects something real, such as haxeparser. |
This sounds more friendly to document-based databases too. |
I've done some benchmarks on this too a couple of years ago, though I can't find them. I started with objects using different keys and found out there was not much to tinker with further. However, inheritance is a pretty interesting topic. A pointer to the enum is quite fast, however it increases memory usage and construction time a bit. Some things may have changed in VMs since then, but I'd be happy if you considered this last option; it may be quite interesting. It went like this:
enum E{
  A;
  B(b:Int);
  C(c1:String, c2:String);
}
function test(v){
  switch(v){
    case A: trace('A');
    case B(b): trace('B', b);
    case C(c1,c2): trace('C', c1, c2);
  }
}
var a = A;
var b = B(1);
var c = C('z', 'x');
test(a);
test(b);
test(c);
trace(name(a));
trace(name(b));
trace(name(c));
translates to something like
var E = {};
// One constructor function per enum constructor; shared data lives on the prototype.
var E_A = function(){
};
E_A.prototype.idx = 0;
E_A.prototype.name = "A";
E_A.prototype.enum = E;
// Argument-less constructors need only a single shared instance.
var E_A_inst = new E_A();
var E_B = function(b){
  this.b = b;
};
E_B.prototype.idx = 1;
E_B.prototype.name = "B";
E_B.prototype.enum = E;
var E_C = function(c1, c2){
  this.c1 = c1;
  this.c2 = c2;
};
E_C.prototype.idx = 2;
E_C.prototype.name = "C";
E_C.prototype.enum = E;
function test(v){
  // Matching compares the value's prototype against each constructor's prototype.
  switch(v.__proto__){
    case E_A.prototype:
      console.log("A");
      break;
    case E_B.prototype:
      var b = v.b;
      console.log("B", b);
      break;
    case E_C.prototype:
      var c1 = v.c1;
      var c2 = v.c2;
      console.log("C", c1, c2);
      break;
    default: console.log('fail', v);
  }
}
function name(v){
  return v.name;
}
var a = E_A_inst;
var b = new E_B(1);
var c = new E_C('z', 'x');
test(a);
test(b);
test(c);
console.log(name(a));
console.log(name(b));
console.log(name(c));
The only problem I see here is the ability to get an argument by index, but since we'd want to keep argument naming in objects if possible, we'd probably want to go about it somewhat like this:
E_A.prototype.args = [];
E_B.prototype.args = ['b'];
E_C.prototype.args = ['c1', 'c2'];
function argument_by_index(val, idx){
  return val[val.args[idx]];
}
Note that quite a lot of access methods can actually be inlined. Note that something may be moved to some additional object in the prototype (however it's faster to keep everything in prototypes) to "hide" it from the object, but I don't believe it's needed, because you can't get to it unless using various magics. Also it's possible to compress all this into one factory function to generate all enums from declarations. It breaks debugger naming, however we don't care about it that much anyhow. |
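The factory function mentioned at the end of the comment above could look roughly like the following. This is a sketch under assumptions: `makeEnum`, the shape of its declaration argument, and the `constructors` map are hypothetical names, and the prototype fields simply mirror the ones used in the hand-written version.

```javascript
// Hypothetical factory: builds one constructor function per enum constructor
// from a declaration of the form { ConstructorName: [argument names] }.
function makeEnum(enumName, decl) {
  const E = { name: enumName, constructors: {} };
  Object.keys(decl).forEach(function (ctorName, idx) {
    const argNames = decl[ctorName];
    const Ctor = function () {
      // Copy positional call arguments into named fields.
      for (let i = 0; i < argNames.length; i++) {
        this[argNames[i]] = arguments[i];
      }
    };
    // Shared per-constructor data lives on the prototype.
    Ctor.prototype.idx = idx;
    Ctor.prototype.name = ctorName;
    Ctor.prototype.enum = E;
    Ctor.prototype.args = argNames;
    // Argument-less constructors are exposed as a single shared instance.
    E.constructors[ctorName] = argNames.length === 0 ? new Ctor() : Ctor;
  });
  return E;
}

function argumentByIndex(val, idx) {
  return val[val.args[idx]];
}

const E = makeEnum("E", { A: [], B: ["b"], C: ["c1", "c2"] });
const a = E.constructors.A;
const c = new E.constructors.C("z", "x");
console.log(a.name, c.name, argumentByIndex(c, 1));
```

Because every constructor is created from the same anonymous function expression, debuggers cannot show a distinct constructor name per enum constructor, which is the naming cost the comment refers to.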
Oh, and if we do
E = function(){}
E_A.prototype = new E();
E_B.prototype = new E();
E_C.prototype = new E();
in corresponding places, we get |
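The end of the comment above is cut off, so what follows is not a reconstruction of it. It is only a small sketch of one effect of that prototype assignment that can be checked directly: values become instances of the base function. The assignment has to come before the `idx`/`name` fields are set, since it replaces the prototype object.

```javascript
// Base function for the enum; constructor prototypes inherit from it.
var E = function () {};

var E_B = function (b) { this.b = b; };
E_B.prototype = new E();   // replaces the prototype, so do this first
E_B.prototype.idx = 1;
E_B.prototype.name = "B";

var b = new E_B(1);
// Prototype chain: b -> E_B.prototype -> E.prototype
var isE = b instanceof E;
console.log(isE, b.idx, b.name);
```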
I did something similar for just js:
package libnoise;
//Dirty hack to expose enum in js
#if js
enum QualityMode {
  LOW;
  MEDIUM;
  HIGH;
}
@:expose("libnoise.QualityMode")
class E_QualityMode {
  public static var LOW : QualityMode = QualityMode.LOW;
  public static var MEDIUM: QualityMode = QualityMode.MEDIUM;
  public static var HIGH : QualityMode = QualityMode.HIGH;
}
#else
enum QualityMode {
  LOW;
  MEDIUM;
  HIGH;
}
#end
|
I have run some fairly thorough micro-benchmarks on enums, and here are the results:
All benchmarks were done on Node version The number is the amount of times it called the function within 10 seconds, so a bigger number means faster. The results are sorted so that the fastest is at the top and the slowest is at the bottom. These are the four things I tested:
And these are the various implementations I tested:
Here is my summary of the data:
|
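The measurement described in the benchmark comment above (how many calls fit into a fixed time window) can be sketched as follows. The harness name `callsWithin`, the batch size, and the two functions under test are assumptions for illustration; this is not the code that produced the numbers in this thread.

```javascript
// Hypothetical harness: counts how many times fn can be called within budgetMs.
function callsWithin(fn, budgetMs) {
  const deadline = Date.now() + budgetMs;
  let calls = 0;
  let sink = 0; // accumulate results so the calls are not trivially dead code
  while (Date.now() < deadline) {
    // Call in batches so Date.now() is not checked after every single call.
    for (let i = 0; i < 1000; i++) {
      sink += fn(i);
    }
    calls += 1000;
  }
  return { calls: calls, sink: sink };
}

// Two hypothetical candidates: create an object-based or array-based value
// and read its argument back.
function viaObject(i) { return { index: 1, b: i }.b; }
function viaArray(i) { return ["B", 1, i][2]; }

const objResult = callsWithin(viaObject, 50);
const arrResult = callsWithin(viaArray, 50);
console.log("object:", objResult.calls, "array:", arrResult.calls);
```

A short window such as 50 ms mostly measures unoptimized code, while a long one gives the JIT time to kick in, which is the distinction drawn later in the thread between the 100 ms and 10,000 ms columns.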
As for strings vs integers, I vote for integer tags:
|
After reading @stroncium's post, I decided to redo the benchmarks. I added in an
The first column is the speed when running the benchmark for 100 milliseconds, the second column is the speed when running the benchmark for 10,000 milliseconds. Bigger = faster. The 100 millisecond column is supposed to represent a program which runs quickly (e.g. a shell script), whereas the 10,000 millisecond column is supposed to represent a program which runs for a long time (e.g. a server or web app). Checking That's a shame, I was hoping they would be fast, because the JS VMs should be able to just grab the The microbenchmarks are nice and all, but I want to take a look at IR Hydra to see what the actual compiled code looks like. That should help settle the debate. |
Enums are one of the greatest features in Haxe, but their implementation often performs slower than other solutions, so I think we should maximize their performance.
I decided to start with JS target and did some benchmarking for the different possible implementations of enum: https://github.com/nadako/haxe-js-enum-benchmark
I ran the benchmark on Node 5.10.1 (V8 4.6.85.31) which I think is pretty indicative, since V8 nowadays is the most used js runtime (chrome, node, electron, etc). It's safe to assume that other popular engines are not very different from it.
Unless I did the benchmark wrong (please look at the (generated) source), it shows that using simple anon objects for enums is the fastest option for both creation and matching, so we might consider using those instead of arrays for enums.
I think this is related to how V8 optimizes objects by using hidden classes, and how it fails to optimize arrays whose elements are of different types (like our current enums).
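For readers who want to see the two shapes side by side, here is a simplified sketch. The array layout (name, index, then arguments in one array) is an assumption about what "our current enums" look like, and the object field names are hypothetical. Neither is copied from compiler output.

```javascript
// Array-based value (assumed current layout): a string, an integer and the
// arguments all share one array, so element types are mixed.
function makeArrayB(b) { return ["B", 1, b]; }

// Object-based value (proposed direction): fixed field names, so every value
// created here has the same set of properties in the same order.
function makeObjectB(b) { return { index: 1, b: b }; }

// Matching on the array layout reads positions 1 (index) and 2 (first argument).
function matchArray(v) {
  switch (v[1]) {
    case 1: return v[2];
    default: return null;
  }
}

// Matching on the object layout reads named fields.
function matchObject(v) {
  switch (v.index) {
    case 1: return v.b;
    default: return null;
  }
}

console.log(matchArray(makeArrayB(7)), matchObject(makeObjectB(7)));
```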
also cc @back2dos, @fponticelli