-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Version 3.10.x uses significantly more memory and is significantly slower than 3.9.0 #3434
Comments
That's odd. What is your Node version? |
@PatSmuk360 Could you test and get a memory profile of 3.10.0 and see if it differs? |
Also I tried on Node 12.10.0 and it seems a lot worse, hitting 587 MB of memory usage at one point in the sequential build. |
CPU profile before: CPU profile after: |
@PatSmuk360 So, the long and short is that the difference between those versions is that the codebase was transformed to ES6 syntax. Which isn't technically a breaking change, which is why it wasn't a major version. BUT... my suspicion is that some of the Babel conversions for things like object / array spread syntax are less efficient than the more verbose ES5 versions. That's why I was asking if you could test 3.10.0, because originally I was exporting a transpiled package that was compatible with Node 6 and up, but it broke an integration with a particular library that couldn't handle class syntax. So I went down to Node 4 with no class constructs. If we can figure out exactly which ES5 transform is not performing well, we could theoretically more granularly set those Babel export settings to a more performant export. |
@PatSmuk360 Incidentally, what's that |
@matthew-dean That seems to be the |
This line in specific less.js/lib/less/source-map-output.js Line 78 in cae5021
|
I don't have much experience with heap profiles, but the thing that stands out to me is the increase in the amount of |
@kevinramharak In general, converting an AST like Less's with n depth into a serialized, flatted output tree like CSS involves lots of temporary object creation. So, what could be happening is that with certain transforms, it's adding an additional x amount of objects. Even temporarily, you can get an exponential effect where each node creating even just 2-3x the objects, multiplied by the number of nodes multiplied by the number of times it has to flatten rules... I could see it adding up. We were probably naive in general to essentially think of ES6 syntax as essentially syntactic sugar of ES5 syntax. (JavaScript developers are probably guilty of this in general.) In reality, transpiling newer syntax can create ES5 patterns that are not very performant. For 99% of devs, this isn't a big deal because they're not iterating through that code hundreds or thousands of times per second. But this is my guess as to what's happening, because there were no other major changes. |
@kevinramharak Re: the source lines - this is because the original Less parser doesn't track lines / columns of input, so when source mapping was added, it needed to essentially chunk input into lines to figure out how it should map to the original source. This won't be an issue in 4.x+, but that makes sense why it would spend a lot of time there now. |
I'm using less.min.js in a browser and 3.10.3 is two times slower than 2.7.3 I was using before. Both in Chrome and Firefox. |
@PatSmuk360 Can you check out this branch? https://github.com/matthew-dean/less.js/tree/3.11.0 In short, Babel's transpilation to ES5 is kinda awful, and uses tons of Object.defineProperty calls to transpile classes. I switched transpilation to TypeScript, which has much saner output of function prototypes. This is all fine and dandy, but after doing this, Less's browser tests won't run because it uses the very old and outdated PhantomJS, and thus far, no one (including me) has been able to successfully migrate the tests off of PhantomJS onto Headless Chrome. But, one problem at a time: if the dist source on that branch doesn't have the memory issues, then maybe we could tackle the mess that is browser testing. |
I've managed to migrate Less browser tests to Headless Chrome, but in terms of performance / stability, I need a lot of feedback from users before it's safe to merge, because of changing the transpilation pipeline entirely from Babel to TypeScript. Current branch(es) can be found here: #3442 |
Still 2x slower than 2.7. |
@alecpl That's good information, but really I want to know if 3.11.0 is an improvement over 3.10. |
The odd thing about this is that, in theory, the original Less code was converted with Lebab, which should be the opposite of Babel. Which means that ES5 -> ES6 -> ES5 should be nearly identical, but that's obviously not the case. So I'll need to investigate (unless anyone else has the time, which is welcome support) just how the ES6->ES5 code differs from the original ES5 code. |
So, I've run a variety of tests, and spent the time doing benchmarking of 3.11.0 vs. 3.10.3 vs. 3.9.0 vs. 2.7.3 in Headless Chrome I've found no evidence that the Less compiler is slower for any version, let alone 2x slower. It could still be true that 3.10.0 has more memory overhead because of the transpilation settings, and maybe if a system was constrained for space and doing more memory swaps or more GC as a result, it might result in a slowdown? But I can't verify it. You can test yourself running @PatSmuk360 Have you been able to test the memory overhead of 3.11.0? |
I do not build your code by myself. I don't use nodejs. I just take less.min.js file from dist folder for two different versions I mentioned and use them on my page. Then in console I see timings printed by Less code. My less code uses multiple file and the output file is about 100kB of minified css. That's a "real life" benchmark. I'm talking about Roundcube code. Chrome is much faster than Firefox, but in both the difference between versions is similar. |
@alecpl Hmm....maybe it's a particular difference for that particular Less code. It's true that the benchmark is arbitrary. If I have time, I'll add those Less files to benchmarking. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Just wanted to post a quick update: I tried 3.11.1 and it was just as slow as 3.10.3. I'll see if I can cook up a representative benchmark for testing/profiling this. |
I've upgraded to 3.11.1 and have the same memory consumption problems now. I've taken a heap allocation timeline resulting in this; Something is causing crazy huge allocations kept alive by [EDIT] |
@matthew-dean In 3.11 one of the listed retainers for RuleSet is the ImportManager. If everything is kept alive by ImportManager and ImportManager is a singleton across the entire compilation process. Well... yes; that would significantly inflate memory use because nothing can be garbage collected. Not even intermediary rulesets. |
@rjgotten Hmm...... if an object has a reference to another object, why would that prevent GC? I mean, technically, all objects retain references to the public API nodes via the prototype chain. It would only prevent GC if the reverse is true i.e. some object retaining references to every ruleset instance. |
Note that this is a 4.0 build so you might encounter errors if your code has these breaking changes:
|
So, while I'm now much more optimistic I've fixed the issue, based on the latest benchmarks, I have yet to figure out why this issue has been happening. If performance holds up for everyone else, I can give a run-down of how I finally narrowed down the issue and what I found. In other words, I think I found where the issue is/was, but not necessarily why. |
The result of the same test suite as #3434 (comment):
I can confirm that for my specific test suite, the performance level has improved to as fast as v3.9.0. Very much appreciated Matt! While I'm not sure about the break change on math mode. Changing this may lead to breakage of many of our applications so we may be stuck on v3.9.0 even the performance issue is solved. |
You can explicitly compile in |
Breakdown of the issueTL;DR - Beware of the class pattern The problem was in the transpilation of classes in both Babel and TypeScript. (In other words, both had the same performance issues with the transpiled code, with Babel being slightly worse.) Now, years ago, when the class pattern was introduced, I was told--and until this issue, believed--that class inheritance in JavaScript is syntactic sugar for functional inheritance. In short, it isn't. (Edit: well.... it is and it isn't. It depends on what you mean by "inheritance" and how you've defined it as you'll see shortly.... i.e. there are several patterns to create "inheritance" in the prototype chain in JavaScript, and classes represent just one pattern, but that pattern is somewhat different from all the others, hence TS's / Babel's need for helper code to "mimic" this pattern using specialized functions.) Less code looked like this: var Node = function() {
this.foo = 'bar';
}
var Inherited = function() {
this.value = 1;
}
Inherited.prototype = new Node();
var myNode = new Inherited(); Now say you wanted to re-write this using "modern" JS. In fact, you can automate this process, which I did, but either way I would have written it the same, which was: class Node {
constructor() {
this.foo = 'bar';
}
}
class Inherited extends Node {
constructor() {
super();
this.value = 1;
}
}
var myNode = new Inherited(); Same thing, right? Actually, no. The first one will create an object with a property of The second one, because it will call the constructor on both Now, for the user, this really doesn't matter, because you can access both Here's where I enter pure speculationFrom what I remember in articles about JIT engines like V8, object structures are actually very important. If I create a bunch of structures like BUT if I add different properties to this each time, like: AFAIK, what's also true is that the more properties you have on an object, the longer that an individual property lookup takes. Again for end users, this rarely matters because JS engines are so fast. Buuuuut say you're maintaining an engine that creates and looks up properties / methods on LOTS of objects as quickly as possible. (Ding ding ding.) Suddenly these small differences between the way TypeScript / Babel "extend" objects and native functional prototypal inheritance adds up very quickly. I wrote a bare-bones implementation of a Node and an inheriting function, using the old Less syntax and the class pattern transpiled with TS. Right out of the gate, the inherited node consumes 25% more memory / resources, and that's BEFORE any instances of the inherited node have been created. Now consider that some nodes inherit from other nodes. That means, the in-memory representation of the classes starts to multiply, as does their inherited instances. Again, this is speculationI have to emphasize taking all this with a grain of salt, because I've never heard of a conversion to classes being this disastrous for performance. What I think is happening is that there's some special combination of object / instance lookups that Less is using that is seriously de-optimizing the JIT. (If some JavaScript engine expert knows why transpiled classes perform so much worse than native JS inheritance methods in this case, I would love to know.) I tried to create a performance measure of creating thousands of inherited objects using both methods, and I could never see a consistent difference in performance. So, before you walk away with "classes in JavaScript are bad", it could be some other really unfortunate pattern in the Less codebase coupled with the difference in transpiled classes vs. native JS that created this drop in performance. How I finally figured it outHonestly, I never ever suspected the class pattern. I knew that the original ES5 code ran faster than the reverse-Babelified code, but I suspected something around arrow functions or spread syntax somewhere. I still had the up-to-date ES5 branch, so one day I decided to run A cautionary taleSo! Lesson learned. If your project is on old ES5 code, and you're craving that "modern" Babel-ified or TypeScripted goodness, remember that whatever you transpile, you didn't write. I was still able to re-write classes into something more "class-like", with a slightly more maintainable pattern, but more or less retaining Less node's original functional inheritance pattern, and I'll be leaving a strong note in the project for a future maintainer to never do this again. |
I’m aware of this. We have many Less codebases across many different teams so break changes even in default compilation options would increase communication costs. I’m not sure whether the benefits are worth the breakage. |
Thank you for the detailed breakdown! I saw the usage of At the same time, I think it's better if we can separate the fix of performance degradation and the breaking changes, which means to land the performance fix in v3 and include the breaking changes only in v4. |
@matthew-dean Goes to show: composition over inheritance 😛 Though FYI; if you want a cleaner protypal inheritance chain you really should do it a bit different than your example. var Node = function() {
this.foo = 'bar';
}
Node.prototype = Object.create();
Node.prototype.constructor = Node;
var Inherited = function() {
Node.prototype.constructor.call( this );
this.value = 1;
}
Inherited.prototype = Object.create( Node.prototype );
Inherited.prototype.constructor = Inherited;
var myNode = new Inherited(); then you flatten the properties onto the instances themselves which then share the same shape; the prototypes share shape; and you avoid having to crawl up the prototype chains for each property access. 😉 |
I don't think that's quite right i.e. Object.assign landed shortly before classes implementation but your point is taken. I was just hoping to avoid writing Technically, everything's still transpiling, so I dunno. The original goal was a more maintainable / readable code-base. If you need an Object.assign polyfill in some environment, so be it. It would be on a version of something that Less doesn't support.
I've been thinking of that, and I guess that's also a fair point. I'm just trying to make my head not explode building / maintaining 3 major versions of Less. @rjgotten AFAIK that's what a class pattern is supposed to do, exactly as you've defined it. Unless I'm not seeing a critical difference. So 🤷♂️ . I'm not going to try to alter Less's Node inheritance pattern again if it's performing well (other than I may remove this |
@Justineo @rjgotten Can you try this: |
Just tried it:
Ps. Tested with Node.js v12.13.1. |
I'm.... what. |
Didn't have the time to set up seperate benchmarks.
For me the 3.13.0 gives the best results... 🙈 3.11.3 also seems to not have the runaway memory use that I saw before with the 3.11.1 version. The 3.13.1 which restored the cache doesn't help with improving my real-world compile times and just inflates memory use. |
@rjgotten Yeah, I think the problem is that, in this thread, people are measuring different things, and everyone thus far essentially has private data which is impossible to replicate (without submitting to Less benchmarking, or translating that into a PR). Less does have a benchmark test, which I could use against different clones of the repo, to verify for myself differences in parse / eval time, but where caching doesn't (currently) apply. Here is a published build with the inheritance changes on the tree, and with the parse tree cache restored: |
Test casehttps://github.com/ecomfe/dls-tooling/tree/master/packages/less-plugin-dls Results
Node.js versionv12.13.1 To me this version seem to work perfectly. I'll ask my colleagues to try out and verify. |
Works slightly worse than the
|
@matthew-dean, it looks like my code is not yet compatible with the |
@rjgotten Yeah, the difference seems minor for your test case. I think the outcome of this is that I might prepare both a 3.x and a 4.0 release. I wasn't itching to push 4.0 with this issue unresolved. Thankfully, it looks like it has been. The other item I'll do is open an issue for anyone to grab which is to put in a CI pipeline that fails Less if it falls below a certain benchmark threshold. |
@jrnail23 It would be great if you can modify to run the 4.0 alpha, but can you in the meantime try |
@matthew-dean, I'm getting the following for my build:
So it looks like you're back down to where you want to be! Nice work! |
Yeah, I'm going to second; third; and fourth that. Job very well done. |
Okay team! I'm going to publish the 3.x build, and sometime later, maybe next week, publish 4.0. Both should now have the performance fixes. Thanks to everyone who helped debug! |
By the way, I'm currently in the process of converting the Less.js codebase to TypeScript, and plan to use the opportunity to do some performance tuning / refactoring. I'm open to help if anyone is interested! matthew-dean/less.js@master...matthew-dean:next |
Anything specific where you would prefer some extra eyes? The PR is very large, so I don't really know where to start |
@kevinramharak That's a fair question. To be honest, I ran into unexpected roadblocks with converting to TypeScript (errors that became hard to resolve), and am now re-thinking doing it at all. Less is running fine as it is, and a lot of the reasons I wanted to convert (to make refactoring / adding new language features easier) are now irrelevant as I've decided to move my ideas for new language features to a new pre-processing language. I don't want to over-self-promote, so DM me on Twitter or Gitter if you want details. |
Our builds recently started failing because we run about 80 Less builds in parallel during our project's build process and the new version of Less.js uses so much memory that Node crashes. We traced the crash to upgrading from Less.js from 3.9.0 to 3.10.3.
I changed our Less script to compile the files sequentially (building 2 files at a time) and sampled Node's memory usage during the process and got the following results:
Less.js seems to use 130% more memory now and takes about 100% longer to compile for us.
Just wondering if you've benchmarked Less.js and whether you see similar results
The text was updated successfully, but these errors were encountered: