-
-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update: Providing Rule Metadata to Formatters #10
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating an RFC! I'm supportive of the general idea here, but I have a few suggested changes to the details of the API.
Co-Authored-By: EasyRhinoMSFT <[email protected]>
Co-Authored-By: EasyRhinoMSFT <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general - I'm in favor of adding this. My only concern is that we've had instances where formatters were running out of memory on a large code-bases where a lot of errors were found. This change will significantly increase the payload of the information that is being sent to formatter, so the changes of running out of memory increase as well. We've never profiled and checked for the number of errors that would make the process run out of memory before, so there's nothing to compare to, but it would be nice to somehow figure out by how much would this change affect memory issue.
I'll take a shot at profiling before and after memory consumption. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also supportive of this change, just left a few comments on ways I think we could improve this proposal.
Thanks for writing it up!
return false; | ||
} | ||
|
||
const rulesMetadata = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd be better off using a Map
here. It should (in theory) be more performant because we won't be changing the shape of an object by adding hundreds of keys to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an object would be better because it would be more easily serializable by formatters like json-with-metadata
(and anyone using json-with-metadata
would be able to use almost the same interface as regular formatters).
I suspect there wouldn't be too much of a performance impact since the final shape of the object would be roughly the same on each iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a simple memory usage profile and it averages about 125 KB more with this change. I analyzed the three benchmark test scripts with the default ruleset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EasyRhinoMSFT can you clarify that a bit? Are you saying Map
used more memory or using an object did?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EasyRhinoMSFT this is the comment I referred to in my other comment. :)
I'd also like to hear a few other opinions or whether the rule meta should be in an object or a map. I'm of the mind that when we expose an object in our API that is intended to be used as a map, we should just be using a map. Most formatters will not be formatting the result as JSON, so I don't think optimizing for that case makes a lot of sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't actually do the Map case, just object vs. nothing. I'll profile the Map version today, though there was enough variance across trials that I doubt we'll see an appreciable difference.
The method was to look at process.memoryUsage() -- really basic and probably naive, so let me know if there is a different approach I should use. I ran trials with and without the change for each of the three bench scripts in .\tests\bench.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Map is easier to insert data into, object is easier to deal with for consumers (serialization), so in my opinion, unless there's a rather large performance gain by using Map, I would go with an object.
After giving this a bit more thought: If #9 gets merged, will it be possible for different rules to be defined for different files? If so, providing a single rule metadata object for all of the results might not make sense, because a single rule name could correspond to multiple metadata objects for different files. |
Metadata or not, you'd still have rule ID collisions. For our formatter (SARIF), there's no problem as long as the ruleIds are unique in a given set of results (i.e., each time the formatter is invoked). |
What I mean is that after/if #9 is implemented, rule IDs might not be unique in a given set of results. Rule IDs would be unique per-file, but a set of results can contain multiple files. |
In that case, is there a way to determine which rule a Result.ruleId refers to? I haven't had time to read #9. |
ESLint core would have that information, so presumably it would be possible to convey it to formatters in some form. However, the structure proposed in this RFC probably wouldn't work for that case because it assumes that rule IDs are globally unique. |
#9 will not introduce rule naming collisions. The latest changes mean that rule IDs must be unique within a config. |
Hey guys, have I taken care of all the feedback so we can move forward? |
@EasyRhinoMSFT there was one question outstanding that I asked, which was if you could explain the memory profile test you did. |
Hi Nicholas, wanted to make sure you saw my response up above regarding the memory consumption benchmarks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@EasyRhinoMSFT I did, thanks. I'm just waiting to hear what other team members think about the object vs. map question I posed. |
Okay, it seems there's consensus on an object, so I'll withdraw my
suggestion to use a Map.
…On Mon, Feb 18, 2019 at 9:15 AM Ilya Volodin ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In designs/2019-expose-rules-to-formatters/readme.MD
<#10 (comment)>:
> +The implementation of this feature is very simple and straightfoward. The code location that invokes the formatter's exported interface function already has access to the API it should use to obtain the list of all executed rules. The call to `Engine.getRules` must be made in the try block because `engine` may be null during unit testing.
+
+```js
+function printResults(engine, results, format, outputFile) {
+ let formatter;
+ let rules;
+
+ try {
+ formatter = engine.getFormatter(format);
+ rules = engine.getRules();
+ } catch (e) {
+ log.error(e.message);
+ return false;
+ }
+
+ const rulesMetadata = {};
Map is easier to insert data into, object is easier to deal with for
consumers (serialization), so in my opinion, unless there's a rather large
performance gain by using Map, I would go with an object.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#10 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AACWkp4cI4eAka1exHgRTj2l2pSeTSqyks5vOt-lgaJpZM4aYvJw>
.
--
______________________________
Nicholas C. Zakas
@SlickNet
Author, Principles of Object-Oriented JavaScript <http://amzn.to/29Pmfrm>
Author, Understanding ECMAScript 6 <http://amzn.to/29K1mIy>
|
Since it seems like that was the last open question, and this has been open for over 21 days, I think this is ready to move to the final comment period. @eslint/eslint-tsc |
@nzakas You're listed as having "requested changes" on this PR based on an earlier review, but from looking at the discussion it seems like your suggestions might have all been addressed. Would you mind double-checking that (and potentially approving this PR, if you're in favor of it and there are no outstanding concerns)? |
@nzakas Gentle, respectful ping... |
Since this has been waiting for awhile, I'm going to go ahead and merge this despite @nzakas's old "request changes" review, on the assumption that he supports the change and the review is just out of date. (In #10 (review) he mentioned that he's generally supportive of the change aside from a few comments, and of the three suggestions he made, two were applied and he explicitly withdrew the third in #10 (comment).) |
@EasyRhinoMSFT Thanks for working on this! If you're interested in submitting the change in a PR to the |
Great thank you, I'll get a PR out in the next couple of days. |
No description provided.