Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New: Lint files in parallel #11

Closed
wants to merge 1 commit into from
Closed

New: Lint files in parallel #11

wants to merge 1 commit into from

Conversation

ilyavolodin
Copy link
Member

Summary

This change would enable a new option to lint files in parallel. This should improve performance of ESLint on large repositories.

Related Issues

eslint/eslint#3565

@ilyavolodin ilyavolodin added enhancement New feature or request Initial Commenting This RFC is in the initial feedback stage labels Feb 4, 2019
Copy link
Sponsor

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds awesome - one catch tho; eslint-plugin-import relies on building up a "dependency map", and caching it across various rules.

With this mode, how could that data/processing be shared/reused across parallel rules?

@ilyavolodin
Copy link
Member Author

I don't really know how does eslint-plugin-import shares information across various rules. ESLint was specifically designed to not allow that, so what that plugin is doing goes directly against the original design. I would have to take a look at what that plugin does to suggest work-around (if it at all possible).

@ljharb
Copy link
Sponsor

ljharb commented Feb 4, 2019

The main thing it needs is a dependency graph - a map of imports/requires and exports/module.export values.

@ilyavolodin
Copy link
Member Author

Dependency graph isn't a problem. Problem is sharing it between different rules and different instances of the same rule. ESLint was never designed to do that (more then that, we specifically said that rules cannot know about any other rules).

@ljharb
Copy link
Sponsor

ljharb commented Feb 5, 2019

Noted, but it still allows it at the moment - if there was a way to perhaps explicitly inject a “setup” function (that for the import plugin would populate the graph) that would both work well with this RFC and also would work better with the current setup.

@not-an-aardvark
Copy link
Member

I think a setup function might be a good solution (of course, more details would need to be hashed out). There has been similar discussion about needing to access the entire list of files ESLint is running on in order to run the Typescript compiler only once rather than multiple times.

@ilyavolodin
Copy link
Member Author

My original though behind settings was that it could be a mechanize to share information between rules. But it's read-only, so it can only be set manually in the config file, not populated automatically. Maybe we can extend settings to provide a communication medium for shared information?

@not-an-aardvark
Copy link
Member

not-an-aardvark commented Feb 5, 2019

It's already not very difficult for rules to communicate if they really want to (they can share an object externally even if ESLint doesn't provide that object).

The bigger problem is that a rule has no way to determine which files are being linted; it only knows the current file that it's running on. The original design of rules assumed that rules wouldn't need to know this because a rule would only check things within a single file, but rules in plugins like eslint-plugin-import can provide useful functionality by depending on information outside of the current file.

EDIT: To clarify, this refers to the current state of affairs without parallelism. Communication between rules would indeed be a problem once we add parallelism, although I don't think enhancing settings would be the best way to resolve the issue (see #11 (comment)).

@ilyavolodin
Copy link
Member Author

I'm fine with adding a list of all linting files to context, but I think that should be done outside of this proposal. We will just need to make sure that those two proposals work together correctly.

@not-an-aardvark
Copy link
Member

As it relates to this proposal, the problem is more broadly that:

  • Some rules want to perform an expensive operation once across the entire codebase, and have the result be used for multiple different files. (For example, a rule might want to run the TypeScript compiler on an entire codebase once, and use the results when linting each file.)
  • This is currently accomplished by performing the expensive operation when linting the first file, and reading the results from a cache when linting other files.
  • This will be harder to do correctly when files are linted in parallel, because files won't be linted in a total order (i.e. there is no "first file" that gets linted). It will also be more difficult to establish a shared cache between multiple invocations of the same rule. (It would still be possible to accomplish this using global locks, but it seems like this would be messy.)

In other words, having a list of all files would be nice, but it's not directly related to parallelism. The main concern introduced by parallelism is the lack of shared memory between rule invocations, because this makes it more difficult for rules to use a cache for expensive operations. (My comment at #11 (comment) might have been misleading, I've edited it to clarify.) The advantage of a setup/prelint function is that it would run once before running rules on any files, which would eliminate the need for rules to implement locks/parallel caching.

@JamesHenry
Copy link
Member

Thank you for the advocacy for that use case @not-an-aardvark! Excited about this proposal

@ilyavolodin
Copy link
Member Author

Fair enough. I think that could be solved by providing a setup function that would run before linting occurs (on the main thread) and would allow for modification of context that's passed to the rules. That would still prevent rules from talking to each other, though (which I think is fine). Question is, how does plugin author gets access to that setup function? Another config property?

@not-an-aardvark
Copy link
Member

By the way, have you seen #4? It's not necessarily a problem to have two RFCs open about the same topic, although it's not clear how the proposal in this RFC differs from that one.

@ilyavolodin
Copy link
Member Author

ilyavolodin commented Feb 6, 2019

Oh... I've seen it, but didn't have time to fully read it and assumed that, based on the title, it was talking about making ESLint async, not parallel. I'm fine with incorporating this PR into #4

@nzakas
Copy link
Member

nzakas commented Feb 6, 2019 via email

@ljharb
Copy link
Sponsor

ljharb commented Feb 6, 2019

@nzakas definitely it makes sense to make it a different RFC :-) however, it might be seen as a blocker for this one, so that parallel mode doesn't break eslint-plugin-import.

@kaicataldo
Copy link
Member

Question is, how does plugin author gets access to that setup function? Another config property?

If we did this, I think we would have to make it clear that this setup function can not be used to modify any source files, since I'm not sure how we could guarantee the order in which individual plugins run their respective function.

@felixfbecker
Copy link

I wonder how this would work with typescript-eslint

@jamiebuilds
Copy link

Parallel mode could break plugins as long as parallel mode isn't the default (it's not a breaking change if it needs to be turned on). @ljharb I'd be happy to work on fixing up the import plugin to support parallel mode

@ljharb
Copy link
Sponsor

ljharb commented Mar 1, 2019

@jamiebuilds i think having a synchronous "setup" step plugins can opt into is the simplest; then every plugin can migrate to it safely now, and it'll Just Work with parallel linting later. It would also help improve debugging how long each rule in a codebase takes when the setup tasks can be separated.

@btmills
Copy link
Member

btmills commented Jul 16, 2020

Thank you all for the discussion in this thread and particularly @ilyavolodin for putting the RFC together! A lot has changed in the core ESLint APIs as we worked toward making parallel linting possible. The TSC decided today to consolidate the parallel linting work in #42, which @mysticatea is updating to reflect a prototype implementation, so please follow along there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Initial Commenting This RFC is in the initial feedback stage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants