Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve inputs/outputs on NpmInstall tasks #157

Open
etiennestuder opened this issue Mar 19, 2021 · 10 comments
Open

Improve inputs/outputs on NpmInstall tasks #157

etiennestuder opened this issue Mar 19, 2021 · 10 comments
Milestone

Comments

@etiennestuder
Copy link

etiennestuder commented Mar 19, 2021

From the npm docs:
https://docs.npmjs.com/cli/v7/commands/npm-install#description

This command installs a package and any packages that it depends on. If the package has a package-lock, or an npm shrinkwrap file, or a yarn lock file, the installation of dependencies will be driven by that, respecting the following order of precedence:

npm-shrinkwrap.json
package-lock.json
yarn.lock

In practice, this means that when running npm install and, for example, a package-lock.json is present, that package lock file will serve as an input and not as an output.

@deepy
Copy link
Member

deepy commented Mar 19, 2021

When it comes to package-lock.json we have a tricky situation, with the current functionality the npmInstall task can be configured to run npm ci instead of npm install and in that case package-lock.json is only ever used as an input, but when it comes to npm install it will modify the file if it installs.

Now if my memory serves we opted for having this as an output to prevent npm install from running, changing the file, and then running again on the next build since the input has changed, though this might not be the optimal solution.
B

@etiennestuder
Copy link
Author

Thanks for the insights, @deepy .

but when it comes to npm install it will modify the file if it installs.

If you run npm install (with no args) and have a package-lock.json present, the package-lock.json file is read but not written (i.e. it is then not an input but an output).

It's tricky given the various behaviours of the underlying npm install command.

@deepy
Copy link
Member

deepy commented Mar 19, 2021

Is that still true if package.json has changed but not package-lock.json?

(which isn't necessarily the use-case we want to optimise for)

@etiennestuder
Copy link
Author

Is that still true if package.json has changed but not package-lock.json?

My understanding is that whenever you modify package.json, it always implicitly updates the package-lock.json.

@etiennestuder
Copy link
Author

I'm aware of the following scenarios with npm install:

  • npm install is run when package.json and package-lock.json are both not present --> both files are created (i.e. they are both outputs)
  • npm install is run when only package.json is present --> package-lock.json is created (i.e. one is an input, the latter is an output)
  • npm install is run when package.json and package-lock.json are both present --> both files are updated (i.e. they are both outputs)

The three scenarios above show that with Gradle's input/output annotations you cannot properly model all scenarios. After discussion with one of our build caching experts, we believe it is best to drop the inputs and outputs annotations on the NpmInstall task alltogether and to always run the task, and let npm decide what to run and what to avoid to run.

Marking the node_modules directory as an output is also problematic. This directory can be very, very big, leading to (performance) overhead when Gradle has to calculate the hash of the node_modules folder and when copying in and out the content of the node_modules folder. At the same time, npm itself is already doing basically the same in terms of file I/O by copying over the needed modules from a local cache. Thus, there is not much to gain by using Gradle's caching functionality on top of npm's caching. The latest npm version even has a package-lock.json in the node_modules folder to speed up calculating the state of the node_modules folder. This behavior cannot be mimicked in Gradle.

When one build tool (Gradle) calls another build tool (npm), we feel it is best not to have the invoking build tool duplicate the optimizations that are present in the invoked build tool. The mimicking of the optimization logic might not be fully correct in all cases and as the underlying build tool evolves it will become increasingly hard/impossible for the invoking build tool to provide the same optimzation logic. Also, even if one could fully correctly duplicate the optimization logic in the invoking build tool there is very likely no performance gain compared to what the underlying build tool achieves given the highly dominating I/O operations.

@nickcaballero
Copy link

nickcaballero commented Nov 2, 2021

Might be able to use new feature in Gradle 7.3 to just disable state tracking for task.

Task.doNotTrackState()

gradle/gradle#9095

@nickcaballero
Copy link

Gradle 7.3 has been released and talks about npm specifically as something that would benefit from the doNotTrackState feature.

https://docs.gradle.org/7.3/userguide/more_about_tasks.html#sec:untracked_external_tool

@deepy deepy added this to the 3.2 milestone Dec 7, 2021
@deepy
Copy link
Member

deepy commented Jan 4, 2022

Opting out of tracking requires 7.3 and that requires groovy 3 which requires spock 2.0 which breaks (see #207)

So I'm moving this to 3.3

@deepy deepy modified the milestones: 3.2, 3.3 Jan 4, 2022
@nickcaballero
Copy link

@deepy Did you have an issue upgrading to 7.3? I don't recall having to do anything special to upgrade and have been using the doNotTrackState feature.

@deepy
Copy link
Member

deepy commented Jan 6, 2022

@nickcaballero 7.3 upgrade was smooth but required upgrading to groovy 3 and that meant upgrading spock to 2.0 (from 1.3) which broke the RunWithMultipleVersionsExtension I looked into it briefly but didn't have enough time to fix it

So no issues with Gradle, but our test suite broke with the spock upgrade

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants