Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add NPM package set generated by node2nix #16886

Closed
wants to merge 1 commit into from

Conversation

svanderburg
Copy link
Member

Motivation for this change

The node-packages set is horribly broken and does not work properly with newer versions of Node.js 5.x/npm 3.x and onwards. This pull request is a WIP replacing the package set with expressions generated by node2nix that has better support for newer npm versions (and still retains compatibility with npm 2.x).

How to use the generated packages by node2nix

Installing a NPM package for node.js 4.x:

$ nix-env -f all-packages.nix -iA npmPackages_4_x.grunt-cli

Installing a NPM package for node.js 5.x:

$ nix-env -f all-packages.nix -iA npmPackages_5_x.grunt-cli

The old npm2nix infrastructure is still in place, so both versions can be used.

This patchset is meant for the release-16.03 branch. If we want to apply this to master, we need to use an updated version of node2nix that computes that output hashes of the git repositories in a different way.

How to generate/update packages?

The files live in pkgs/development/node-packages. See the README file in the corresponding folder for more instructions. They should be straight forward.

You need to have node2nix installed first: https://github.com/svanderburg/node2nix

Things done
  • Tested using sandboxing
    (nix.useChroot on NixOS,
    or option build-use-chroot in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • OS X
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@copumpkin
Copy link
Member

copumpkin commented Jul 12, 2016

Both great and terrible, at the same time, thanks 😄

From here:

[09:18:25] in fact, we basically can't afford another haskellPackages

See also NixOS/nix#954.

As much as I love these autogenerated package sets, if every few weeks/days we get another >50000-line diff, usability of this repo will be hosed forever unless we decide to rewrite history once we solve the IFD issue. I don't know what to do, but I'd suggest we look pretty hard at whether there are better solutions than ginormous periodic autogenerated commits to nixpkgs.

Ping @edolstra @obadz @peti @shlevy @vcunat @domenkozar

@shlevy
Copy link
Member

shlevy commented Jul 12, 2016

We could fix this by improving import from derivation, like I did in #31. We could fix this by adding recursive nix, like I did in #213. We could fix this by allowing fetchTarball to work in restricted mode, like I suggested in the comments of #16130. The work has already been done, we just need @edolstra to sign off on one of these.

@copumpkin
Copy link
Member

copumpkin commented Jul 12, 2016

I'm not sure allowing fetchTarball in restricted mode is enough to really fix this, but it would definitely improve the situation over what we have today (and allow us to keep these giant commits elsewhere). Your other points definitely make sense, but I think you meant NixOS/nix#52 or NixOS/nix#31 or NixOS/nix#213?

@svanderburg
Copy link
Member Author

I'm afraid that for NPM packages, we have to cope with the fact that they are big.

In theory, I could split the generated expression from the node-packages.json file into multiple files, but the dependencies that each package requires need to be composed separately for each individual package (thanks to some awesome (not!) design decisions in npm), still making each file quite big. :-(

Also, splitting the expression into multiple files, probably means more redundant data for each of them.

@shlevy
Copy link
Member

shlevy commented Jul 12, 2016

Oops, forgot what repo we were in. Yes, NixOS/nix#52 or NixOS/nix#213 could fix this. Allowing fetchTarball in restricted mode would work as well, as we can then fetch the repo containing the npm packages (either the latest or a fixed rev).

@shlevy
Copy link
Member

shlevy commented Jul 12, 2016

(that's not as good as the other two solutions, as it requires something to regularly update the other repo, but it's also quicker)

@copumpkin
Copy link
Member

@svanderburg not saying they don't need to be big; saying they shouldn't have to be in nixpkgs. The expressions are generated by some automated process, and Nix is great at managing automated processes that produce outputs (i.e., all our thousands of packages and builds), except when the outputs need to be consumed by Nix itself 😄

Basically, if I understand correctly, the 50000+ lines in this commit diff are the result of running a much much smaller script over some external (possibly untracked right now) state. My ideal would be to minimize the NPM expression by teaching Nix how to produce it, and then using the standard caching mechanism to avoid getting everyone to produce it every time. If you want to sound really nerdy, let's do our best to get closer to the actual Kolmogorov complexity of these 56464 lines of diff.

@shlevy yeah, I get it; it would be a great first step to support fetchTarball but ideally, I'd rather we not be making automated commits to other repos either and just represent the process explicitly. It seems fine to do this in smaller steps though.

@gilligan
Copy link
Contributor

I would have preferred if this PR had replaced npm2nix instead of being introduced as node2nix. After all npm2nix does indeed need to be replaced and this way I feel it might just take even longer until this happens. Also new users will find a lot of references to npm2nix but nothing about node2nix

I am also undecided about having the generated packages. I'm afraid that if there isn't at least a concept for some automatic updates then these generated expressions will soon be as entirely outdated as those we currently have in nixpkgs which were created by npm2nix.

@svanderburg
Copy link
Member Author

@gilligan Removing the old npm2nix generated package set isn't terribly difficult, but at the same time I don't want to overrush things.

What is our point of view regarding PRs that include generated Nix expressions? Should we from now on exclude generated packages from Nixpkgs altogether and store them elsewhere? As far as I know: in Nixpkgs, the fetch-bower function needs the fetch-bower NPM package to be available.

In NPM's case, managing a set of packages outside Nixpkgs is quite simple and something I have been doing for quite a while.

@svanderburg
Copy link
Member Author

So far I observed two "negative observations". One is of them is that we now have two generated package sets that may be confusing to end-users. This problem will go away very quickly once I gathered enough feedback on the stability of my generated package set.

Gathering feedback is the main objective of this PR -- I wanted to show what the expressions generated by node2nix look like in Nixpkgs and how to "properly" package the the right set of NPM packages (that is: only end-user packages, and no libraries).

The other is a fundamental issue that applies to any generated package set. I'm not sure if this concern is a showstopper at the moment. I'm also very interested in seeing a more powerful/flexible solution making it possible to alleviate these issues, but currently none of them seem to be close for inclusion in Nix.

}} $TMPDIR/webdrvr/chromedriver_linux64.zip
'';

dontNpmInstall = true; # We face an error with underscore not found, but the package will work fine if we ignore this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no-flags are very confusing. It's like saying "no = true".
What about using npmInstall = false?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is very ugly -- however, what I'm basically doing here is following the conventions the stdenv generic builder uses, e.g. for stdenv.mkDerivation {} you can set do* and dont* flags to enable or disable certain steps, e.g.:

stdenv.mkDerivation {
  ...
  doCheck = true;
  dontStrip = true;
}

so it's as nice (or as ugly!) the generic builder is.

Also, in the future I'm planning to create a better structured abstraction function that (for example) uses the generic builder's phases and so on.

@svanderburg
Copy link
Member Author

Due to lack of spare time, I was absent for a while in this discussion. I'm not sure if the complexity/size of the generated expressions is a showstopper, but if nobody objects, I'm planning to integrate this very soon in the upstream Nixpkgs.

For the people that are worried about the confusion of the availability of npm2nix: At the moment, the most important thing for me is to provide a practical solution (that covers my own use cases as well as a few other Nix users that I know) rather than imposing a specific solution to everybody.

We can also consider npm2nix to be "deprecated" and display a message when somebody tries to install a package from the attribute set generated by it. We can easily do this, by adding the following line to pkgs/top-level/node-package.nix:

builtins.trace "nodePackages are deprecated. Please consider using npmPackages"

This will encourage people to switch, but it does not force them. This warning will give people some space to actually make the transition.

@svanderburg
Copy link
Member Author

Today, I have been trying to implement the proposed integration pattern described in this pull request.

Unfortunately, I noticed that npm2nix again does not work for neither the master branch nor the release-16.03 branch.

Although I can spend some time in trying to fix it, I think it is not really worth the effort.

So my question is: if I would remove npm2nix's stuff altogether and replace it by node2nix's generated packages (with the library packages removed). Would anyone object?

@gilligan
Copy link
Contributor

gilligan commented Aug 5, 2016

@svanderburg as for me I advocated for replacing npm2nix with your rewrite from the get go anyway so I am 👍 on you going forward with this.

@svanderburg
Copy link
Member Author

I just updated my branch (that tracks release-16.03) to remove all the old npm2nix-related stuff and replace it by expressions generated by node2nix. Unfortunately, the sad story is that the costs of doing these generations (with node2nix) are quite expensive. Maybe a bit too expensive after what I learned by doing it a couple of times. :(

First, the time to generate these expressions (despite restricting the packages to end-user software only), takes a ridiculous amount of time. On my machine, it took about 30 minutes. Second, the code churn is huge (e.g. 50K+ lines of code change per package set).

While it is possible to proceed integrating this branch and getting rid of the broken npm2nix generated expressions, I'm slowly starting to doubt the usefulness of this planned approach and I have been thinking about something new.

Maybe it's better to restrict the set of NPM packages in Nixpkgs to the bare essentials -- node2nix and NPM packages that are in use by non-NPM projects only. For the remainder of the NPM packages, we should instruct end-users how to package them with node2nix.

I think is step is quite controversial, as some people were used to the fact that Nixpkgs offers them, which is no longer the case with the new proposed approach.

Second, if we would proceed, then the documentation becomes more important. We should clearly explain end-users what the preferred approach is. Moreover, IMO the biggest challenge is to get NPM packages packaged in Nix is that some have non-NPM dependencies. These expressions cannot be fully auto-generated and have to be augmented, something that is very confusing for people having only little hands-on experience with Nix.

I think most of you may probably agree in restricting the amount of NPM packages, but I can imagine that for some people this will be difficult to accept. Accurately mimicking NPM the Nix-way is very complicated and offering the same kinds of facilities we did earlier is simply too expensive.

What do you think?

@bennofs
Copy link
Contributor

bennofs commented Aug 9, 2016

Here's what I think about this (at least for Nixpkgs).

Disclaimer: I've only recently started using NPM, so if I made any assumptions here about NPM that are not true or overlooked some fundamental problem with NPM, please correct me!

Only package applications

The only thing that nixpkgs needs to make sure is that it is easy to package npm applications. The reason for this is that packaging libraries as nix derivations goes against what NPM itself offers: NPM has no concept an installed library, the only concept it has is "mirror all libraries into the current application and build them together", so I think nixpkgs should follow that same approach here.

Generate nix derivations from npm-shrinkwrap.json files

If we adopt the approach that we only want to node applications in nixpkgs, then we can rely on npm's dependency solver to resolve dependencies. We simply require that for each node application that we package in nixpkgs, there is a npm-shrinkwrap.json file.

If we have a npm-shrinkwrap.json file, that should mean that all dependencies are exactly nailed down, so we can write a derivation to fetch all dependencies of a npm package (this derivation does not even need to be generated, i think it could be written generically) and builds a npm cache directory by using npm cache add. The output of this derivation is the contents of the cache, and because our dependencies are fixed, this should have a fixed hash, so we can access network in this derivation.

In the derivation for the package, we then simply instruct npm to use the cache that we previously generated and run npm install. All native dependencies can be added to buildInputs to this derivation, as usual.

Developing node applications

With this approach, you can also easily develop node applications, since you just use npm install for that. Because native dependencies are buildInputs of the derivation for the application, they should also just work if you use nix-shell.

@Ralith
Copy link
Contributor

Ralith commented Aug 27, 2016

@svanderburg Forcing end-users to package npm projects themselves seems like it would make NixOS service modules for npm projects such as #17213 awkward.

It's a relief to see this stuff getting attention.

@domenkozar
Copy link
Member

Would be wonderful if we can get this done until tomorrow :D Currently node packages are really broken.

@gilligan
Copy link
Contributor

i am in full support of reducing prepackaged npm modules packaged in nix to a bare minimum. The packages have always been out of date anyway. I think having an improved tooling for handling npm packaging is more important than having a wide range of prepackaged npm packages. Also from what I can tell the userbase for those seems very small.

@adnelson
Copy link
Contributor

Hi guys, since I got asked on IRC I thought I'd chime in here. At my company we have been using nix to build NPM packages for the last 6 months or so. However, npm2nix did not fit our needs for several reasons. I developed a tool nixfromnpm that does the same functions but with a number of improvements (see the readme for details). In addition, I maintain a very large set (2075 at current count) of node packages in this repo. I haven't spent a lot of time spreading the word about this repo or tool, but I would be happy to contribute the work I've done to addressing some of the issues here.

@adnelson
Copy link
Contributor

@copumpkin you wrote

As much as I love these autogenerated package sets, if every few weeks/days we get another >50000-line diff, usability of this repo will be hosed forever unless we decide to rewrite history once we solve the IFD issue. I don't know what to do, but I'd suggest we look pretty hard at whether there are better solutions than ginormous periodic autogenerated commits to nixpkgs.

I'd just point out that the above issues are some of the prime reasons why I wrote nixfromnpm...

@gilligan
Copy link
Contributor

hi @adnelson - thank you for chiming in.

I'd just point out that the above issues are some of the prime reasons why I wrote nixfromnpm...
How long does it take nixfromnpm to parse the node packages currently provided by nix?

I have tried nixfromnpm in the past and there are indeed a couple of benefits that it provides. First and foremost the ability to do incremental builds are great. Also nix-node-packages is great.

That being said I had some issues when I tried to nixify all dependencies of our (quite large) nodejs project. Since it has been a while I could try that again though. I also wish nixfromnpm would use npm3 with flattened dependencies by default.

Either way nixfromnpm is a great project. If we are to consider it as an alternative replacement to npm2nix instead of node2nix what would the implications of that be?

@adnelson
Copy link
Contributor

adnelson commented Aug 30, 2016

The speed issues were fixed by not checking for circular dependencies (there are only a tiny number of packages which have these issues, and I included docs on how to address these). Certainly there are remaining issues, especially with semver range resolution, and if you try to nixify a large project from scratch you'll probably run into at least one of them. But most could probably be addressed just by having more community involvement. Features like flattened dependencies, easier/better npm3 usage, etc would all be great to have for sure.

I'm not sure what the implications would be. We might for example merge nix-node-packages into nixpkgs and subsequent updates would happen as pull requests to that repo. Alternatively, nix-node-packages could be made a submodule of nixpkgs -- submodules are an approach my company uses a lot (in fact we have one for nixpkgs) but come with a cost of increased complexity and harder diffs. Assuming nix-node-packages got brought into nixpkgs, the main implications would be that package definitions would be easy to tweak (similar to how python packages are now), and node-related diffs would become tiny and easy to read.

@svanderburg
Copy link
Member Author

I must admit that I have not tried nixfromnpm, so I don't know a lot about its capabilities.

From what I read, I noticed that it seems to make some trade-offs to make things faster (e.g. ignoring cyclic dependencies)

What I can say about node2nix is that I'm trying to simulate npm's behaviour as accurately as possible. Compatibility is one of the reasons I have started it, because it is highly desired by the people I work with. Many of them use npm without Nix and don't want to make tradeoffs in supporting Nix. As a matter of fact: some of them say that if it's not compatible with npm, they refuse to use any Nix-based solution.

Naturally, accurate compatibility comes at a price, such as not being able to share library dependencies among packages (as a sidenote: this is also not something the vanilla npm supports). This also results in a generation process that is slower than npm2nix. Moreover the expressions generated by node2nix are quite big.

Moreover, node2nix can also be optimised in certain areas to make regeneration less expensive -- I simply have to find more time to implement these optimisations. :)

The changes in this pull request should work fine. Its only disadvantage are the costs, i.e. the time to regenerate packages and the size of the generated expressions. Furthermore, we have to generate two expression sets: one that simulates npm 2.x's behaviour and one the simulates npm 3.x's flat module behaviour, because dependencies are organized in a completely different way.

This pull request is tracking the last stable branch 16.03, because the master/16.09-branch has a backward-incompatible change in the fetchgit{} function computing different output hashes of git checkouts.

I can easily fix node2nix to compute git hashes that are compatible with master and integrate this package set into master.

If we are willing to accept the costs that come with regeneration, then I think there is nothing holding us back integrating this patch set into master. The backwards-incompatibility with fetchgit and the costs of regeneration are the two reasons why I haven't proceeded integrating this change set sooner. :(

@svanderburg
Copy link
Member Author

Meanwhile, I have created a branch of node2nix that fixes the fetchgit problem (but makes this node2nix branch incompatible with the Nixpkgs 16.03 branch):

https://github.com/svanderburg/node2nix/tree/fetchgit_fix

Corresponding node2nix issue:

svanderburg/node2nix#7

@svanderburg
Copy link
Member Author

Tomorrow, I will generate set of NPM packages with the modified node2nix tracking the master (future 16.09) branch. If nobody objects, it can integrated in a straight forward manner.

@adnelson
Copy link
Contributor

adnelson commented Aug 30, 2016

@svanderburg I'm not sure what you mean by "if it's not compatible with npm, they refuse to use any Nix-based solution." What does it mean for a solution to be "compatible with npm"? It's still a different tool at the end of the day. The buildNodePackage function that nix-node-packages uses still uses npm under the hood (in fact it's adapted from the buildNodePackage function which I believe you wrote, although quite heavily modified by now).

There are some oddities in how npm resolves certain dependencies. These aren't properly "simulated" by npm2nix (I haven't checked how nixfromnpm deals with it, maybe it does a better job, but I don't know yet). However, so far I have mostly encountered them in certain scenarios in our private projects.

For example, we have a private project that has a dependency on mongoose 3.6.8 and another library named: ironhorse. ironhorse in turn depends on mongoose: "*". npm2nix resolves the '*' version specifier to the latest version of mongoose, while npm omits including dependency altogether, because it has been resolved already by the parent package. As a result, ironhorse uses mongoose 3.6.8, which is what we (strangely) expect. When we use npm2nix, we run into a dependency conflict because ironhorse binds to mongoose 4.4.7 (which is at that moment, the latest version).

My colleagues expect the Nix generator to do the same thing as npm, including replicating this npm oddity. node2nix supports this kind of (counter-intuitive) behaviour. That's what they mean by: "it should be compatible".

Despite the fact that we run npm inside a derivation, npm2nix (and node2nix) bypass its dependency management system. Instead, we "simulate" npm's dependency management ourselves (using Nix's features) and we run npm in the end to perform validation + running the build scripts that may have been included in an NPM package.

nixfromnpm doesn't automatically account for circularity, but with over 2000 packages that I have defined so far, only four of them are circular. I'm sure there are more examples out there, but I don't think it's at all a dealbreaker for making a solution that works.

That may be true. In our case, however, all our private projects have dependencies on packages having circular dependencies.

Second, it's a also preference -- do we prefer accuracy or performance? both sides of the coin have their pros and cons...

I'm not going to jump in and oppose this per se (I only recently was made aware of it after all), but seeing as you said`

Well, I'm just trying to figure out what we want. node2nix is obviously not perfect, as it is (for example) expensive because it tries to be accurate, whereas your solution is probably faster but makes trade-offs in accuracy.

I could (of course) impose node2nix to every Nixpkgs user, but if that only pleases me and a handful of other Nix+NPM users, then it's probably a bad choice to move it forward. That's also the reason why I'm trying to gather as much feedback as I can with this PR.

@svanderburg
Copy link
Member Author

In addition to accuracy and performance: how does nixfromnpm deal with npm 3.x's flat module installations? I also think it's desired to support newer versions of Node.js/NPM?

@Ralith
Copy link
Contributor

Ralith commented Aug 31, 2016

Tomorrow, I will generate set of NPM packages with the modified node2nix tracking the master (future 16.09) branch. If nobody objects, it can integrated in a straight forward manner.

What's the workflow look like for adding or updating a package that uses npm packages (but isn't necessarily itself listed on npm)? I'd really like to be able to get things like #17213 in, and update it regularly (since it's in heavy development).

@gilligan
Copy link
Contributor

@Ralith i'm afraid that this PR won't really change anything about the problems of #17213 - @svanderburg will correct me if I am wrong I guess.

I think the whole issue does yet again condense into the problem of including huge nix expression sets in nixpkgs - Just like for example for Haskell or other languages. @adnelson maintains nix-node-packages which I think provides great value. It would be nice to have an equivalent collection for use with node2nix but like I kept on mentioning the previously npm2nix generated set of npm packages including various libraries was utterly outdated and mostly broken.

For that reason I am in favour of reducing the amount of maintained npm packages to a minimum - however that does of course mean that if you want to include some tool with a lot of dependencies like in #17213 you end up with a huge chunk of generated nix expressions...

I don't know what the best way forward is for that ...

@svanderburg
Copy link
Member Author

@Ralith If the package is a development project, then you should use node2nix directly on the package.json file of the development project.

For experimentation purposes with unstable/unreleased versions of NPM packages, you can also compose your own pkgs.json file (that defines an array of NPM dependency specifiers) referring (for example) to a git repository. You can use node2nix to generate an expression set and use Nix to directly deploy from the generated expressions, e.g.:

node2nix -i mypkgs.nix
nix-env -f default.nix -iA mypackage

@svanderburg
Copy link
Member Author

I have created another pull request, that can be applied to the master branch: #18151

I think discussion should resume in the corresponding thread.

@adnelson
Copy link
Contributor

adnelson commented Aug 31, 2016

Despite the fact that we run npm inside a derivation, npm2nix (and node2nix) bypass its dependency management system. Instead, we "simulate" npm's dependency management ourselves (using Nix's features) and we run npm in the end to perform validation + running the build scripts that may have been included in an NPM package.

Yes, that's the same with nixfromnpm.

That may be true. In our case, however, all our private projects have dependencies on packages having circular dependencies.

I think circular dependency detection/solving is important because as you mention it's supported by npm. However, although nixfromnpm doesn't currently support it, it wouldn't be a huge challenge to do so. Due to it being a one-man project, and since as I said circularity "in the wild" is rare, I haven't spent the time tom implement this. But there's already a somewhat well-defined algorithm for solving circularity in cases where it occurs, and nixfromnpm can already detect circularity (to prevent infinite loops), so it is certainly feasible to support circularity solving in nixfromnpm. Of course, I realize that "it could support it with some work" is not the same as "it already supports it" 😉

Well, I'm just trying to figure out what we want. node2nix is obviously not perfect, as it is (for example) expensive because it tries to be accurate, whereas your solution is probably faster but makes trade-offs in accuracy.

I might be misunderstanding but I don't think that it makes any trade-offs in accuracy.

@svanderburg
Copy link
Member Author

I'm closing, because 16.09 branch is going to get created very soon. Instead, go to #18151, for further discussion, since this tracks the master branch.

@svanderburg svanderburg closed this Sep 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants