-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update knit proposal #73
base: master
Are you sure you want to change the base?
Conversation
Awesome! |
accepted/0000-yarn-knit.md
Outdated
* `yarn pack` + `yarn install dep.tgz` is similar to `file://` dependencies. The pack + install process must be re-run every time a change is made. It does correctly dedupe dependencies, however, as `node_modules` are excluded by `yarn pack`. | ||
|
||
* [Workspaces](https://github.com/yarnpkg/rfcs/pull/66) are similar but solve a different problem. Where workspaces are great for a tree of related modules (e.g. a monorepo), `yarn knit` is for linking together modules in separate trees, e.g. things that might be shared between multiple workspaces. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice overview.
accepted/0000-yarn-knit.md
Outdated
# Alternatives | ||
|
||
Another similar approach is to run `yarn pack`, which creates a tarball of the library's contents as if it were downloaded from npm, and install the tarball in the app used to test the library. This has the benefits of reducing divergence between development and production -- `library/node_modules` is not shared across apps, the developer can work on multiple versions of the library, and the `node_modules` hierarchy is faithfully represented. The downside is that everytime the developer edits a file, they need to re-pack and reinstall the library. | ||
Finally, another issue is with the way the node require resolution algorithm works. Dependencies of symlinked modules are resolved relative to the realpath, not the symlink. This means that you'll still get duplicates if both modules depend on a third dependency, or errors if that dependency is not installed in either place. This is solved by the recently added runtime option for node `--preserve-symlinks`, which skips getting the realpath when resolving modules. Something similar would need to be added to browserify/webpack to solve this there as well. I recently opened a [PR for browserify](https://github.com/substack/node-browserify/pull/1742) to support the same option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is a problem with knitting because on install the knitted dependencies should not have subdependencies packed in the "repository".
So all subdependencies should be installed fresh at referer level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that the linked module may still have a node_modules
folder where you are developing. When you require('dep')
from your app, it gets resolved using the realpath, e.g. ~/dev/dep/index.js
. If that file has require('dep2')
, it is resolved to ~/dev/dep/node_modules/dep2
, and that copy will be used instead of the one installed in your app. I verified this to be the case. It is not the case with the --preserve-symlinks
option, because the resolution algorithm uses the symlink path (~/dev/app/node_modules/dep/index.js
) instead of the realpath, so when resolving dep2
, ~/dev/app/node_modules/dep2
will be used instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does knitting mean that dependencies need run prepublish script and the files listed in package.json/files
should be copied to some temp location?
I think there would not be a symlink ~/dev/app -> ~/dev/dep
, it would probably be ~/dev/app -> (yarn-cache)/dep-knit-<uuid>
and node_modules would be empty there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying files kinda defeats the purpose, no? Then you'd have to run yarn knit
every time you make a change to a dependency. I think the symlink would be ~/dev/app/index.js -> ~/.yarn-knit/dep/index.js -> ~/dev/dep/index.js
for example. The realpath is still the last one, which may have a node_modules
folder in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then there is a contradiction we need to sort out:
The links above indicate that knitting would simulate full npm publishing/installation but without actual publish/install to npm registry.
Symlinking app -> dep
breaks this simulation because dep
will have non publishable artifacts (node_modules, .yarnignore files etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what the middle symlink is for. Only the published files would have a symlink in ~/.yarn-knit/dep
, so (as long as --preserve-symlinks
is used) non-published files would not show up to the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
Then node_modules would not be symlinked and this is not a problem.
@devongovett, what do you think of making That might reduce the terms overload on users. |
@bestander so you mean the following?
|
Yes
Only if
And also all dependencies with |
|
cc @ide who wrote the original knit proposal |
What I tried to say that And we can do linking 3 different ways:
So knitting should be compatible with all of them |
Ah I see, sorry for the misunderstanding. Yes, I think knit could be a mode of link:
What do you think of dropping the need for |
👍
Considering that at step one,
I am worried that it will be confusing. |
Sorry for the delay in reply. Yes, I think the
So my proposal would be:
No new commands, only two new options. |
That naming proposal mostly makes sense to me. I prefer a flag other than I like the |
I think |
Another open question is what put in the lock file. I would say that installing with the |
|
As for yarn.lock - good question.
Would the second command overwrite what is inside node_modules? |
yeah its a bit weird if you commit your |
Updated the RFC with the new naming as discussed above, and renamed the file. Sorry if it makes it harder to see the diff. :/ |
@bestander @ide thoughts on the update? Would like to get started on an implementation if you think this is good. |
|
||
This is the step that simulates publishing `dep`. Running `yarn link --pack` in `dep` finds all the files that "yarn publish" would pack up and upload to npm. Crucially, this excludes `node_modules`, and would follow the same algorithm as "yarn publish" such as reading package.json's `files` field. | ||
|
||
Then it simulates publishing `dep`: it creates a directory named `dep-X.Y.Z` (where `X.Y.Z` is the version of `dep` in its package.json) inside of a global directory like `~/.yarn-link`. A symlink is created for each file or directory that `yarn publish` would normally have packed up. This step shares some conceptual similarities with publishing to a registry, except it uses symlinks and it's local on your computer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use cache folder for this instead?
If we opt in to a new ~/.yarn-link it will fall out of OS cache radars and becomes a risk of growing out of control.
How about <yarn cache>/links/
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, though there should only be symlinks in the global registry anyway, so I don't think it should get that large. Putting it in the cache folder might cause your linked modules to suddenly stop working e.g. after a reboot, which would be weird IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we'll defer this discussion once we agree on the model
|
||
### Running "yarn link --pack" inside of `dep` | ||
|
||
This is the step that simulates publishing `dep`. Running `yarn link --pack` in `dep` finds all the files that "yarn publish" would pack up and upload to npm. Crucially, this excludes `node_modules`, and would follow the same algorithm as "yarn publish" such as reading package.json's `files` field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The disadvantage is that you need to run yarn link --pack
every time you want to iterate.
What do you think of running the --pack
at the time of link consumption?
yarn link
would registerdep
as "linkable"yarn link dep --pack
would run the packing and installation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite. You have to run yarn link --pack
once to set up, and only again if you add/remove a file in the root. If you add a file to a subdirectory (e.g. src/
) it should automatically show up everywhere since we symlink directories in the root. I think this is reasonable. I definitely don't want to have to run a command every time any change is made: might as well use yarn pack
+ yarn install
in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I've missed that point.
I understand that your goal for this change is to improve iteration speed when you are actively developing dep
that is used in one of your projects.
I think the goal should be to achieve correctness of dep
and simulate full publish/install cycle.
We could think of performance tweaks in the next iteration and people who absolutely need speed will just set up their environment to use vanilla yarn link
.
Am I right you want to have this setup?
/path/to/app/node_modules/dep ->(symlink) ~/.yarn-links/dex-x.y.z ->(multiple symlinks) /path/to/dep
I think you won't be able to achieve the right isolation with symlinks inside ~/.yarn-links/dep-x.y.z -> dep
. You'll end up cherry-picking symlinks of files and folders while respecting rules in .yarnignore/.npmingore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the right idea. The dir in the app symlinks to a global directory which contains symlinks to individual files/directories in the root of the dependency.
My goal for this feature is to have something that works as closely to yarn link
as possible but avoids the problem of having multiple copies of dependencies. This is solved by not including node_modules
in the list of symlinks, and installing the dependencies locally in the project you're linking into.
It's a benefit that it happens to be closer to what you'd actually publish, but not the main goal for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of multiple symlinks that work around node_modules and .yarnignore.
That would be quite hard to debug in non trivial cases IMHO.
And yeah, it goes the other direction with the --pack
idea which should be about simulating a publish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason it make it global is so when you re-"publish" (e.g. after adding a new file) you don't also have to re-"install" in every project that uses that dependency. This is because those projects point to the global directory, not to the individual files, so updating the global cascades the updates everywhere.
As for cleaning up references, we have a couple options:
-
We could clean them up when you run
yarn link --pack
again - we'd blow away everything inside the global dir for the package and create new symlinks. This way anything that didn't exist anymore would go away. -
We could do (1) but across all packages in the global dir as a cleanup step: find any broken links and delete them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense if you have multiple projects refer the same package.
Is it a common use case for you?
I though that yarn link
is something that you do temporarily while testing a dependency and don't leave as a link for long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah at work we tend to build our apps as a composition of a bunch of libraries, and we have several different apps which all use some of the same libraries. Setting up and maintaining the links between all of our repos has been a huge pain for us, which is why I'm invested in this feature. Our devs work on different parts of our apps which live across repos all the time, so we want to keep things linked generally. I've written some automation to setup the repos locally for everyone and link everything together using the POC of this yarn feature, and it seems to work pretty well. Ideally we'd use workspaces, but since we have many libraries that are used across different applications, it would be hard to go the monorepo route for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying and investing in this feature, @devongovett.
I agree in general that Yarn is in the position to solve your problem but before merging your RFC I want us to sort out a few things that bother me.
-
I want to keep
knit
proposal separate from this one.
I thinkknit
is about runningprepublish
scripts and making files available to projects that use adep
.
This proposal is about improvinglink
command with a trick around symlink so that node_modules and dev files don't leak fromdep
into a project using it. -
Semantics.
After thinking more,--pack
seems misleading.
I would sayyarn link --exclude-ignored-files
or something like that, I don't insist on the actual flag name but it should not imply that some "packing" or prepublish will be happening. -
I am still not convinced that symlinks should be created in
~/yarn-links
or any other global folder.
Symlinks are cheap to create and any issues with the implementation would be contained inside a project that uses this feature. -
Another thing is this issue [docs] create a page for working with non npm dependencies website#573.
We have quite a few way of working with npm packages locally and I am concerned that we are getting too many very similar solutions that approach the problem from different angles.
I suggest hiding this RFC implementation behind a flag, similar to "workspaces-experimental true", and running with it for a while as an opt-in feature.
So that if we find a better solution we could remove or change it without concerns of breaking the world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the linking behavior is what we (Expo) want, whatever the name of the feature ends up being. These are our needs:
(a) we have a repo with multiple apps and multiple libraries
(b) we want the apps to be able to use the latest code in the libraries
(c) we want the libraries to be installed/linked as if we had published them to npm and then installed them in the apps. npm link
/yarn link
today has very different behavior than installing published code and we can't trust those commands.
(d) we want to be able to write code in the libraries and easily see how it affects an app. symlinks seem like a good way to do this.
Separately, we also do have a need for packing up a library (to simulate publishing) and installing in an app -- this happens in our continuous integration systems. This behavior is even closer to publishing+installing a library, which is why we use it in CI. We run something like: cd library && yarn pack && cd app && yarn add library.tgz
.
However, in CI we don't need a tight feedback loop between editing library code and seeing the effects of those edits in an app. During development, we want to edit library code and see the app change immediately, which is why symlinks are appealing. We have a solution for CI already (yarn pack && yarn add
) but no good one for development today.
@devongovett, I think we are almost there, thank you for your work on this feature, it should be really useful. |
Looks like this PR got stuck :( |
@bestander sorry about that. I got really busy, and a bit confused about what the next steps are for this proposal. However, we really are in desperate need of this feature, so I am happy to work on it if we can come to a well-defined agreement about what we need. Here is my proposal to move things forward:
The global directory is needed so that Please let me know if there are other things that we should cover. |
Hey, @devongovett, no worries, I can totally understand. I wish Yarn would have a plugin API so that people could code their needs to get unblocked. On your proposal.
|
|
||
#### Isolating `node_modules` correctly | ||
|
||
You can install `dep` in two different apps without sharing the `node_modules` of `dep`. This is a problem with Electron apps, whose V8 version is different than Node's and uses a different ABI. If you have `node-app` and `electron-app` that both depend on `dep`, the native dependencies of `dep` need to be recompiled separately for each app; `node-app/n_m/dep/n_m` must not be the same as `electron-app/n_m/dep/n_m`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the proposal will solve this issue. Node.js calls realpath
on every "module". So even when files are linked instead of the whole folder, any lookup origining from the linked module will resolve in the realpath
-> in the node_modules
folder of the linked dependency and not in the app's node_modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not when run with --preserve-symlinks
: https://nodejs.org/api/cli.html#cli_preserve_symlinks. See drawbacks section. I think webpack already supports such an option as well, and I tried to patch browserify. browserify/browserify#1742
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, that makes sense.
How will
Where |
This rfc is pretty old, and predates our workspace feature which is a much safer alternative to |
@arcanis: workspaces may be safer, but they seem to be fairly explicitly intended for planned usage, not ad-hoc "oh man, there's an issue in one of our dependencies, let me go make a PR for that, testing it in our thing so I can easily iterate on fixing our actual problem" use cases. So, features that would actually help with those ad-hoc use cases are still desirable. Since Note: I have no position on whether any given [Hmm. I'm really not good at writing short comments, am I?] |
Based on my experiments on yarnpkg/yarn#3753, I'm updating the knit RFC with a few additional sections as requested by @bestander:
yarn install --knit
option to automatically link dependencies when installing.--preserve-symlinks
option in node for this to work correctly.