Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(jest-haste-map): Enable crawling for symlink test files #9351

Merged
merged 14 commits into from
Apr 2, 2021

Conversation

mrmeku
Copy link
Contributor

@mrmeku mrmeku commented Dec 26, 2019

Summary

Jest haste maps test file crawler (see packages/jest-haste-map/src/crawlers/node.ts) uses the unix find command by default to find files whose name matches the testMatch pattern defined in jest.config.js. This works great for most cases, but works poorly when executing jest under bazel.

In bazel's execution environment, all source/test files are symlinked into a sandboxed directory rather than copied over directly. Jest's crawler is currently set up to use find's type flag to filter only for files. This excludes symlinked files

Under bazel I would expect my symlinked test files to be able to be crawled. This merely involves tweaking the command line args to find to include both files and symlinks --type f ---> (--type f -o --type=l)

Test plan

I have already tested my fix against multiple bazel environments to verify that the fix enables jest to work properly for that test runner. The only risk of this change is that my adding support for crawling symlinks hampers performance of file crawling. If that is the case this behavior might be better off to be enabled behind a flag. But I do not suspect that there will be a performance hit. If reviewers have a mechanism for running jest bench marks I'm happy to create a repo filled with both files and symlinks and benchmarking the change.

@facebook-github-bot
Copy link
Contributor

Hi mrmeku! Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@mrmeku
Copy link
Contributor Author

mrmeku commented Dec 26, 2019

CLA has been signed

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@jeysal
Copy link
Contributor

jeysal commented Dec 27, 2019

Hi @mrmeku, thanks for the issue and PR! Unfortunately I think the fix is not as simple. This will break e.g. if ln -s /etc etc.test.js - EISDIR: illegal operation on a directory, read. There's a couple of issues/PRs for proper symlink support on the tracker but it's not a simple task.

@mrmeku
Copy link
Contributor Author

mrmeku commented Dec 27, 2019

@jeysal If I handled the case for filtering out symlinked directories would you be satisfied with this change?

For background:

  • I've seen the other changes related to supporting symlinks (feat: opaque symlink support #7549) but they seem to have been abandoned. Looking at the approach of that existing PR I wasn't convinced it was correct.
  • In general, the approach in the PR I linked to was to track files/symlinks separately so that symlinks could be resolved to real file paths at a later stage after crawling. However that seperation does not seem necessary to me. It adds complexity to the code base and the crawler is already equipped to resolve the symlinks as it immediately calls fs.stat on those files identified by calling find.

Given that I've identified that this change does solve a bug with executing jest under bazel and prior attempts aren't showing much traction, I'd like to continue this PR. Can we work together to identify the short comings of this approach so that I can resolve those short comings?

@jeysal
Copy link
Contributor

jeysal commented Dec 27, 2019

With a check that the linked node is a file I couldn't come up with an example that breaks anymore, I'm just not sure that I wholly understand the implications of the change because I'm only partially familiar with the low level parts of Jest. Unfortunately, I don't know who else would, since there is no one from FB working on Jest atm, and @SimenB, who has spent the most time on Jest, didn't do much with haste-map afaik. I only know @cpojer looked at symlink handling more broadly for Metro and Jest some time ago - not sure what the outcome was?

@mrmeku
Copy link
Contributor Author

mrmeku commented Dec 30, 2019

@jeysal I'll update my PR to do the isFile check while we search for the right code maintainer to review the change. In the worst case, I can always hide this logic behind a flag so that it's not on by default. I'll defer to the opinions of @SimenB / @cpojer on that decision.

Just my two cents, so far as I can tell, I think that the majority of user's will not be impacted by a change allowing symlinks to be crawled since the majority of users do not use symlinks. Only special workspaces / runtime environments make use of symlinks directly in their source/test code. And at least for Jest execution in Bazel's runtime env, the current default behavior is a bug and this change eliminates that bug.

@mrmeku mrmeku force-pushed the crawl-symlinks branch 3 times, most recently from 4497c98 to 3888ea9 Compare December 30, 2019 17:56
@codecov-io
Copy link

codecov-io commented Dec 30, 2019

Codecov Report

Merging #9351 into master will decrease coverage by 0.38%.
The diff coverage is 26.31%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9351      +/-   ##
==========================================
- Coverage   65.09%   64.71%   -0.39%     
==========================================
  Files         287      280       -7     
  Lines       12145    11998     -147     
  Branches     3007     2961      -46     
==========================================
- Hits         7906     7764     -142     
+ Misses       3604     3601       -3     
+ Partials      635      633       -2
Impacted Files Coverage Δ
packages/jest-haste-map/src/crawlers/node.ts 72.41% <26.31%> (-14.26%) ⬇️
packages/pretty-format/src/plugins/ReactElement.ts 86.48% <0%> (-6.02%) ⬇️
packages/jest-environment-node/src/index.ts 51.16% <0%> (-5.25%) ⬇️
packages/jest-matcher-utils/src/index.ts 82.69% <0%> (-4.13%) ⬇️
packages/jest-reporters/src/coverage_reporter.ts 51.21% <0%> (-2.68%) ⬇️
packages/jest-resolve/src/index.ts 45.25% <0%> (-2.46%) ⬇️
packages/jest-snapshot/src/inline_snapshots.ts 80.39% <0%> (-1.61%) ⬇️
...es/pretty-format/src/plugins/ReactTestComponent.ts 81.81% <0%> (-1.52%) ⬇️
packages/jest-transform/src/ScriptTransformer.ts 68.63% <0%> (-1.01%) ⬇️
packages/expect/src/jestMatchersObject.ts 75% <0%> (-0.87%) ⬇️
... and 60 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ee214d...24d8b03. Read the comment docs.

Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test (e2e I guess due to the nature of the change)?

CHANGELOG.md Outdated Show resolved Hide resolved
@mrmeku
Copy link
Contributor Author

mrmeku commented Jan 1, 2020

@SimenB e2e test added successfully

@mrmeku mrmeku force-pushed the crawl-symlinks branch 3 times, most recently from 43f5092 to d16f677 Compare January 2, 2020 21:15
@mrmeku
Copy link
Contributor Author

mrmeku commented Jan 2, 2020

@SimenB I've updated the commit to fix lint errors, include copywrite for the test file I added and also augmented the non-native file crawler to pick up symlink files as well.

e2e/__tests__/crawlSymlinks.test.ts Outdated Show resolved Hide resolved
e2e/__tests__/crawlSymlinks.test.ts Outdated Show resolved Hide resolved
@cpojer
Copy link
Member

cpojer commented Jan 3, 2020

Thank you so much for adding this feature. Since you are using Bazel, is this work related to Google's adoption of Jest or is this separate?

The only concern I have is adding features to the node crawler that the watchman one does not support. If somebody has watchmen installed but falls back to node for some reason, Jest should be able to reconcile between the two. If we add this feature, different amounts of tests may be found depending on the current state of the machine or other environment factors.

@mrmeku
Copy link
Contributor Author

mrmeku commented Jan 13, 2020

@cpojer Sorry for the late reply, I went on a vacation last week :)

This isn't directly related to Google's adoption of a new test runner (I don't work there anymore so I'm a bit out of the loop). I got started on this project as a means of making the Angular CLI run under bazel more or less without any special customization. The Angular CLI is capable of running jest when using the @nrwl/jest package that my company I work for maintains. And in addition to that, I just like jest and want it to be the defacto test runner for running JS tests under bazel.

@cpojer One question. I'm willing to augment watchmen to have the same behavior if need be (it doesn't affect the bazel use case, but if its a requirement for getting the PR merged I could be persuaded to make a PR). So my question to you is, is that a hard requirement for getting this logic merged? Could it be made to be follow up work after merging this PR?

@cpojer
Copy link
Member

cpojer commented Jan 14, 2020

@mrmeku thanks for the reply and hope you enjoyed your vacation! I'd much prefer if we could merge watchman support for this given the concerns I outlined above that will lead to inconsistency and a poor experience with Jest when the underlying system is flaky. I am happy if you add that in a separate PR and would prefer holding off on merging this one until most of the watchman support is available or once we at least get the biggest blockers out of the way – or otherwise you make a big promise that you will do that. Otherwise we may need to disable or unland this change to not cause issues in Jest.

@SimenB
Copy link
Member

SimenB commented Feb 4, 2020

@mrmeku were you able to look into the watchman implementation? This'll also conflict with #9514, but that shouldn't really affect your implementation

@mrmeku
Copy link
Contributor Author

mrmeku commented Feb 5, 2020

@SimenB I'm taking a look into it tonight. Life got a little hectic so I had to delay things for a little while

@mrmeku
Copy link
Contributor Author

mrmeku commented Feb 5, 2020

@SimenB @cpojer It looks like watchman's symlink's support is off by default, but it does support a flag to enable crawling symlinks: https://github.com/facebook/watchman/blob/b3895be21e4502fc1fcaae75b146a703bc134fce/root/stat.cpp#L232

That being the case, I'm not exactly sure what the best change to make is. So far as I can tell, we don't have a way to override the settings within a user's watchmanrc (nor do I think we would want to). With all that context in mind, here are a couple of options that I've though of:

  1. Print a warning if a symlink was followed and the project's watchmanrc file does not have "watch_symlinks": true
  2. Make following symlinks opt-in via a flag to mirror the behavior of watchman

I think I'm partial to #2 since its the least complex option.

@SimenB
Copy link
Member

SimenB commented Feb 13, 2020

I'm down with adding a symlink option. @cpojer @jeysal @thymikee thoughts?

Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this is finally good to go. To be safe I'll wait to land it until we start landing breaking changes (mostly because of the find changes, I've broken that before without knowing). That will start landing this week though, so not a long wait 🙂

Thanks for sticking with this, it's taken a while with much back and forth!

(I landed parts of the docs changes in 038d8be, btw)

@mrmeku
Copy link
Contributor Author

mrmeku commented Nov 2, 2020

Thanks for all your patience :)

@alexeagle
Copy link

Ping @SimenB could you find time to get this in? The community of Bazel users is all patching this change in, after tripping over the symlink problem. It's becoming a serious barrier to adoption since Jest is so popular :)

dae added a commit to ankitects/anki that referenced this pull request Mar 28, 2021
@hgiesel the reason no files were being found is because Jest ignores
symlinks by default. The Bazel example includes a patch we can use
to work around it, and Jest plan to add symlink support in a future
update.

https://github.com/bazelbuild/rules_nodejs/blob/stable/examples/jest/patches/jest-haste-map%2B24.9.0.patch

jestjs/jest#9351
@alexeagle
Copy link

sorry @SimenB I know we're all busy OSS maintainers, but this one would be a big help for the Bazel community. it trips all novice users on their way in.

@SimenB
Copy link
Member

SimenB commented Apr 1, 2021

Hey @alexeagle, sorry about the radio silence here!

We discussed this a bit, and got some pushback since it necessitates turning off watchman, which in theory leads to worse performance, especially during startup/initial crawl and watch mode re-runs. This led us down the rabbit hole of trying to benchmark the crawler implemented with node fs vs find vs watchman. And we actually found node crawler and watchman to be pretty similar, while find is way slower (at least on the macs and linuxes we tested), which is a change from when it was implemented (at the time find was about 10x faster than the crawl). But that means we should possibly not use find by default when disabling watchman since it's slower. We ended up agreeing that we should probably write an issue about this and get more people to test the hypothesis, then it just stalled since nobody had the energy to write up that post. Then we (or at least I) just forgot the whole thing 😅

swtiK9jRfE0zS


Aaaaaanyway, I didn't know you were actually using a patch to get this. I assume that means that this change works for Bazel users?

@liamjones
Copy link

@SimenB thank you for the extra info. When you were talking about node fs vs find vs watchman is that the decision that's made here?: https://github.com/facebook/jest/blob/40d04d8aaca5f6add7398345f6821af2cad24e69/packages/jest-haste-map/src/index.ts#L779

E.g. is the selection of FSEventsWatcher the find you were talking about or am I misunderstanding?

@alexeagle
Copy link

I didn't know you were actually using a patch to get this. I assume that means that this change works for Bazel users?

yes, we have the patch in all our examples (e.g. https://github.com/bazelbuild/rules_nodejs/blob/stable/examples/jest/patches/jest-haste-map%2B24.9.0.patch) and have instructed all users to install it as well.

@SimenB
Copy link
Member

SimenB commented Apr 2, 2021

#yolo

@SimenB SimenB merged commit 6d45caa into jestjs:master Apr 2, 2021
@SimenB
Copy link
Member

SimenB commented Apr 2, 2021

27.0.0-next.7 is released with this 🙂

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.