jest-runtime: atomic cache write, and check validity of data #4088
Conversation
Tests are failing, I'll fix!
Force-pushed from fc68a1d to bef5880
Force-pushed from bef5880 to f102dba
Approving, pending the issues with CI.
```diff
@@ -43,6 +44,9 @@ const configToJsonMap = new Map();
 // Cache regular expressions to test whether the file needs to be preprocessed
 const ignoreCache: WeakMap<ProjectConfig, ?RegExp> = new WeakMap();

+// To reset the cache for specific changesets (rather than package version).
+const CACHE_BREAKER = '1';
```
`CACHE_VERSION` may be a better name.
Looking into this today. Not sure why it still fails CI.
Force-pushed from 9752fe4 to 91a5f31
I don't know how it's possible, but apparently adding
```diff
@@ -29,7 +29,7 @@ it('maps code coverage against original source', () => {

   // reduce absolute paths embedded in the coverage map to just filenames
   Object.keys(coverageMap).forEach(filename => {
-    coverageMap[filename].path = path.basename(coverageMap[filename].path);
+    coverageMap[filename].data.path = path.basename(coverageMap[filename].data.path);
```
That's the shady change. It's as if we were using a completely different version of the "coverage module", whichever one it is, even though this changeset doesn't make any change related to coverage. See also how the snapshot file is pretty different.

I changed my mind: I think my code genuinely breaks something. Trying to figure it out.
Got it: it's because we return paths from inside the Jest cache itself for the source maps. When code outside tries to read these files, it naturally fails because of the hash header. To fix this we could write the hashes to separate files, but that would double the number of files, which I'm not sure is acceptable for some of our big repos. EDIT: actually it's not acceptable, because when reading source maps we should also check the hash.
What's the action plan there?
Ideally, get rid of the code that returns a path, and return a memory-based hash table instead. But this would force us to publish a new major release of

To avoid a breaking change, we can just write the source map to a temporary file, but this would clutter the temporary directory more and more, and it never gets cleaned up.
I'm fine with the breaking change. Since the transformer in jest-runtime includes the version number, it shouldn't really matter from any one version to the next what the transform cache looks like. Do you think you could work on this tomorrow so we can tag a release and unblock shipping a bunch of fixes? :)
Yes, I'll ship that tomorrow. I'll reduce this PR's scope so that it only has the atomic cache write, which is urgent to ship soon, and I'll do the checksums and the source map fixes in another PR.
Force-pushed from 91a5f31 to c5f9768
I figured out a way to do it without a breaking change. The source maps will not be covered by the checksum verification; to cover them we'd have to change the model all across the stack, I think, and that would take more research and time. The transformed code itself is the most important thing, and that is what I hope this changeset gets right. I'll land once CircleCI shows green.
Force-pushed from c5f9768 to edde90d
Codecov Report

```
@@            Coverage Diff             @@
##           master    #4088      +/-   ##
==========================================
+ Coverage   60.32%   60.37%   +0.04%
==========================================
  Files         195      195
  Lines        6757     6768      +11
  Branches        6        6
==========================================
+ Hits         4076     4086      +10
- Misses       2678     2679       +1
  Partials        3        3
```

Continue to review full report at Codecov.
@jeanlauliac thank you for the great work! Any chance to make a hotfix patch release
@screendriver if I'm not mistaken, it should already be available under
@screendriver did you try to upgrade to
Unfortunately I could not test it yet, because using unstable dependencies in our product is frowned upon. I'll try to test it in a separate branch and give you feedback.
Just a little update: I couldn't test it because
@cpojer Is there any ETA on when a release will be cut that includes this fix?
You can use
Thank you so much for this fix! We have a project that was manifesting this issue extremely often on CodeShip under Docker. It was behaving as if input source code from ❤
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This change tries to address what may be a cause of #1874, where I posted some details on the approach. By doing atomic writes we ensure that no cache file ends up being a mix of two transformed code files, and we limit the concurrency issues of having a file read and written at the same time. A prior PR #3561 tried to address the problem using a lock, but locks bring additional complexity that this change tries to avoid (ex. deadlocks).
This change also adds a checksum, because other processes may still be writing non-atomically to the cache files. Additionally, not all filesystems may support atomic renames (which `atomic-write-file` relies on). This change incurs a slight performance cost: the additional I/O call to rename the files. However, I believe it is less costly than a lock-based solution, and it shouldn't have much effect even when processing several thousand files.
Test plan
It is quite challenging to test for concurrency issues, so this change relies on the knowledge that `writeFileSync` is not atomic and that corruption could have the observed effects. I rely on the existing automated tests to ensure that the caching behaviour, and `jest-runtime` in general, works as expected.