Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: relative cache support #114

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

cs6cs6
Copy link

@cs6cs6 cs6cs6 commented Oct 5, 2023

…velopers or into ci machines.

Summary

Related Issues

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 5, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@eslint-github-bot
Copy link

Hi @cs6cs6!, thanks for the Pull Request

The pull request title isn't properly formatted. We ask that you update the pull request title to match this format, as we use it to generate changelogs and automate releases.

  • The commit message tag wasn't recognized. Did you mean "docs", "fix", or "feat"?
  • There should be a space following the initial tag and colon, for example 'feat: Message'.
  • The first letter of the tag should be in lowercase

To Fix: You can fix this problem by clicking 'Edit' next to the pull request title at the top of this page.

Read more about contributing to ESLint here

@cs6cs6 cs6cs6 changed the title This feature is to create an eslintcache that can be shared across de… feat: relative cache support Oct 5, 2023
@cs6cs6
Copy link
Author

cs6cs6 commented Oct 9, 2023

Hello. How do I draw attention to this PR? I believe I am only awaiting feedback.

@nzakas nzakas added the Initial Commenting This RFC is in the initial feedback stage label Oct 10, 2023
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together and sorry for the delay -- we are a small team and have a lot to do.

Overall I think this looks good. I left some feedback throughout.

designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
@cs6cs6 cs6cs6 force-pushed the feat/16493_relative_cache branch from 14475ae to 1bcafba Compare October 11, 2023 19:16
@cs6cs6 cs6cs6 force-pushed the feat/16493_relative_cache branch from a05d10b to 0b7542c Compare October 18, 2023 13:32
@cs6cs6 cs6cs6 requested a review from nzakas October 19, 2023 19:34
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good. Left comments throughout on areas we can clarify and clean up.

designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved

I originally planned the shareable-cache flag so that users of eslint could not be surprised by changed cache behavior, and would also not have to regenerate their cache
for a patch version change. But I discovered by testing that there is no way to NOT force people to regenerate their eslint cache with this change. That's because by adding
the 'shareable-cache' flag, it adds a property to the ConfigArray object. Even if it's set to false (the old behavior), the object's structure changes with a new property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this changing the ConfigArray? This setting is at the CLI level, not at the config level, so there shouldn't be any changes to the ConfigArray.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object itself, along with all of its properties (set or not) is rolled into the computed hash code that goes into the file. So, simply by adding a new property, the hash computed of the object changes, even if that property is deliberately set to create no change in behavior.

designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
designs/2023-relative-cache/README.md Outdated Show resolved Hide resolved
cs6cs6 and others added 4 commits October 25, 2023 12:28
Accepting code review suggestion where backticks are added for readability.

Co-authored-by: Nicholas C. Zakas <[email protected]>
Per code review, adding backticks for readability.

Co-authored-by: Nicholas C. Zakas <[email protected]>
- `lib/cli.js`: Add the `shareableCache` option, defaulted to false, in the `translateOptions `function.

### Changing cache file serialization
- `lib/cli-engine/lint-result-cache.js`: Add the properties `cwd` and `shareableCache` to the `LintResultCache` class. `cwd` is a string, the current working directory. `shareableCache` is a boolean, the result of the user's passed in command line parameter. Use these values in two main places:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should clarify what we mean by the current working directory that will be passed to the LintResultCache class. Is it process.cwd(), or cwd that was passed to ESLint/FlatESLint constructor? In the latter case, is there a way to pass cwd to file-entry-cache or we could end up calculating paths differently?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question is does file-entry-cache support relative paths? All tests seem to be with absolute paths:

https://github.com/jaredwray/file-entry-cache/blob/master/test/specs/cache.js

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will. I am just going to push up the code review sometime today. I have a complete and working branch on this task, and most of your questions wuold be resolved by the code review/better answered looking at the code changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to answer the question regarding what cwd is from the previous comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it process.cwd(), or cwd that was passed to ESLint/FlatESLint constructor?

When created in cli-engine.js, it's options.cwd. It is also referred to as 'cwd' in the code.
When created in flat-eslint.js, it's processedOptions.cwd.

Copy link
Member

@fasttime fasttime Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that cwd would be a better base location to store relative paths?

Do you mean in all cases? If so, the answer is no. I think it makes the most sense to have it relative to the cache file when --cache-location is not used.

I'm asking because when --cache-location is not set, the cache file location should be already cwd. I think the advantage of keeping the paths relative to the cache file directory is that they don't change if the current directory changes, for example in a monorepo. Do you have a particular scenario in mind where storing paths relative to cwd would be advantageous?

Copy link
Member

@mdjermanovic mdjermanovic Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as @fasttime noticed, when --cache-location is not used, the cache file defaults to .eslintcache in cwd. So if the logic would be to use the location of the cache file when --cache-location is not used, and cwd when --cache-location is used, then it's basically the same as - always use cwd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a use case where a monorepo has a single eslint cache file that's checked-in and shared between developers, and developers work and run eslint in project subdirectories (using --cache-location with a path to the single cache file somewhere in the root), I believe the only solution would be that the paths are relative to the cache file directory.

For other use cases, e.g., when the cache file is generated in CI and shared between CI runs (I believe that was the original use case in eslint/eslint#16493?), I'm not sure, it depends on where and how the cache file will be stored and used. It might be good to get more input from people requesting this feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just make a decision and see what happens when people test with it. This PR has been open since October of last year, and it seems like this is the last thing to decide on, so I don't think there's value in drawing this out further when we have a binary decision to make.

So I'd like to propose that we say the paths are always relative to location of the .eslintcache file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'd like to propose that we say the paths are always relative to location of the .eslintcache file.

I agree.

@nzakas
Copy link
Member

nzakas commented Dec 8, 2023

@cs6cs6 there were still some outstanding requests for you to consider here. Are you still working on this?

@nzakas
Copy link
Member

nzakas commented Jan 9, 2024

@cs6cs6 just checking back to see if you intend on finishing this up?

@cs6cs6
Copy link
Author

cs6cs6 commented Jan 12, 2024

Hello, I apologize for the delay. I am working on this today.

@jaredwray
Copy link

@mdjermanovic I think you were the last one to look at file-entry-cache, do you recall if your concerns were addressed?

@mdjermanovic just let me know anything you want changed and I can do it.

@mdjermanovic
Copy link
Member

@jaredwray thanks for the support! I believe we'll need some changes or new features in file-entry-cache to support this use case. I'll take another look at the latest version of file-entry-cache and come up with an initial proposal in a day or two.

```js
function hashOfConfigFor(config, cwd) {
if (!configHashCache.has(config)) {
// replace the full directory path in the config string to make the hash location-neutral
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this feature will only be implemented for the new config system, I believe there is no need for this because there are no absolute paths in configs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A place where we do have absolute paths is LintResult#filePath, so that's something we'll need to convert from absolute to relative and vice versa when cache entries are stored and retrieved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A place where we do have absolute paths is LintResult#filePath, so that's something we'll need to convert from absolute to relative and vice versa when cache entries are stored and retrieved.

Or, we could omit LintResult#filePath property from cache entries and add it back when cache entries are retrieved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reminder that the question about how LintResult#filePath will be handled should be addressed in this RFC.

@mdjermanovic
Copy link
Member

I see that the latest file-entry-cache v9 resolves relative paths passed to getFileDescriptor() as relative to process.cwd() and stores file keys as absolute paths in the cache file, as if the absolute paths were passed to getFileDescriptor().

What does everyone think about the following solution to support relative paths?

We're currently initializing cache this way:

this.fileEntryCache = fileEntryCache.create(
    cacheFileLocation,
    void 0,
    useChecksum
);

We could pass another argument, relative (boolean):

this.fileEntryCache = fileEntryCache.create(
    cacheFileLocation,
    void 0,
    useChecksum,
    relative
);

When relative is true, file-entry-cache operates in "relative" mode:

  • File keys in the cache file are stored as relative paths (e.g., ./src/app.js or ../src/app.js), relative to the location of the cache file.
  • getFileDescriptor() would still be getting absolute paths, but when searching for cache entries, it would convert them to relative paths.

@mdjermanovic
Copy link
Member

Here a stackblitz demonstration of what we'd like to achieve:

https://stackblitz.com/edit/stackblitz-starters-axsbxs?file=package.json

@nzakas
Copy link
Member

nzakas commented Jun 10, 2024

I see that the latest file-entry-cache v9 resolves relative paths passed to getFileDescriptor() as relative to process.cwd() and stores file keys as absolute paths in the cache file, as if the absolute paths were passed to getFileDescriptor().

I think hardcoding of process.cwd() is problematic as API users might set cwd to something else. Would it be possible to pass in the cwd?

@jaredwray
Copy link

I see that the latest file-entry-cache v9 resolves relative paths passed to getFileDescriptor() as relative to process.cwd() and stores file keys as absolute paths in the cache file, as if the absolute paths were passed to getFileDescriptor().

I think hardcoding of process.cwd() is problematic as API users might set cwd to something else. Would it be possible to pass in the cwd?

Yep, will get this added in this week!

@jaredwray
Copy link

Here a stackblitz demonstration of what we'd like to achieve:

https://stackblitz.com/edit/stackblitz-starters-axsbxs?file=package.json

awesome. Will work on this now and should have it released this week

@jaredwray
Copy link

Here a stackblitz demonstration of what we'd like to achieve:
https://stackblitz.com/edit/stackblitz-starters-axsbxs?file=package.json

awesome. Will work on this now and should have it released this week

I did a pull request as I think we will want to just set the relative path at the constructor instead of passing in a boolean as we can check if it is an absolute path or not.

https://github.com/jaredwray/file-entry-cache/pull/52

Please let me know if this does the fix you want.

@mdjermanovic
Copy link
Member

Here a stackblitz demonstration of what we'd like to achieve:
https://stackblitz.com/edit/stackblitz-starters-axsbxs?file=package.json

awesome. Will work on this now and should have it released this week

I did a pull request as I think we will want to just set the relative path at the constructor instead of passing in a boolean as we can check if it is an absolute path or not.

jaredwray/file-entry-cache#52

Please let me know if this does the fix you want.

This doesn't seem to solve the problem, as the cache keys are still absolute paths. So when everything is moved to another location, the cache file becomes unusable because it contains entries for absolute paths that no longer exist, and the files that were cached now have different absolute paths.

@mdjermanovic
Copy link
Member

@jaredwray just checking if you need more clarification on the use case we're trying to support.

@nzakas
Copy link
Member

nzakas commented Aug 7, 2024

@jaredwray just checking in to see if @mdjermanovic's comments make sense?

@jaredwray
Copy link

@jaredwray just checking in to see if @mdjermanovic's comments make sense?

it does make sense and I am working on a new version that does not use absolute paths. Should be ready by the 19th.

@nzakas
Copy link
Member

nzakas commented Aug 8, 2024

@jaredwray awesome. Thanks so much. We'll check back then.

@jaredwray
Copy link

@nzakas and @mdjermanovic - I released v9.1.0 of file-entry-cache it does what you are thinking but with one minor change. You need to specify the currentWorkingDir instead of just setting relativePath = true. The reason for this is that we are now using that path to handle the file keys correctly. You can see it working here with the example @mdjermanovic put together. https://stackblitz.com/edit/stackblitz-starters-npqlt7?file=package.json,test.js

const path = require('node:path');
const fileEntryCache = require('file-entry-cache');

const cacheFileLocation = path.resolve(__dirname, '.eslintcache');
const useChecksum = true;
const currentWorkingDirectory = __dirname;

const cache = fileEntryCache.create(
  cacheFileLocation,
  undefined,
  useChecksum,
  currentWorkingDirectory
);

Thanks for the patience on this and I believe this should un-block you with this where you want to rename the working folder and still use the cache.

@mdjermanovic
Copy link
Member

@jaredwray thanks for the new versions!

Question: how to store file locations relative to the cache file?

I tried the following:

const fileEntryCache = require("file-entry-cache");
const path = require("node:path");

const cacheFileDir = path.resolve(__dirname, "cache");

const cacheFileLocation = path.resolve(cacheFileDir, ".eslintcache");
const useChecksum = true;
const currentWorkingDirectory = cacheFileDir;

const cache = fileEntryCache.create(
    cacheFileLocation,
    undefined,
    useChecksum,
    currentWorkingDirectory
);

const filePath = path.resolve(__dirname, "src", "my-file.js"); // exists
const fileDescriptor = cache.getFileDescriptor(filePath);
fileDescriptor.meta.results = { foo: "bar" };

cache.reconcile();

cache/.eslintcache cache file is successfully created, but it seems empty:

[{}]

@jaredwray
Copy link

@jaredwray thanks for the new versions!

Question: how to store file locations relative to the cache file?

I tried the following:

const fileEntryCache = require("file-entry-cache");
const path = require("node:path");

const cacheFileDir = path.resolve(__dirname, "cache");

const cacheFileLocation = path.resolve(cacheFileDir, ".eslintcache");
const useChecksum = true;
const currentWorkingDirectory = cacheFileDir;

const cache = fileEntryCache.create(
    cacheFileLocation,
    undefined,
    useChecksum,
    currentWorkingDirectory
);

const filePath = path.resolve(__dirname, "src", "my-file.js"); // exists
const fileDescriptor = cache.getFileDescriptor(filePath);
fileDescriptor.meta.results = { foo: "bar" };

cache.reconcile();

cache/.eslintcache cache file is successfully created, but it seems empty:

[{}]

Do you have a project that I can see with this? I believe the paths are causing issues on this and one thing to look at is that with using the currentWorkingDirectory you need to do relative path to the files. An example of this is the unit test here:

https://github.com/jaredwray/file-entry-cache/blob/0be8ea67cd55cafa277ac199a62c3f087396d517/test/relative.js#L71

@mdjermanovic
Copy link
Member

I believe the paths are causing issues on this and one thing to look at is that with using the currentWorkingDirectory you need to do relative path to the files. An example of this is the unit test here:

https://github.com/jaredwray/file-entry-cache/blob/0be8ea67cd55cafa277ac199a62c3f087396d517/test/relative.js#L71

In the unit test, all paths are absolute, as they're produced by path.resolve()?

@mdjermanovic
Copy link
Member

@jaredwray
Copy link

Here a stackblitz repro:

https://stackblitz.com/edit/stackblitz-starters-eh4ubs?file=test.js

The fix for this is to point at the current working directory of where the file is instead of the cache directory:

https://stackblitz.com/edit/stackblitz-starters-ikrcmu?file=test.js,cache%2F.eslintcache

const currentWorkingDirectory = path.resolve(__dirname, 'src');

The currentWorkingDirectory needs to point where the files are that we have the description on as it is using relative paths.

@jaredwray
Copy link

jaredwray commented Oct 4, 2024

@mdjermanovic, @nzakas - I wanted to update you that we have released a new major version of file-entry-cache. This has some significant updates but should work better for you. I have been looking through your code and it looks like the big thing you need to handle is when a folder gets renamed you want to track files in it without resetting them.

Read the new docs here: https://www.npmjs.com/package/file-entry-cache

I would recommend that you continue to do absolute paths like you have done before. We do support relative paths but I think the goal here is to make it as easy as possible.

When a folder changes you can now call renameAbsolutePathKeys(oldPath: string, newPath: string): void which will do the rename in the memory cache and then just call reconcile() which will save the updated to cache.

const fileEntryCache = require('file-entry-cache');
const cache = fileEntryCache.create('.eslintcache', undefined, true);
const fileEntry = cache.getFileDescriptor(path.resolve(__dirname, './src/test-file.js');
console.log(fileEntry.changed); // true, first time run
cache.reconcile(); // save to the file cache
// rename the `src` folder as an example
const oldPath = path.resolve(__dirname, './src');
const newPath = path.resolve(__dirname, './src2');
fs.renameSync(oldPath, newPath);
// update all the keys
cache.renameAbsolutePathKeys(oldPath, newPath);
// they are now updated in memory, update the disk cache
cache.reconcile();

There are more updates around this with a faster caching layer and more robust features that you can read about at the README: https://github.com/jaredwray/cacheable/blob/main/packages/file-entry-cache/README.md

Please let me know how I can help to make this happen. If you want to do a video conference just let me know. https://fantastical.app/jaredwray/30-minute-sync

@mdjermanovic
Copy link
Member

When a folder changes you can now call renameAbsolutePathKeys(oldPath: string, newPath: string): void which will do the rename in the memory cache and then just call reconcile() which will save the updated to cache.

A problem with this is that we don't know oldPath. The cache file was created/updated in a previous run, possibly on a different machine, and new run doesn't have the context of the previous run. Also, in a use case where the cache file is checked into source control and shared between developers, there could be multiple old paths.

@mdjermanovic
Copy link
Member

To clarify, the example with renaming the directory was made to simplify the repro. The use case this RFC targets is when the cache file is created/updated on one machine, and then should be used on another machine where the project is possibly cloned into a different path so the absolute paths of files are different.

https://github.com/cs6cs6/eslint_rfc/blob/feat/16493_relative_cache/designs/2023-relative-cache/README.md

We think the only way to achieve this is by storing relative instead of absolute paths in the cache file.

@mdjermanovic
Copy link
Member

The fix for this is to point at the current working directory of where the file is instead of the cache directory:

https://stackblitz.com/edit/stackblitz-starters-ikrcmu?file=test.js,cache%2F.eslintcache

const currentWorkingDirectory = path.resolve(__dirname, 'src');

The currentWorkingDirectory needs to point where the files are that we have the description on as it is using relative paths.

In this example, the cache key for the file is "/my-file.js". But, if we do this for every file, then all files with the same filename would have the same cache key? For example, src/my-file.js and test/my-file.js would both have "/my-file.js" as the cache key and therefore have the same cache entry?

@nzakas
Copy link
Member

nzakas commented Oct 15, 2024

@mdjermanovic @jaredwray at this point it seems better to open an issue to contain this back-and-forth instead of continuing on in this RFC. What do you think?

@jaredwray
Copy link

@mdjermanovic @jaredwray at this point it seems better to open an issue to contain this back-and-forth instead of continuing on in this RFC. What do you think?

Who can I work with on this to get it completed? More than happy to schedule a time as with the new file-entry-cache this should be really easy and less work.

@nzakas
Copy link
Member

nzakas commented Oct 15, 2024

I think @mdjermanovic is the best person to work with.

@mdjermanovic
Copy link
Member

Here's an issue on the file-entry-cache repo for further discussion: jaredwray/cacheable#914

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Initial Commenting This RFC is in the initial feedback stage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants