Slow startup in monorepo #276

Open
culli opened this issue Aug 15, 2023 · 6 comments

@culli

culli commented Aug 15, 2023

I am trying out cargo watch in a monorepo which is made of node, rust, etc. and it takes a long time (~50 seconds) to start cargo watch -x test. Turning on --debug it looks like it is watching the whole repo (node_modules, etc). I've tried --skip-local-deps and -w . but it still seems to be watching much too widely. Any other options I can try?

I'm running it straight on a mac m1, no docker or anything, happens in both iterm2 and jetbrains (goland) embedded terminal.

@passcod
Member

passcod commented Aug 16, 2023

I am working on this this week/month (time permitting) actually! The immediate issue seems like a regression in ignores, but there are other, deeper issues that I'm working to eliminate.

As a workaround for now, you can try -i '**/node_modules/**'. If that doesn't work, either watch only the src directory (or whatever's useful) or, as a last resort, downgrade to 8.1.2.

@culli
Author

culli commented Aug 16, 2023

Sorry to say that didn't seem to help. I also added some other big directories like **/dist/**. Looking more at the debug output, it's still loading a lot of .gitignore files from down in node_modules, for what that's worth.

What does help for now is --no-vcs-ignores, then it starts right up. I might have to tweak the ignores a bit, but it's working!

Also possibly relevant: the main Cargo.toml has a [workspace] section with several members.

Some output (before using --no-vcs-ignores):

cargo_watch::options: 2023-08-16T09:07:21.538-06:00 - DEBUG - All ignores: ["*/.DS_Store", "*.sw?", "*.sw?x", "#*#", ".#*", ".*.kate-swp", "*/.hg/**", "*/.git/**", "*/.svn/**", "*.db", "*.db-*", "*/*.db-journal/**", "*/target/**", "**/node_modules/**", "**/dist/**"]
...
watchexec::gitignore: 2023-08-16T09:07:21.707-06:00 - DEBUG - Looking in "/Users/jimcullison/projects/m/u/s/statsd" for a .git directory
watchexec::gitignore: 2023-08-16T09:07:21.707-06:00 - DEBUG - Looking in "/Users/jimcullison/projects/m/u/s" for a .git directory
watchexec::gitignore: 2023-08-16T09:07:21.707-06:00 - DEBUG - Looking in "/Users/jimcullison/projects/m/u" for a .git directory
watchexec::gitignore: 2023-08-16T09:07:21.707-06:00 - DEBUG - Looking in "/Users/jimcullison/projects/m" for a .git directory
watchexec::gitignore: 2023-08-16T09:07:21.707-06:00 - DEBUG - Found the top level git directory: "/Users/jimcullison/projects/m"
watchexec::gitignore: 2023-08-16T09:07:22.049-06:00 - DEBUG - Loaded "/Users/jimcullison/projects/m/x/node_modules/nopt/.gitignore"

Let me know if anything else in the output might be interesting.
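
For readers following the log above: the "Looking in ... for a .git directory" lines look like an upward search from the starting directory through its parents. A minimal sketch of that kind of search, assuming only the standard library; `find_git_root` is a hypothetical name and this is not watchexec's actual code:

```rust
use std::path::{Path, PathBuf};

/// Walks from `start` up through its ancestors and returns the first
/// directory that contains a `.git` entry, if any.
/// Hypothetical helper for illustration; not taken from watchexec.
fn find_git_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors() // yields `start`, then its parent, grandparent, ...
        .find(|dir| dir.join(".git").exists())
        .map(Path::to_path_buf)
}
```

Calling `find_git_root` on a crate directory inside a monorepo would return the repository root, which is why .gitignore files anywhere under that root (including those inside node_modules) can end up being loaded.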

@passcod
Member

passcod commented Aug 16, 2023

Ahhh yep, different regression, also on the todo list but a bit further down. Glad you've got a workaround tho!

@MaxFangX

MaxFangX commented Apr 5, 2024

I'm running into the same issue. Passing --debug, it looks like cargo watch is repeatedly parsing the .gitignore files for each crate in the workspace? Considering we have over 30 crates in our workspace, cargo watch startup is taking multiple minutes.

A sample of the logs:

watchexec::gitignore: 2024-04-04T22:25:29.798-07:00 - DEBUG - Looking in "/Users/fang/lexe/dev/lexe/public/run-sgx" for a .git directory
watchexec::gitignore: 2024-04-04T22:25:29.798-07:00 - DEBUG - Looking in "/Users/fang/lexe/dev/lexe/public" for a .git directory
watchexec::gitignore: 2024-04-04T22:25:29.798-07:00 - DEBUG - Looking in "/Users/fang/lexe/dev/lexe" for a .git directory
watchexec::gitignore: 2024-04-04T22:25:29.798-07:00 - DEBUG - Found the top level git directory: "/Users/fang/lexe/dev/lexe"
globset: 2024-04-04T22:25:31.492-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/Flutter/ephemeral/**", re: "(?-u)^(?:/?|.*/)Flutter/ephemeral(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('F'), Literal('l'), Literal('u'), Literal('t'), Literal('t'), Literal('e'), Literal('r'), Literal('/'), Literal('e'), Literal('p'), Literal('h'), Literal('e'), Literal('m'), Literal('e'), Literal('r'), Literal('a'), Literal('l'), RecursiveSuffix]) }
globset: 2024-04-04T22:25:31.492-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/Pods/**", re: "(?-u)^(?:/?|.*/)Pods(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('P'), Literal('o'), Literal('d'), Literal('s'), RecursiveSuffix]) }
globset: 2024-04-04T22:25:31.492-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/dgph/**", re: "(?-u)^(?:/?|.*/)dgph(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('d'), Literal('g'), Literal('p'), Literal('h'), RecursiveSuffix]) }
globset: 2024-04-04T22:25:31.492-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/xcuserdata/**", re: "(?-u)^(?:/?|.*/)xcuserdata(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('x'), Literal('c'), Literal('u'), Literal('s'), Literal('e'), Literal('r'), Literal('d'), Literal('a'), Literal('t'), Literal('a'), RecursiveSuffix]) }

...

# Eventually gets to another crate, then repeats the whole glob -> regex process again:

globset: 2024-04-04T22:25:31.561-07:00 - DEBUG - built glob set; 0 literals, 0 basenames, 0 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 14 regexes
watchexec::gitignore: 2024-04-04T22:25:31.561-07:00 - DEBUG - Loaded "/Users/fang/lexe/dev/lexe/.gitignore"
watchexec::gitignore: 2024-04-04T22:25:31.576-07:00 - DEBUG - Looking in "/Users/fang/lexe/dev/lexe/repotools" for a .git directory
watchexec::gitignore: 2024-04-04T22:25:31.576-07:00 - DEBUG - Looking in "/Users/fang/lexe/dev/lexe" for a .git directory
watchexec::gitignore: 2024-04-04T22:25:31.576-07:00 - DEBUG - Found the top level git directory: "/Users/fang/lexe/dev/lexe"
globset: 2024-04-04T22:21:51.834-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/Flutter/ephemeral/**", re: "(?-u)^(?:/?|.*/)Flutter/ephemeral(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('F'), Literal('l'), Literal('u'), Literal('t'), Literal('t'), Literal('e'), Literal('r'), Literal('/'), Literal('e'), Literal('p'), Literal('h'), Literal('e'), Literal('m'), Literal('e'), Literal('r'), Literal('a'), Literal('l'), RecursiveSuffix]) }
globset: 2024-04-04T22:21:51.834-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/Pods/**", re: "(?-u)^(?:/?|.*/)Pods(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('P'), Literal('o'), Literal('d'), Literal('s'), RecursiveSuffix]) }
globset: 2024-04-04T22:21:51.834-07:00 - DEBUG - glob converted to regex: Glob { glob: "**/dgph/**", re: "(?-u)^(?:/?|.*/)dgph(?:/?|/.*)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('d'), Literal('g'), Literal('p'), Literal('h'), RecursiveSuffix]) }

...

The --no-vcs-ignores workaround isn't too practical for us as we have about 35 lines in our .gitignore.

Any progress on this is appreciated. 🙏
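
One way to avoid the repeated work described above would be to parse the ignore file once per repository root and reuse the result for every crate under it. The sketch below only illustrates that idea. `IgnoreCache`, `patterns_for`, and `parse_ignore_lines` are hypothetical names, the parser is deliberately simplified (it trims whitespace and skips blank and comment lines only), and none of it is taken from watchexec or cargo-watch:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Memoizes parsed ignore patterns per repository root so that many crates
/// sharing one root trigger a single load. All names here are hypothetical.
struct IgnoreCache {
    by_root: HashMap<PathBuf, Vec<String>>,
    loads: usize, // number of times the loader actually ran
}

impl IgnoreCache {
    fn new() -> Self {
        IgnoreCache { by_root: HashMap::new(), loads: 0 }
    }

    /// Returns the patterns for `root`, calling `load` only on the first request.
    fn patterns_for<F>(&mut self, root: &Path, load: F) -> &[String]
    where
        F: FnOnce(&Path) -> Vec<String>,
    {
        if !self.by_root.contains_key(root) {
            self.loads += 1;
            let patterns = load(root);
            self.by_root.insert(root.to_path_buf(), patterns);
        }
        &self.by_root[root]
    }
}

/// Simplified parser: keeps non-empty, non-comment lines of gitignore-style text.
/// Real gitignore rules (negation, escapes, trailing spaces) are not handled.
fn parse_ignore_lines(text: &str) -> Vec<String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}
```

With a cache like this, asking for the same root from 30 crates would run the loader once and leave `loads` at 1; whether the real fix takes this shape is up to the maintainers.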

@ag-mathieulj

Another thing I noticed while trying to debug this myself (before finding the workaround here): cargo-watch seems to be opening files in ignored directories even though they will never cause an execution.

 sudo fs_usage | rg cargo-watch | head -n 5000

Prints tons and tons of:

10:08:45  open              .../node_modules/...some file...    0.000020   cargo-watch
10:08:45  fstatfs64                                                                                          0.000001   cargo-watch
10:08:45  getdirentries64                                                                                    0.000011   cargo-watch
10:08:45  open              .../.git/...some file...    0.000020   cargo-watch
10:08:45  fstatfs64                                                                                          0.000001   cargo-watch
10:08:45  getdirentries64                                                                                    0.000011   cargo-watch
10:08:45  open              .../target/...some file...    0.000020   cargo-watch
10:08:45  fstatfs64                                                                                          0.000001   cargo-watch
10:08:45  getdirentries64                                                                                    0.000011   cargo-watch

Despite all of the above folders being ignored by multiple criteria.

@passcod
Member

passcod commented May 31, 2024

Yeah, the core issue here (and still present in watchexec) is that the notify library does the recursion for the actual watching, and it's not aware of ignores/filtering, so it goes way beyond where it should look.

Watchexec has a work item pending (I've been on a break/holiday, will start work on it in 2024Q3) to do the recursion in the watchexec library, this time with awareness of ignores. That should solve the remaining startup performance issues, at which point I'll get cargo-watch over too.
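
To make the idea of ignore-aware recursion concrete, here is a minimal sketch using only the standard library. It prunes ignored directories before descending, so nothing inside them is ever listed or opened. `is_ignored` and `watchable_dirs` are hypothetical names, and matching on the directory name stands in for real gitignore-style glob matching; this is an illustration, not the planned watchexec implementation:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical ignore check: true if the directory's own name is in `ignored_names`.
/// Real tools match full gitignore-style globs; a name comparison keeps the sketch small.
fn is_ignored(dir: &Path, ignored_names: &[&str]) -> bool {
    dir.file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| ignored_names.contains(&name))
}

/// Collects the directories under `root` that should be watched, pruning ignored
/// directories before descending so their contents are never listed.
fn watchable_dirs(root: &Path, ignored_names: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        found.push(dir.clone());
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            // file_type() does not follow symlinks, so symlinked directories are skipped.
            if entry.file_type()?.is_dir() && !is_ignored(&path, ignored_names) {
                pending.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}
```

For example, `watchable_dirs(Path::new("."), &["node_modules", "target", ".git"])` would return the directories to register watches on individually (non-recursively), rather than handing the whole tree to a recursive watcher that has no knowledge of the ignore list.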
