Ctrl-T command (__fzf_select__) never finishes, consumes all RAM, by default on macos #2705

Closed
5 of 9 tasks
hraftery opened this issue Jan 5, 2022 · 13 comments

Comments

@hraftery

hraftery commented Jan 5, 2022

  • I have read through the manual page (man fzf)
  • I have the latest version of fzf
  • I have searched through the existing issues

Info

  • OS
    • Linux
    • Mac OS X
    • Windows
  • Shell
    • bash
    • zsh
    • fish

Problem / Steps to reproduce

  1. Fairly standard setup running macOS 11.6.2, except default shell replaced with bash 5.1.8.
  2. Install with brew install fzf
  3. Then run /usr/local/opt/fzf/install to install key bindings and shell integrations (ref).
  4. Then hit Ctrl-T.

While the command can be used right away, searching and selecting the files found, the number of entries grows into the tens of millions and continues endlessly. If left for 10 minutes or so, fzf consumes 16GB of RAM, and attempts to continue. With only 16GB of RAM on this machine, the computer becomes increasingly unresponsive and I eventually aborted the command. No lingering effects, but the behaviour can be repeated.

Expected Behaviour

Since it's not clear which files are found first, if I'm looking for a particular file I would expect to have to wait until the search is complete to be sure. It turns out the search does not complete before bringing the system down. I expected the search to complete and the full list of files to be searchable.

References

I wasn't able to find many relevant reports, but the search terms are tricky to get right, so I'm littering this issue with key terms to provide a trail for posterity. This is the most relevant issue I could find:

Find process got stuck in the background and led to high CPU usage

@hraftery
Author

hraftery commented Jan 5, 2022

After some exploration, I traced the issue to shell/key-bindings.bash, which binds Ctrl-T to fzf-file-widget(), which runs __fzf_select__(). That function ultimately runs this command:

find -L . -mindepth 1 \( -path '*/\.*' -o -fstype 'sysfs' -o -fstype 'devfs' -o -fstype 'devtmpfs' -o -fstype 'proc' \) -prune -o -type f -print -o -type d -print -o -type l -print 2> /dev/null

If I run that command and redirect to a file, the command runs indefinitely as the file grows to many gigabytes. So it seems like the likely culprit.

Skimming the file I see it often contains entries like:

./Library/Containers/<something>/Data/...

Where the paths after Data are large sections of my home directory. Given that includes the iTunes and Photos libraries, there are lots and lots of these entries.

It turns out that this is related to the macOS sandboxing security feature that has brought so much effort and drama to the OS recently. Applications that want access to local files must request permission from the user. If granted, it seems the application still doesn't access those files directly, but via symlinks placed into their Data folder. Hence, you end up with lots and lots of copies of your home directory via symlinks within Containers.
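The blow-up can be reproduced on a small scale. What follows is a minimal sketch, not something taken from the thread: it builds a throwaway tree under a temporary directory (the names `home`, `docs` and `Containers/app/Data` are made up for illustration) with a single symlink pointing at a sibling directory, then counts entries with and without `-L`.

```shell
#!/usr/bin/env bash
# Build a tiny stand-in for a home directory with one sandbox-style symlink.
# All directory and file names here are hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/home/docs" "$tmp/home/Containers/app/Data"
touch "$tmp/home/docs/a.txt" "$tmp/home/docs/b.txt"

# The "container" gets a symlink that points at the real docs directory.
ln -s "$tmp/home/docs" "$tmp/home/Containers/app/Data/Documents"

# Count entries without and with symlink following.
# $(( ... )) strips the padding that some wc implementations add.
plain=$(( $(cd "$tmp/home" && find . -mindepth 1 | wc -l) ))
followed=$(( $(cd "$tmp/home" && find -L . -mindepth 1 | wc -l) ))

echo "without -L: $plain entries"
echo "with -L:    $followed entries"

rm -rf "$tmp"
```

With `-L`, everything under `docs` is listed a second time through the symlink. Each additional container linking to the same directories would presumably add another full copy, which is consistent with the growth described above.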

To resolve this, I considered adding another pattern to the -prune clause, but it's very macOS specific, and tied to internal details that may change.

Instead I tried just removing the -L to prevent find descending into symlinks. I believe this also prevents metadata (eg. original filename, modification date, etc.) of the linked file being retrieved too, but I can't imagine that being a big loss in this case.

Now when I run the command, I get a list of 1.5 million files in about a minute, and no excessive RAM usage.

The choice to include -L seems deliberate, but in my case I can't see much advantage - any relevant file ought to be able to be found without having to navigate symlinks, and I expect that files outside the home directory are excluded by default anyway.

In conclusion, I think removing -L both fixes this issue, and is a net benefit to fzf, but as a complete noob am not qualified to speak on behalf of others. If others agree, I'm happy to provide a PR.

hraftery added a commit to hraftery/fzf that referenced this issue Jan 5, 2022
@junegunn
Owner

junegunn commented Jan 6, 2022

Thanks for the report, that's unfortunate.

Both macOS find and GNU find can detect filesystem loops, and the latter prints diagnostic messages like so:

brew install findutils

gfind -L ~/Library/Containers > /dev/null
  # gfind: File system loop detected; ‘/Users/jg/Library/Containers/com.apple.CloudPhotosConfiguration/...’.
  # gfind: File system loop detected; ‘/Users/jg/Library/Containers/com.apple.ScreenSaver.Engine.legacyScreenSaver/...’.
  # ...

So, they will eventually stop at some point.

find ~/Library/Containers | wc -l
  # 89303
find -L ~/Library/Containers | wc -l
  # 3809469
gfind -L ~/Library/Containers | wc -l
  # 3809072

But still, this is extremely wasteful.

Removing -L makes a lot of sense in this particular case, but I'm not sure if it's generally desirable in everyday use cases. In my case, I rarely hit CTRL-T in the home directory, instead, I use it on an individual project root where such weird filesystem loops are not present. But that's just me and I can only speak for myself. Also, we've been using -L for 7 years by now (53d5d9d), so removing it and breaking backward compatibility is not an easy decision to make.

I'll leave this issue open. Let's hear what other users think. For the time being, you might want to set up FZF_CTRL_T_COMMAND to override the default command.
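As a rough sketch of that override, assuming the goal is simply the default find command quoted earlier in this thread with `-L` dropped, something like the following could go in `~/.bashrc`. Everything except the removal of `-L` is copied from that command; whether the `-fstype` clauses are needed on a given system is not something this sketch verifies.

```shell
# The default find command quoted earlier in this thread, with -L removed,
# so symlinks are listed (-type l) but not descended into.
# The doubled backslashes become single ones inside the double quotes.
export FZF_CTRL_T_COMMAND="find . -mindepth 1 \\( -path '*/\\.*' -o -fstype 'sysfs' -o -fstype 'devfs' -o -fstype 'devtmpfs' -o -fstype 'proc' \\) -prune -o -type f -print -o -type d -print -o -type l -print 2> /dev/null"
```

The key binding evaluates this string in the current directory, so it can be checked outside fzf with `eval "$FZF_CTRL_T_COMMAND" | wc -l`.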

@timhillgit

Let's hear what other users think.

Hopefully late feedback is better than none 😅 I've also been bitten by this. Typical uses for using fzf from the home directory for me are using Alt-C to quickly jump to a subfolder or wanting to search all my repositories at once. Removing -L from all my default commands (or using something like rg --files) has solved this for me and hasn't prevented me from finding any files I was looking for.

That said, nine years is a lot of precedent to overturn. I like @hraftery's idea of excluding these files based on macOS properties. Specifically, the Data directories that contain the problematic symlinks all have the com.apple.macl extended attribute. The container directories themselves have the com.apple.containermanager.uuid extended attribute. So when [[ "$(uname -s)" == "Darwin" ]] FZF_CTRL_T_COMMAND could be something like:

find -L -x * \( -name '.*' -o -xattrname com.apple.containermanager.uuid \) -prune \
 -o -type f -print 2> /dev/null

Happy to write up a PR if there's interest.
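The platform check described above might be wired up roughly as follows. This is only a sketch: `fzf_ctrl_t_command_for` is a hypothetical helper name, not part of fzf, and the find command is the one quoted above, which relies on the `-x` and `-xattrname` options of the macOS find. On any other platform the helper prints nothing, so fzf's default is left alone.

```shell
# Hypothetical helper (not part of fzf): print a CTRL-T command for the
# given kernel name, or nothing to keep fzf's default behavior.
fzf_ctrl_t_command_for() {
  case "$1" in
    Darwin)
      # The macOS-specific find command suggested above, kept on one line.
      printf '%s\n' "find -L -x * \\( -name '.*' -o -xattrname com.apple.containermanager.uuid \\) -prune -o -type f -print 2> /dev/null"
      ;;
  esac
}

# Only override the default when the helper produced a command.
cmd=$(fzf_ctrl_t_command_for "$(uname -s)")
if [[ -n $cmd ]]; then
  export FZF_CTRL_T_COMMAND=$cmd
fi
```

Taking the kernel name as an argument, rather than calling `uname` inside the helper, keeps the branch easy to check on a machine that is not running macOS.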

@timhillgit

I have some numbers. From my home directory currently:

~ % time find -L . -mindepth 1 \( -path '*/.*' -o -fstype 'sysfs' -o -fstype 'devfs' -o -fstype 'devtmpfs' -o -fstype 'proc' \) -prune -o -type f -print -o -type d -print -o -type l -print 2> /dev/null | wc -l
 10156090
find -L . -mindepth 1 \( -path '*/.*' -o -fstype 'sysfs' -o -fstype 'devfs' -  27.99s user 42.01s system 48% cpu 2:24.44 total
wc -l  1.93s user 0.34s system 1% cpu 2:24.44 total

After the change:

~ % time find -L . -mindepth 1 \( -path '*/.*' -o -fstype 'sysfs' -o -fstype 'devfs' -o -fstype 'devtmpfs' -o -fstype 'proc' -o -xattrname 'com.apple.containermanager.uuid' \) -prune -o -type f -print -o -type d -print -o -type l -print 2> /dev/null | wc -l
 3185375
find -L . -mindepth 1 \( -path '*/.*' -o -fstype 'sysfs' -o -fstype 'devfs' -  10.74s user 68.69s system 88% cpu 1:29.69 total
wc -l  0.69s user 0.16s system 0% cpu 1:29.69 total

Is that worth it? Maybe? There are still a lot of files in the macOS ~/Library/ directory but this does reduce the search time by quite a noticeable amount. I've written up my suggestion as #3647

One more note: despite my original suggestion com.apple.macl cannot be used because it is present on too many other files.

@hraftery
Author

Hmm, not as dramatic a result, which is surprising. Still, I think UX-wise, a 2.5 minute max wait is a lot more than a 1.5 minute wait.

My results (on a very different computer to my original post!) for your commands:

  • Original
    • Num files: 384,702,857
    • Time: 3:15:16.05 total (yes, that's hours)
  • With xattrname flag
    • Num files: 2,487,528
    • Time: 1:43.96 total

So far more dramatic. I don't know if that makes me special or you special. FWIW, I have 535 items in ~/Library/Containers.

@junegunn
Owner

I agree that we should strive to provide a better default, but I'd like to limit the amount of platform-specific code as much as possible and keep the code short and simple, as it also serves as a reference implementation of things you can do with fzf.

I think many users these days who are concerned about the performance are using programs like fd or ripgrep that perform scanning in parallel.

This is the one I use.

export FZF_CTRL_T_COMMAND='fd --type f --type d --hidden --follow --exclude .git --strip-cwd-prefix'

Have you tried these commands? What do you think of them?
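For anyone who wants to try either tool without committing to one, a sketch along these lines picks whichever is installed. `pick_ctrl_t_command` is a hypothetical helper name; the fd invocation is the one quoted above, and the ripgrep line is an assumption built from `rg --files`, which was mentioned earlier in the thread and lists files only, not directories.

```shell
# Hypothetical helper: print a CTRL-T command based on which scanner is
# installed, or nothing if neither is available.
pick_ctrl_t_command() {
  if command -v fd > /dev/null 2>&1; then
    # The fd invocation quoted above; drop --follow to skip symlinked dirs.
    printf '%s\n' 'fd --type f --type d --hidden --follow --exclude .git --strip-cwd-prefix'
  elif command -v rg > /dev/null 2>&1; then
    # ripgrep lists files only; symlinks are not followed without --follow.
    printf '%s\n' "rg --files --hidden --glob '!.git'"
  fi
}

cmd=$(pick_ctrl_t_command)
if [[ -n $cmd ]]; then
  export FZF_CTRL_T_COMMAND=$cmd
fi
```

If neither tool is found, `FZF_CTRL_T_COMMAND` is left untouched and the default command applies.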

@timhillgit

timhillgit commented Feb 27, 2024

I haven't used fd before but I have used ripgrep to great effect. So I'd understand not changing this behavior for macOS especially since there are good alternatives with good examples in the documentation.

One follow-up: what do you think about adding a --no-follow and perhaps a --hidden flag to fzf? Not strictly required since we can use FZF_DEFAULT_COMMAND but they would be a way to still use the new built-in fastwalk with slightly different behavior.

EDIT: Oh, I suppose one other option would be to cap the number of results returned by find by using e.g. head -n 1000000 but I'm not in love with that.
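A sketch of that cap, for completeness. The find invocation is deliberately simplified compared with the default command, and `max_entries` is a hypothetical variable name introduced here for illustration.

```shell
# Hypothetical cap on the number of entries handed to fzf.
max_entries=1000000

# head exits after max_entries lines; find then stops on its next write
# to the closed pipe instead of walking the rest of the tree.
export FZF_CTRL_T_COMMAND="find -L . -mindepth 1 -print 2> /dev/null | head -n $max_entries"
```

The drawback hinted at above is that the entries that survive the cap are simply whichever ones find reaches first, so a wanted file may be cut off.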

@junegunn
Owner

junegunn commented Feb 27, 2024

One follow-up: what do you think about adding a --no-follow and perhaps a --hidden flag to fzf?

Related: #3464 / 208e556#r138774139

I was strictly against the idea of adding options for directory traversal, but I'm reconsidering it. The question is to what extent.

  • The default behavior is to list the files. So it would make sense to add an option to list directories, and another one to list both files and directories.
  • An option to control symlink following
  • An option to follow hidden directories or not (hidden files are listed by default)

But unfortunately, we would need to add a conditional branch.

if [[ -n $FZF_CTRL_T_COMMAND ]]; then
  eval "$FZF_CTRL_T_COMMAND" | fzf ...
else
  fzf --walker-all --walker-follow ...
fi | ...

EDIT: Oh, we could do it without the branch: FZF_DEFAULT_COMMAND=$FZF_CTRL_T_COMMAND fzf --walker...
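To illustrate why the branch becomes unnecessary, here is a small sketch that uses a stand-in function instead of the real fzf. It assumes, as the EDIT above implies, that fzf treats an empty `FZF_DEFAULT_COMMAND` the same as an unset one and falls back to its built-in walker. `fzf_stand_in` is hypothetical and only reports which input source would be chosen.

```shell
# Stand-in for fzf, for illustration only: it reports which input source
# would be used instead of launching the real finder.
fzf_stand_in() {
  if [[ -n ${FZF_DEFAULT_COMMAND:-} ]]; then
    echo "custom command: $FZF_DEFAULT_COMMAND"
  else
    echo "built-in walker: $*"
  fi
}

# An empty FZF_CTRL_T_COMMAND falls through to the walker.
FZF_CTRL_T_COMMAND=''
FZF_DEFAULT_COMMAND=$FZF_CTRL_T_COMMAND fzf_stand_in --walker file,dir,follow,hidden

# A user-provided command takes over, with no if/else in the caller.
FZF_CTRL_T_COMMAND='fd --type f'
FZF_DEFAULT_COMMAND=$FZF_CTRL_T_COMMAND fzf_stand_in --walker file,dir,follow,hidden
```

The per-call assignment before the function name only applies to that one invocation, so a `FZF_DEFAULT_COMMAND` the user has already exported is not overwritten for the rest of the shell session.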

@timhillgit

@junegunn The new PR looks great. With this I don't think we'll need to worry about any macOS specific hacks, or if we do they can exist in one location rather than across three different find paths. Thank you!

@timhillgit

Now that I've tested it the new walker flags are working very well for me. I can either ignore symlinks or specifically ignore the macOS problematic directories. Using --walker-skip .git,node_modules,.Trash,Containers,'Group Containers',Caches has made things super speedy, and I bet I could find some other directories to ignore.

@hraftery
Author

hraftery commented Mar 14, 2024

I gave it a try. I ran brew update and brew upgrade fzf and got 0.48 without a hitch. I took note of the release notes about the completion/bindings changes, and so deleted my .fzf.* files and changed my .zshrc file to include eval "$(fzf --zsh)" instead.

I fired up a new shell and hit Ctrl-T. By default everything looks much the same (ie. all good and working) from the outside. I killed the process after it had some 25 million files.

I then went back to my .zshrc file and added:

export FZF_CTRL_T_OPTS="--walker file,dir,hidden"

which I gathered was the option to turn off symlinks. Fired up a new shell and voila! Ctrl-T finishes well under a minute with just over 3 million files.

So a couple of extra hoops, but a great result I think.

@hraftery
Author

Oh, I forgot this is my issue, so am happy to close it with the documented workaround above. Feel free to add more breadcrumbs/caveats/oversights.

@junegunn
Owner

junegunn commented Mar 14, 2024

@hraftery @timhillgit Glad to hear the new options are working well for you, thanks.

It would also help to add a custom --walker-skip option to your $FZF_DEFAULT_OPTS as suggested by @timhillgit as it will apply to all the key bindings and fuzzy completion, not just CTRL-T.
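A sketch of what that could look like in a shell startup file, reusing the skip list @timhillgit posted earlier. It assumes fzf splits `$FZF_DEFAULT_OPTS` with shell-like quoting rules, so that `'Group Containers'` stays a single entry; the list itself is a personal choice rather than a recommended default.

```shell
# Append the skip list to any existing FZF_DEFAULT_OPTS rather than
# overwriting it. The inner single quotes keep 'Group Containers' together.
export FZF_DEFAULT_OPTS="${FZF_DEFAULT_OPTS:+$FZF_DEFAULT_OPTS }--walker-skip .git,node_modules,.Trash,Containers,'Group Containers',Caches"
```

Because this lives in `$FZF_DEFAULT_OPTS`, it combines with a per-binding setting such as the `FZF_CTRL_T_OPTS="--walker file,dir,hidden"` line shown earlier.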
