skip-gitignore: use allow list, not deny list #1900
Struggling with the same problem you described in #1895 (~90s run time an hour ago for ~10k lines of code across 107 files), thus this post: This reduces runtime a bit, but doesn't fully solve it for me. On my codebase, with Using It looks to me as if the following execution flow doesn't ignore a .gitignored directory:
    # git_ls_files are good files you should parse. If you're not in the allow list, skip.
    if (
        git_folder
        and not file_path.is_dir()
        and str(file_path.resolve()) not in self.git_ls_files[git_folder]
    ):
        return True
    return False

so
Yup @he3lixxx, you're right that this PR does not completely fix the problem. I wrote this fix as a performance improvement that specifically targeted my use case: Before my PR, the runtime was O(a + b). With this PR it's O(a + c + d). That's a lot faster for me! But in an ideal world, it would only be O(c). And if you're running it on the whole repo, as you are, it's not much faster at all.
You're right. The reason I wrote it that way is that git does not have an easy way to get a list of ignored / tracked directories, only tracked files. A more correct way would be:
That would probably work & have good performance. It would skip ignored directories, which is the main issue. The most correct solution would be to remove

@he3lixxx does that make sense? I could update this PR to include that. But I kind of want to get the maintainers to take a look at it & be ok with the general idea before trying to optimize (& obfuscate) it further.
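To make the directory issue from this exchange concrete, here is a minimal sketch. `skip_per_snippet` mirrors the quoted condition (minus the `git_folder` lookup), and `skip_with_directory_check` is a hypothetical variant that also skips directories containing no tracked file. Both function names and the prefix test are assumptions for illustration, not necessarily what the comment above had in mind and not isort's actual code.

```python
import os
from pathlib import Path
from typing import Set


def skip_per_snippet(file_path: Path, tracked: Set[str]) -> bool:
    """Mirror of the quoted check: directories are never skipped."""
    return not file_path.is_dir() and str(file_path.resolve()) not in tracked


def skip_with_directory_check(file_path: Path, tracked: Set[str]) -> bool:
    """Hypothetical variant: also skip directories with no tracked file inside."""
    resolved = file_path.resolve()
    if file_path.is_dir():
        # A directory is worth walking only if some tracked path lives under it.
        prefix = str(resolved) + os.sep
        return not any(name.startswith(prefix) for name in tracked)
    return str(resolved) not in tracked
```

With the first function, an ignored directory such as `node_modules` is still descended into and each file inside it is rejected one by one. The variant rejects the directory itself, at the cost of a linear scan over the tracked set per directory; a precomputed set of tracked parent directories would likely be cheaper.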
Improve the performance of --skip-gitignore by enumerating all tracked files instead of all ignored files.

Before:
- os.walk for every target directory
- os.walk to walk over all tracked and ignored files
- git check-ignore on all those files to narrow it to only ignored files

After:
- os.walk for every target directory
- git ls-files to get a list of all tracked files

Closes #1895.
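The allow-list idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `parse_ls_files` and `tracked_files` are hypothetical helper names, and the sketch assumes `git` is on the PATH and that `root` is inside a work tree.

```python
import subprocess
from pathlib import Path
from typing import Set


def parse_ls_files(output: bytes, root: Path) -> Set[str]:
    """Turn NUL-separated `git ls-files -z` output into absolute path strings."""
    names = [name for name in output.split(b"\0") if name]
    return {str(root / name.decode("utf-8", "surrogateescape")) for name in names}


def tracked_files(root: Path) -> Set[str]:
    """Ask git once for every tracked file under `root` (hypothetical helper)."""
    result = subprocess.run(
        # -C runs git in `root`; -z separates paths with NUL and disables quoting.
        ["git", "-C", str(root), "ls-files", "-z"],
        capture_output=True,
        check=True,
    )
    return parse_ls_files(result.stdout, root.resolve())
```

Once the set is built, deciding whether a file is tracked is a single set lookup on its resolved path, so git is invoked once per repository rather than once per candidate file.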