Add --blob-exec to run system commands #83

Open
wants to merge 1 commit into
base: main
from

Conversation

pauldraper

git-filter-branch docs, describing BFG:

The command options are much more restrictive than git-filter branch, and dedicated just to the tasks of removing unwanted data- e.g: --strip-blobs-bigger-than 1M.

This is true. BFG's simple and easy options are (naturally) limited to a subset of all possible data manipulations.

This change adds --blob-exec which can execute any system command or script, while still taking advantage of BFG's killer optimizations of (1) parallelism and (2) per-blob (rather than per-commit) rewrites. (Though it does concede the advantage of in-process-only operations.)
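To illustrate why per-blob rewriting pays off, here is a minimal sketch, not BFG's actual code: the same blob usually appears in many commits, so caching the result by blob ID means the (expensive) external command would run once per distinct blob, even when commits are processed in parallel. The class `BlobMemo`, its method names, and the use of strings for blob IDs and contents are all hypothetical simplifications.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; BFG's real cleaners are written in Scala.
final class BlobMemo {
    // Maps a blob ID to its rewritten content.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Counts how often the (potentially expensive) transform actually ran.
    private final AtomicInteger invocations = new AtomicInteger();
    private final UnaryOperator<String> transform;

    BlobMemo(UnaryOperator<String> transform) {
        this.transform = transform;
    }

    // Rewrites a blob; the transform runs at most once per distinct blob ID,
    // because ConcurrentHashMap.computeIfAbsent applies the function atomically.
    String rewrite(String blobId, String content) {
        return cache.computeIfAbsent(blobId, id -> {
            invocations.incrementAndGet();
            return transform.apply(content);
        });
    }

    int invocations() {
        return invocations.get();
    }

    public static void main(String[] args) {
        // Stand-in for an external command such as dos2unix.
        BlobMemo memo = new BlobMemo(s -> s.replace("\r\n", "\n"));
        // The same two blobs referenced from five commits.
        String[] blobIdsAcrossCommits = {"b1", "b2", "b1", "b1", "b2"};
        Arrays.stream(blobIdsAcrossCommits)
              .parallel()
              .forEach(id -> memo.rewrite(id, "line\r\n"));
        System.out.println("transform ran " + memo.invocations() + " times");
    }
}
```

A per-commit tool would instead invoke the transform for every commit that contains the file, which is where most of the time goes on long histories.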

Users can use any scripting language or tool; Scala, awesome though it is, isn't well-known, and most Git users -- particularly the ones doing filter-branch type stuff -- are more familiar with command line tools.


A few use cases:

  • I want to convert all tabs in my project to spaces, via the Unix expand utility. (This is more than a simple regex, as it considers mid-line tab positions.)
  • I want to convert all Windows line endings to Unix line endings with dos2unix. This is already possible with a replace expression \r(\n)==>$1, though that takes some trickery to figure out.
  • I want to run a code formatter like scalariform or js-beautify on each file.
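To make the first use case concrete, here is a rough sketch of why tab expansion is more than a fixed regex replacement: the number of spaces a tab becomes depends on the column where it occurs. This is only an approximation of what `expand` does (it ignores backspaces and wide characters), and the class and method names are made up for illustration.

```java
// Hypothetical illustration of column-aware tab expansion.
final class TabExpander {
    // Expands each tab to the next tab stop, with stops every tabWidth columns.
    static String expandLine(String line, int tabWidth) {
        StringBuilder out = new StringBuilder();
        int column = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                // Distance to the next tab stop, always between 1 and tabWidth.
                int pad = tabWidth - (column % tabWidth);
                for (int p = 0; p < pad; p++) {
                    out.append(' ');
                }
                column += pad;
            } else {
                out.append(c);
                column++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A mid-line tab after "ab" needs 2 spaces with 4-column stops,
        // while the tab after "c" (column 5) needs 3.
        System.out.println("[" + expandLine("ab\tc\td", 4) + "]");
    }
}
```

A regex that replaces every tab with a fixed number of spaces would misalign any text that follows a mid-line tab.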

There may be some improvements to this.

  • Perhaps the filter options also apply?
  • This uses Java's Runtime.exec, whose tokenizer splits only on whitespace, with no quoting mechanism. Git uses the default shell (somehow) in a portable way, which is nicer.
  • Documentation
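The tokenizer limitation can be shown without spawning anything. `Runtime.exec(String)` is documented to split its argument with `new StringTokenizer(command)`, so quotes are treated as ordinary characters. The sketch below reproduces that split and contrasts it with one possible workaround, handing the whole string to `sh -c`. The workaround is an assumption for illustration (it presumes a POSIX shell on the PATH) and is not necessarily what Git or this patch does; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration; only builds argument lists, runs no process.
final class ExecTokenDemo {
    // Splits a command the way Runtime.exec(String) is documented to.
    static List<String> runtimeExecTokens(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Assumed alternative: let a POSIX shell do the parsing, quotes included.
    // The result could be passed to new ProcessBuilder(argv).
    static List<String> viaShell(String command) {
        return Arrays.asList("sh", "-c", command);
    }

    public static void main(String[] args) {
        String command = "sed -e 's/foo bar/baz/'";
        // The quoted sed script is broken in two at the embedded space.
        System.out.println(runtimeExecTokens(command));
        // Here the command stays intact as a single argument to the shell.
        System.out.println(viaShell(command));
    }
}
```

With the first form, sed would receive `'s/foo` and `bar/baz/'` as separate arguments, quotes and all, which is why commands containing quoted spaces fail.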

@jfoliveira

Nice one! Very cool new feature!

Just curious: would the blob exec script also apply to filter the blob out of the tree?
Something like:
git filter-branch -f --tree-filter myFilterScript

My use case is renaming thousands of files in a large repository while keeping the file history reachable, without needing the --follow flag on git log to reach history from before the rename.

@copumpkin

Can we merge this? @rtyley any feedback? Some projects even manually apply this patch to BFG and recompile it from source because this PR isn't merged.

@copumpkin

Or probably merge #169 instead, but one of them would be useful.

@dmgerman

dmgerman commented May 1, 2017

I am the author of the patch linked above (cregit). I have emailed both Roberto and Paul about it, but never got a response from either one.

The big issue with Paul's patch is that it is incomplete and does not properly handle spawning the process. My patch works well under Linux (Paul's code doesn't). I think it still needs testing. Also, I think using it requires some know-how about how BFG works, since it cannot filter specific files by full path, only by basename. I will be happy to help make this patch part of the distribution.


4 participants