- 
                Notifications
    You must be signed in to change notification settings 
- Fork 381
Addresses #268 : local builds should respect .gitignore #269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
        
          
                repo2docker/buildpacks/base.py
              
                Outdated
          
        
      | tar.add('.', 'src/', filter=_filter_tar) | ||
| _exclude_tar = None | ||
| if os.path.exists(".gitignore"): | ||
| gitignore_fh = open(".gitignore") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use a context manager here so that the file gets closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typical docker builds use a separate .dockerignore and ignore .gitignore. Should we follow that pattern? I'd rather be as close to exact for what docker build would normally do as we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betatim : sure, I will push in another commit with a bit of refactoring, and also the tests.
@minrk : Thats more of a usability question. If repo2docker wants to be transparent to the user then, adding a separate .dockerignore might add to the confusion. You could add conditionals to check if both .dockerignore and .gitignore exist, and prioritize the .dockerignore, but if the repository just has a .gitignore then you use that instead.
| Could you add a test to check that a file that should get added gets added and one that shouldn't shouldn't? Do the others have any thoughts on adding a new dependency? Otherwise LGTM. | 
        
          
                repo2docker/buildpacks/base.py
              
                Outdated
          
        
      | tar.add('.', 'src/', filter=_filter_tar) | ||
| _exclude_tar = None | ||
| if os.path.exists(".gitignore"): | ||
| gitignore_fh = open(".gitignore") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typical docker builds use a separate .dockerignore and ignore .gitignore. Should we follow that pattern? I'd rather be as close to exact for what docker build would normally do as we can.
        
          
                repo2docker/buildpacks/base.py
              
                Outdated
          
        
      | # same to the way `git` excludes files based on `.gitignore`. | ||
| # | ||
| # https://github.com/cpburnz/python-path-specification/issues/19 | ||
| return ignorespec.match_file(filepath) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It concerns me a bit that this isn't exact. Is there any possibility that we could end up excluding files that shouldn't be excluded, or does the bug you link exclusively mean that we will include some files we maybe shouldn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@minrk : pathspecs was the only pure python implementation I could find. Also, I think it does implement the pathspecs mostly correctly, but the difference comes from how git is a bit more liberal. For example, git is okay with referencing D/* to refer to all the subdirectories of D, whereas pathspecs expect you to write just D/.
In the issue I reported, and in most of the use cases, this will only mean we will forget to exclude some files which were meant to be excluded.
The only use case when a file that is not meant to be excluded is excluded, is when .gitignore specifically ignores a particular file, but the user had force-pushed the file to the repository.
But in the both these use cases, the .gitignore should ideally be modified to include the exceptions. For example, when you force-push say a *.csv file which was ignored by .gitignore, then you are expected to also remember to add a !<path_to_file> to .gitignore.
| Hoisting @minrk's inline comment to here: 
 I hadn't thought of this and it is a very good point. The pragmatist in me would like to merge the git and docker ignore files to make it "just work". But it probably brings more problems than it solves. On a binderhub we wouldn't ever see this as files in  | 
| @betatim : Well true, focusing on the binderhub use case, the  Going by the discussion here, it seems that while they considered  They seem to seem to handle it here, where the call the  The only major risk is that, their  | 
| I'm +1 on respecting  | 
| @spMohanty do you think you'll have time to re-tool this PR to use  | 
| @choldgraf : Sure. Happy to give it a go this weekend. | 
| Closing this as it has been inactive for over a year. We can make a new PR if the need still exists. | 
Fixes #268 (added by @betatim)