Pattern for files to ignore? #5

Open
danilobellini opened this issue Apr 27, 2014 · 1 comment

Comments

@danilobellini
Owner

I don't like the way the "file pattern glob" is used today in dose.

There's already one common file to store a list of ignore patterns: .gitignore. One can get several examples here.

Why not use the same idea to decide whether a file change should be neglected in dose?

But one can also think of the reverse approach: why not watch files only when they're at least staged in git? Explicitly, only a file listed by git ls-files would be watched.

...and a Mercurial user wouldn't be happy either. For the ignore-file approach, dose would be changed to skip watching files that match the contents of .gitignore and .hgignore, if any, without defaults. Obviously, these files can be changed on the fly as well, and such a change would alter the whole filtering in effect.

For now, this issue is still an open question about how the "file pattern glob" configuration should be changed to be better. How should we configure dose to allow (without bureaucracy, i.e., "convention over configuration" + DRY):

  1. Neglecting all but a few globs/patterns/regexes
  2. Ignoring none but a few globs/patterns/regexes
  3. Having useful defaults (e.g. ignore the .git directory as usually no one changes its contents manually; use the .gitignore and .hgignore files)
  4. Performing some inference about whether a file change means something or not without "losing control" of what is being done
@danilobellini
Owner Author

I think there could be both a "white list" and a "black list" behavior for dose.py file watching patterns, called watch and ignore (or something similar). By default, .gitignore and .hgignore should be part of the ignore/black list behavior. The .dose.conf config file would have:

{
    "pattern": [
        {
            "action": "ignore",
            "value": "*.py[co]"
        },
        {
            "action": "ignore",
            "type": "hgignore"
        },
        {
            "action": "ignore",
            "type": "regex",
            "value": "^\\."
        },
        {
            "action": "watch",
            "type": "regex",
            "path_type": "absolute",
            "value": ".*test.*"
        },
        {
            "action": "ignore",
            "type": "regex",
            "path_type": "filename",
            "value": "^[2-9]{2,3}.*\\.py$"
        },
        {
            "action": "watch",
            "type": "glob",
            "value": "*"
        }
    ]
}

One list of dictionaries, which is read in order: if an ignore action matches first, the file is ignored, even if a watch that appears afterwards would apply to that file. The same goes the other way around: if a watch that accepts a file name appears first, the file should be watched, no matter whether an ignore that appears afterwards would remove it. In short, it should look for a match and break after finding the first, no matter the action.
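
The first-match rule above can be sketched in a few lines of Python. This is only an illustration, not dose code: the function name first_match_action is hypothetical, only the glob and regex types are handled, and the fallback used when no rule matches (here "ignore") is an assumption, since the proposal doesn't say what should happen in that case.

```python
import fnmatch
import re


def first_match_action(rules, filename, default="ignore"):
    """Return the action of the first rule that matches filename.

    `default` (an assumption, not part of the proposal) is returned
    when no rule matches at all.
    """
    for rule in rules:
        kind = rule.get("type", "glob")  # glob is the default type
        value = rule.get("value", "")
        if kind == "glob":
            matched = fnmatch.fnmatch(filename, value)
        elif kind == "regex":
            # ^ and $ aren't implicit, so search instead of match
            matched = re.search(value, filename) is not None
        else:
            continue  # other types are out of scope for this sketch
        if matched:
            return rule["action"]  # stop at the first match
    return default


rules = [
    {"action": "ignore", "value": "*.py[co]"},
    {"action": "watch", "type": "regex", "value": ".*test.*"},
    {"action": "ignore", "type": "regex", "value": r"^[2-9]{2,3}.*\.py$"},
    {"action": "watch", "type": "glob", "value": "*"},
]
```

With these rules, a name like 22_test.py is watched because the "test" regex comes before the ignore regex that would also match it.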

Each dictionary should have at least the action key; this isn't optional. The only valid values for the action key are ignore and watch. Any other value for action is an error, but handling it must avoid annoying the user: a simple stderr message sounds like enough, without killing the app or popping up message boxes.

The type key should be either:

  • glob (default): shell-style glob;
  • regex, regexp or re: regular expression using Python re module syntax;
  • gitignore_shallow: the value key for this one is optional. If explicit, it will be one single file with a pattern in each line for the matching, following the gitignore syntax. For this type, value is .gitignore by default.
  • hgignore: the same as gitignore_shallow, but for the Mercurial hgignore syntax. For this type, value is .hgignore by default.
  • gitignore_deep: allow the multiple .gitignore files in each child directory, following git behavior for these.
  • git_repo_files: call git ls-files each time a file changes to know whether it is in such "white/black list". In this case, a value key makes no sense.
  • hg_repo_files: same as git_repo_files, but calls hg status -qnA instead.
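
As a rough sketch of what git_repo_files could do, the snippet below asks git for the tracked files and tests membership. The names tracked_files and is_tracked are hypothetical, the -z flag (NUL-separated output, which avoids quoting of unusual file names) is an addition not mentioned above, and the run parameter exists only so the command can be replaced when trying the code outside a repository.

```python
import subprocess


def tracked_files(command=("git", "ls-files", "-z"),
                  run=subprocess.check_output):
    """Return the set of paths printed by `command`, relative to the cwd."""
    output = run(list(command))
    names = output.decode("utf-8", "replace").split("\0")
    return {name for name in names if name}  # drop the trailing empty item


def is_tracked(relative_path, **kwargs):
    """True if relative_path is listed by the repository command."""
    return relative_path in tracked_files(**kwargs)
```

Calling the command on every file change, as proposed, keeps the answer up to date at the cost of one subprocess per event.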

Like the action, an invalid type value should be neglected, warning the user without annoying them with pop-ups or killing the app.
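
A minimal sketch of this validation follows. It assumes that "neglected" means the whole rule is skipped, which is one possible reading; the names clean_rules, VALID_ACTIONS and TYPE_ALIASES are hypothetical, and the warn parameter is only there so the messages can go somewhere other than stderr.

```python
import sys

VALID_ACTIONS = {"ignore", "watch"}

# Maps every accepted spelling to one canonical type name
TYPE_ALIASES = {
    "glob": "glob",
    "regex": "regex", "regexp": "regex", "re": "regex",
    "gitignore_shallow": "gitignore_shallow",
    "gitignore_deep": "gitignore_deep",
    "hgignore": "hgignore",
    "git_repo_files": "git_repo_files",
    "hg_repo_files": "hg_repo_files",
}


def clean_rules(raw_rules, warn=None):
    """Keep only valid rules; warn about the others, never raise."""
    if warn is None:
        def warn(message):
            print(message, file=sys.stderr)
    cleaned = []
    for index, rule in enumerate(raw_rules):
        action = rule.get("action")
        if action not in VALID_ACTIONS:
            warn("pattern %d: invalid action %r, rule skipped" % (index, action))
            continue
        raw_type = rule.get("type", "glob")  # glob is the default type
        kind = TYPE_ALIASES.get(raw_type)
        if kind is None:
            warn("pattern %d: invalid type %r, rule skipped" % (index, raw_type))
            continue
        cleaned.append(dict(rule, type=kind))
    return cleaned
```

Normalizing regexp and re to regex at this point means the matching code only has to know one name per type.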

About the value, it should be just one single pattern, one single filename, or shouldn't be declared, depending on the type. For regex values, ^ and $ aren't implicit, i.e., the regex only needs to match a part of the string, not all of it. For glob values, the fnmatch Python module probably wouldn't be enough, as it matches hidden files (starting with the dot .) without the need to explicitly use the dot in the glob: the specs should follow the shell-style glob. On the other hand, a hgignore or gitignore value that names a file that doesn't exist shouldn't warn the user.
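
The two matching details above can be illustrated as follows. This is a sketch under assumptions: the helper names are hypothetical, and shell_glob_matches only deals with a single file name (no directory separators), wrapping fnmatch with the shell rule that a leading dot has to be matched explicitly.

```python
import fnmatch
import re


def regex_matches(pattern, name):
    """Partial match: ^ and $ are not implicit, so use re.search."""
    return re.search(pattern, name) is not None


def shell_glob_matches(pattern, name):
    """fnmatch, except that a leading dot must be explicit in the pattern."""
    if name.startswith(".") and not pattern.startswith("."):
        return False  # hidden file, but the glob doesn't start with a dot
    return fnmatch.fnmatchcase(name, pattern)
```

For instance, fnmatch.fnmatch(".hidden", "*") is true, while shell_glob_matches("*", ".hidden") is false and shell_glob_matches(".*", ".hidden") is true, which is what a shell would do.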

The path_type is optional, used only by globs and regexes, behaving:

  • filename (default for globs without the / char): Try to match only the name of the changed file, without its path
  • absolute (default for globs starting with the / char): Use os.path.abspath of the changed file when trying the match
  • directory (default for globs ending with the / char): a/b/ would try to match with a/, a/b/
  • relative (default for regexes and other globs): The filename relative to the current directory, such as a/b/file

The relative path type is useful to allow matching/ignoring a glob pattern like *.ext only when the file is directly in the watched directory. A glob like ./*.pyc would do the same. An absolute_directory path type might also be needed for completeness.
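
The defaults listed above could be inferred roughly like this. It is a sketch, not a settled design: all three function names are hypothetical, a glob that both starts and ends with / is treated as absolute (the list doesn't say which default wins), and directory_candidates reflects one reading of the "a/, a/b/" example, namely that each parent directory of the changed file is tried in turn.

```python
import os


def default_path_type(rule_type, value):
    """Pick the path_type to use when a rule doesn't declare one."""
    if rule_type == "glob":
        if value.startswith("/"):
            return "absolute"
        if value.endswith("/"):
            return "directory"
        if "/" not in value:
            return "filename"
    return "relative"  # regexes and every other glob


def match_subject(path, path_type, start=os.curdir):
    """Return the string a glob/regex should be tested against."""
    if path_type == "filename":
        return os.path.basename(path)
    if path_type == "absolute":
        return os.path.abspath(path)
    return os.path.relpath(path, start)


def directory_candidates(relative_path):
    """For a/b/file, return ['a/', 'a/b/'] (assumed directory semantics)."""
    parts = relative_path.split("/")[:-1]  # drop the file name
    return ["/".join(parts[:i]) + "/" for i in range(1, len(parts) + 1)]
```

Note that ./*.pyc contains a / without starting or ending with one, so it falls into the relative case, consistent with the remark above.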

The os.path.sep symbol should be used instead of / (or perhaps both), but Windows would still need some other workaround for the absolute globs (which would start with a drive letter).

For the CLI, these would become an option per action, like --ignore or --watch. These obviously can repeat. The value for these is given as one single string that includes the type, path type and value mixed together without the key names, separated by ; and with everything as optional as described above, except the value, which is always the last parameter in the semicolon-splitting. The value should be left blank if not needed or to access defaults. That would allow declaring patterns with values like "--ignore gitignore_shallow;" or "--watch regex;filename;...a*.py"
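
One way to split such an option string is sketched below. The function name parse_pattern_option is hypothetical, and so is the disambiguation strategy: a leading field is taken as the type or path type only when it is one of the known names, so that everything left over (including any ; inside it) stays in the value.

```python
KNOWN_TYPES = {
    "glob", "regex", "regexp", "re", "gitignore_shallow", "gitignore_deep",
    "hgignore", "git_repo_files", "hg_repo_files",
}
KNOWN_PATH_TYPES = {"filename", "absolute", "directory", "relative"}


def parse_pattern_option(action, text):
    """Turn '[type;][path_type;]value' into a rule dictionary."""
    rule = {"action": action}
    head, sep, rest = text.partition(";")
    if sep and head in KNOWN_TYPES:
        rule["type"] = head
        text = rest
        head, sep, rest = text.partition(";")
    if sep and head in KNOWN_PATH_TYPES:
        rule["path_type"] = head
        text = rest
    if text:  # a blank value means "not needed" or "use the default"
        rule["value"] = text
    return rule
```

With this reading, "gitignore_shallow;" gives a rule with a type and no value, while "*.py[co]" gives a rule with only a value, leaving the type to default to glob.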

For the GUI, a new screen should be made to replace the current dialog used for colon-separated globs, giving all the options described above. I first thought of it as three combo boxes and one edit box (for the value) in a table-like (or sheet-like) widget, but this sounds hard to follow with the keyboard. Perhaps keeping a CLI-like syntax in the GUI would be more helpful, e.g. writing the JSON proposed at the beginning as a multiline text box with the CLI options for it:

--ignore *.py[co]
--ignore hgignore;
--ignore regex;^\.
--watch regex;absolute;.*test.*
--ignore regex;filename;^[2-9]{2,3}.*\.py$
--watch *

That's still a sketch for a specification; any idea is welcome.
