Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace std::os::glob with extra::glob, which is written in rust ... #8201

Closed
wants to merge 7 commits into from

Conversation

Dretch
Copy link
Contributor

@Dretch Dretch commented Aug 1, 2013

... fixing issue #6100. This also addresses #6101.

@huonw
Copy link
Member

huonw commented Aug 1, 2013

How does one match a literal ]?

'[' => {
let mut chars = ~[];
let is_except = match pattern_iter.next() {
None => false, // let the following loop fail with a message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless things have changed in the past 2 weeks, the iterator contract does not require that an iterator that returns None once continue to return None forever. In fact, the iterator contract specifies nothing at all about the behavior after None has been returned once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you state appears to be true. Maybe the Iterator docs should be more explicit about this situation. I can rework the glob code to avoid the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dretch After discussion with @thestinger it sounds like he wants the iterator protocol to actually state that once None is returned, future calls to next() will continue to return None.

@lilyball
Copy link
Contributor

lilyball commented Aug 1, 2013

I'd love to see a public function equivalent to fnmatch(), not just globbing. This would basically mean making a public named type for ~[PatternToken] and making pattern_matches() public (and probably renaming it too).

@lilyball
Copy link
Contributor

lilyball commented Aug 1, 2013

Also, your implementation of the glob is inefficient when it hits AnySequence. It loops until the end of the string, trying to match at each position. This is fine for patterns that only have a single *, but any pattern that has multiple * will perform badly.

Instead, you can split your pattern on all instances of * (and coalesce them too.. foo**bar is equivalent to foo*bar) and match each sub-pattern in sequence. Once you find a match for a subpattern, never backtrack. Because * matches anything, if a subpattern matches at one location, but the overall pattern does not, trying to re-match that subpattern at a later location will never cause the overall pattern to match.

pattern
}

fn pattern_matches(mut file: &str, pattern: &[PatternToken]) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that pattern_matches lacks some considerations required by Patterns Used for Filename Expansion in SUSv4.
For example, foo/.bar shall not be matched by the pattern foo/*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Special behaviour for dot-files would at least be consistent with the python glob module.

@Dretch
Copy link
Contributor Author

Dretch commented Aug 2, 2013

@huonw

Literal ] can be matched by []]

@kballard

Regarding fnmatch(): how about introducing a Pattern struct with a constructor that takes &str and a match(pattern: &str) method, and hiding the ~[PatternToken] inside of it?

I will also try and have a look at making * matching more efficient.

@lilyball
Copy link
Contributor

lilyball commented Aug 3, 2013

@Dretch: fnmatch() in C has a few options. The ones that I think would be useful to provide here are FNM_PATHNAME (don't match / with * or ?, the pattern needs an explicit /), FNM_PERIOD (a leading period in a path component (if FNM_PATHNAME is set) or in the string must be matched by an explicit period in the pattern), and FNM_CASEFOLD (case insensitivity).

If you can make a match(pattern: &str, opts: something) function that provides those options in some fashion, that would be ideal.

@omasanori
Copy link
Contributor

Note that FNM_PATHNAME | FNM_PERIOD is the glob's behavior.

@kballard there is also FNM_NOESCAPE.

@lilyball
Copy link
Contributor

lilyball commented Aug 3, 2013

@omasanori There's also FNM_LEADING_DIR, but I don't think those two options are particularly useful.

there was no way to match NOT ']').

Also do some refactoring to avoid relying on the vector
iterator always returning None after returning None for the
first time (as requested by @kballard).
@omasanori
Copy link
Contributor

@kballard I agree with you. (FNM_NOESCAPE might be used in Windows which uses \ as path separator, but I'm unsure.)

method that acts something like the C fnmatch function
(albeit without any options... yet).
and Pattern::matches, and provide a case_sensitive
option that defaults to true.
to the FNM_PATHNAME flag in the libc fnmatch function
@catamorphism
Copy link
Contributor

@Dretch Needs a rebase.

@catamorphism
Copy link
Contributor

Closing due to lack of activity. Please reopen if you have time to rebase it!

@Dretch
Copy link
Contributor Author

Dretch commented Aug 21, 2013

I am still working on this (albeit slowly)! I have some more commits coming (including a rebase).

I can't see any way to reopen this pull request - should I start a new one?

bors added a commit that referenced this pull request Sep 6, 2013
This is #8201 with a bunch of amendments to address the comments (and re-based).
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jan 13, 2022
Change `unnecessary_to_owned` `into_iter` suggestions to `MaybeIncorrect`

I am having a hard time finding a good solution for rust-lang#8148, so I am wondering if is enough to just change the suggestion's applicability to `MaybeIncorrect`?

I apologize, as I realize this is a bit of a cop out.

changelog: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants