Skip to content

Conversation

@davidnmurray
Copy link
Contributor

@davidnmurray davidnmurray commented May 12, 2016

This is a fix for issue #1508 . The proposed fix changes filepath expansions in expand_filespecs() from relative paths to absolute ones.

In addition, the fix also adds a suitable test for the expand_filespecs function.

(Edit: removal of an unnecessary URL.)

@davidnmurray
Copy link
Contributor Author

I see I have some pep8 issues. I'll clean those up, then push through a new commit...

@davidnmurray
Copy link
Contributor Author

I've edited the test file and pushed the new version.

self.assertEqual(iris.io.expand_filespecs(
[os.path.join(self.tmpdir, '*')]),
[os.path.join(self.tmpdir, fname)
for fname in self.fnames])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is too complex to be easily readable. (The same is true for some other lines later in the file as well.) I'd recommend breaking it up into multiple lines and using descriptive names for the temporary variables. Perhaps:

    def test_absolute_path(self):
        result = iris.io.expand_filespecs([os.path.join(self.tmpdir, '*')])
        expected = [os.path.join(self.tmpdir, fname) for fname in self.fnames]
        self.assertEqual(result, expected)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks, @rhattersley Those are good suggestions; I'll have a look into editing along those lines.

@davidnmurray
Copy link
Contributor Author

Here is a new commit, with edits to improve the readability of the code, and also to remove any pep8 line length issues.

Returns:
A list of matching file paths. If any of the file-specs matches no
A list of matching absolute file paths. If any of the file-specs matches no
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have completed this PR, I think we should look at making the sort order of the returned filenames consistent and expected. The behaviour I think we need:

def expand_filespecs(file_specs):
    return list_of_expanded_filenames_sorted_by_filespecs_input_order

@davidnmurray
Copy link
Contributor Author

davidnmurray commented May 16, 2016

On investigation of the previous Travis failures, some odd behaviour was noted from test_exapnd_filespecs.py. This behaviour consisted of alternating test successes and test failures, when the code was run locally. It appears that this could be down to the unordered output from the dictionary glob_expanded in --init--.py

So, the suggested change is to go from

if not all(value_lists):
    raise IOError("One or more of the files specified did not exist %s." %
    ["%s expanded to %s" % (pattern, expanded if expanded else "empty")
     for pattern, expanded in six.iteritems(glob_expanded)])

to

if not all(value_lists):
    raise IOError("One or more of the files specified did not exist %s." %
    ["%s expanded to %s" % (pattern, expanded if expanded else "empty")
     for pattern, expanded in sorted(six.iteritems(glob_expanded))])

(Edit for typos)

@davidnmurray
Copy link
Contributor Author

Interestingly, Travis appears to be finding the same set of problems this morning, even with yesterdays edit. I think this suggests that the next step is to substitute in something along the lines of @pelson 's suggestion.

@davidnmurray
Copy link
Contributor Author

Here is an edit to include a sorting term within expand_filespecs(). The theory is, if value_lists is provided in a consistent order then maybe this will act to regularise the output behaviour.

@davidnmurray
Copy link
Contributor Author

Thinking about it, this won't work. What sorted() is presumably going to do is take the collection of full file paths out of glob_expanded, and then rearrange them into alphabetical order. But, we need the files in the order that they're in filenames, not the full absolute paths.

@davidnmurray
Copy link
Contributor Author

And the Travis tests have failed, along with the same expand failure error as before, so sorted() is indeed not the way forward.

@davidnmurray
Copy link
Contributor Author

In fact, I now see that sorted() also automatically-sorts on case as well, putting upper-case letters first. So the string 'B.txt' will in fact be sorted in front of 'a.dat', which is an important subtlety.

@davidnmurray
Copy link
Contributor Author

I've been having a look at the insides of expand_filespecs() and I'm starting to think some of it is unnecessary. This statement:

glob_expanded = {fn : sorted(glob.glob(fn)) for fn in filenames}
value_list = glob_expanded.values()

just appears to be a complicated way of setting up the error message inside

if not all(value_lists):
    raise IOError("One or more of the files specified did not exist %s." %
    ["%s expanded to %s" % (pattern, expanded if expanded else "empty")
     for pattern, expanded in sorted(six.iteritems(glob_expanded))])

and thus might be creating an unnecessary problem, with the unpredictable-returns thing from the dictionary at glob_expanded.

@davidnmurray
Copy link
Contributor Author

Perhaps the way forward might be to get rid of this stuff entirely and implement some sort of alternate approach to setting up the error message?

@davidnmurray
Copy link
Contributor Author

Perhaps a better error-message might be something that makes sure that the expanded paths still include the original file names. With a term also to make sure that there isn't a null result or anything like that.

@marqh
Copy link
Member

marqh commented Nov 18, 2016

i think this is not going to be completed, so I am closing the PR

@marqh marqh closed this Nov 18, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants