Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix isInvalidFilename & always operate on the full passed in string #19004

Merged
merged 3 commits into from
Oct 20, 2021

Conversation

johnnovak
Copy link
Contributor

@johnnovak johnnovak commented Oct 16, 2021

This is a breaking change, but isInvalidFilename in its current form can be arguably considered broken anyway (e.g. it wouldn't pick up "foo/bar" as invalid, which defeats the purpose of the function).

Also fixes segfault on empty strings.

@johnnovak johnnovak force-pushed the jnovak/fix-is-valid-filename branch 2 times, most recently from 13a9105 to 62578fb Compare October 16, 2021 04:06
@juancarlospaco
Copy link
Collaborator

I do not think this is correct, because basically you can mount pretty much anything as a path in Linux.
But checking the filename only is cheap and simple code.

"foo/bar" is a valid relative path in several OS (Linux, BSD, etc).

Copy link
Contributor

@dom96 dom96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"foo/bar" is a valid relative path in several OS (Linux, BSD, etc).

But it's not a valid filename. I think these changes make sense unless I'm missing something. See my comments though for suggested changes.

lib/pure/os.nim Outdated
Comment on lines 3505 to 3509
assert not isValidFilename("con") # "CON" is invalid (Windows, case-insensitive)
assert not isValidFilename("foo:bar") # ":" is invalid (Mac)
assert not isValidFilename("AUX") # "AUX" is invalid (Windows)
assert not isValidFilename("foo/bar") # "/" is invalid (Linux, Windows)
assert not isValidFilename("foo\\bar") # "\" is invalid (Windows)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need when defined(windows) etc here. At that point it may be better to move this into a test file (runnable examples should be fairly short and easily understandable, not thorough test suites)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I need any when defined(<platform>) stuff at all because isValidFilename checks for invalid chars that is the union of all invalid chars on Linux, Win & Mac. To put it another way, if there's not platform specific stuff in the implementation, why should we have it in the tests?

Also, I just extended the existing examples slightly. Maybe I should just remove the "(Windows)", "(Mac)", etc. parts from the comments?

Copy link
Contributor Author

@johnnovak johnnovak Oct 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"foo/bar" is a valid relative path in several OS (Linux, BSD, etc).

But it's not a valid filename.

Exactly. What the function should do: can I create a file with this name on all three platforms (e.g. by literally using this string as filename in the file creation call). If yes, return true, otherwise false. That's it, nothing more, nothing less.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I need any when defined() stuff at all because isValidFilename checks for invalid chars that is the union of all invalid chars on Linux, Win & Mac. To put it another way, if there's not platform specific stuff in the implementation, why should we have it in the tests?

Oh, I thought the behaviour was different depending on the platform.

@Varriount
Copy link
Contributor

I believe I've commented before on this - attempting to test whether a string is a "valid" filename is practically impossible to do, not to mention usually pointless. What is considered "valid" depends not only on the operating system, but also the underlying filesystem (and possibly user permissions).

@johnnovak
Copy link
Contributor Author

johnnovak commented Oct 17, 2021

I believe I've commented before on this - attempting to test whether a string is a "valid" filename is practically impossible to do, not to mention usually pointless. What is considered "valid" depends not only on the operating system, but also the underlying filesystem (and possibly user permissions).

I see your point, but in 99% of the cases the method would just do it's job fine for current Windows, Linux and OS X. That's what the whole Nim stdlib is trying to do, cater for the most common "sweet spot" scenarios and basically ignore all sorts of edge cases and uncommon variations.

E.g. I'm writing a cross-platform app now that runs on current OSs (no esoteric file systems and old OS versions), and it's just handy to have this method because then if users on different OSs share their documents they won't run into weird problems with the filenames.

So I don't think this method is pointless, and it's quite possible to validate filenames that will just work fine across the standard filesystems of the major current OSs (again, who is really running non-standard filesystems on Mac or Windows.... maybe 0.0000001% of users...)

@arnetheduck
Copy link
Contributor

Another nuance here is that filenames are typically valid for the combination of OS and underlying filesystem - ie what's valid on a FAT partition is not the same as a ext4 or ntfs. Both on windows and unix you can mount filesystems in arbitrary locations, and those can be network file system, so what's valid depends on where the file goes - if you want to 99% guarantee a valid name, you need to go 8.3 characters and avoid a bunch of special cases, but this of course is unnecessarily limiting in other cases.

In terms of what's included in a standard library, it seems wrong to promote the use of completely arbitrary rules that are both wrong in some cases and overly confining in others - how about deprecating it instead, and documenting the situation in "openfile"?

@Araq
Copy link
Member

Araq commented Oct 17, 2021

So I don't think this method is pointless, and it's quite possible to validate filenames that will just work fine across the standard filesystems of the major current OSs (again, who is really running non-standard filesystems on Mac or Windows.... maybe 0.0000001% of users...)

It's a classical stdlib design mistake: "I needed this code and it's generally useful" (which code isn't btw, you can always generalize the problem to make it worth adding to some library). And indeed for your application the code came in handy. Unfortunately, it's also "wrong", so now we can either "fix" it and break somebody else's application that secretly relied on the fact that slashes are valid or we can "not fix" it and you have to copy&paste a fixed version of it into your application codebase which is where it belongs, frankly.

@johnnovak
Copy link
Contributor Author

johnnovak commented Oct 18, 2021

Unfortunately, it's also "wrong", so now we can either "fix" it and break somebody else's application that secretly relied on the fact that slashes are valid or we can "not fix" it and you have to copy&paste a fixed version of it into your application codebase which is where it belongs, frankly.

Yeah that sounds reasonable, I almost suggested to just remove it but then I decided to "fix" it. Maybe we should deprecate it though at least to discourage people from using it, then perhaps remove later?

EDIT: Actually, at least my segfault fix should go in (plus the minor improvements), if we want to keep it (empty strings or strings ending with a slash currently segfault). Happy to rework if we all agree that's the path we want to take.

@Araq
Copy link
Member

Araq commented Oct 18, 2021

EDIT: Actually, at least my segfault fix should go in (plus the minor improvements), if we want to keep it (empty strings or strings ending with a slash currently segfault). Happy to rework if we all agree that's the path we want to take.

Yeah, let's do it this way, fix indexing bugs and also deprecate it.

@johnnovak
Copy link
Contributor Author

EDIT: Actually, at least my segfault fix should go in (plus the minor improvements), if we want to keep it (empty strings or strings ending with a slash currently segfault). Happy to rework if we all agree that's the path we want to take.

Yeah, let's do it this way, fix indexing bugs and also deprecate it.

Done.

lib/pure/os.nim Outdated Show resolved Hide resolved
lib/pure/os.nim Outdated Show resolved Hide resolved
johnnovak and others added 2 commits October 20, 2021 11:19
Co-authored-by: konsumlamm <[email protected]>
Co-authored-by: konsumlamm <[email protected]>
@Araq Araq merged commit 62701cd into nim-lang:devel Oct 20, 2021
PMunch pushed a commit to PMunch/Nim that referenced this pull request Mar 28, 2022
…im-lang#19004)

* Fix isInvalidFilename segfault & deprecate the method

* Update lib/pure/os.nim

Co-authored-by: konsumlamm <[email protected]>

* Update lib/pure/os.nim

Co-authored-by: konsumlamm <[email protected]>

Co-authored-by: konsumlamm <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants