Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail createDirectory if directory is created concurrently by another process #1021

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -269,7 +269,18 @@ extension _FileManagerImpl {

let parent = path.deletingLastPathComponent()
if !parent.isEmpty {
try createDirectory(atPath: parent, withIntermediateDirectories: true, attributes: attributes)
do {
try createDirectory(atPath: parent, withIntermediateDirectories: true, attributes: attributes)
} catch {
var isDirectory: Bool = false
if fileManager.fileExists(atPath: path, isDirectory: &isDirectory) && isDirectory {
Copy link
Contributor

@jmschonfeld jmschonfeld Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can, I think it'd be best to remove the TOCTOU discrepancy entirely rather than introducing a re-check. What do you think about removing the existing (and this new) fileExists check and just relying on the error produced by the call to createDirectory(atPath: parent)? Theoretically if we remove the earlier check, then this function will throw an error with the .fileWriteFileExists (or something similar, would have to double check) code. We could ignore that code here and re-throw any other errors so that we minimize the amount of times we might have the file system change from under us during a check. Do you think that might work well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable to me. I just thought that there was a reason that we checked for fileExists beforehand (maybe cheaper than calling createDirectory if it already exists). Happy to change it if you think running createDirectory is the better solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's possible - @compnerd might have additional insight into why we do that today that I might not be aware of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, this was due to the inability to detect the the existing item (directory vs file vs junction (soft/hard link). But, if we can avoid the TOCTOU, perhaps it is worth trying to see if we can determine that postfacto?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see - because we need to fail if it already exists as a file but we succeed if it already exists as a directory? Yeah I'm not certain if the CreateDirectoryW function would let us do that, but perhaps that's worth looking into here. My main worry with the current proposed change is that while it protects against a single, concurrent mutation to the contents on disk, I don't think it's robust against multiple, concurrent changes to disk so we'll likely see it pop up again 🙁

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked and CreateDirectoryW behaves the same if a directory or a file already exists at the path. In both cases it returns false and GetLastError() is ERROR_ALREADY_EXISTS, so I think we need to have a two step process.

My main worry with the current proposed change is that while it protects against a single, concurrent mutation to the contents on disk, I don't think it's robust against multiple, concurrent changes to disk so we'll likely see it pop up again 🙁

How do you think that multiple concurrent changes might cause an issue? I think we should be fine because:

  • If the directory exists at the first fileExists check, then it existed at point during the execution of createDirectory. If it gets deleted afterwards, that’s the same as if the directory gets delted after the call to createDirectory(atPath:withIntermeidateDirectories:attributes), so that’s fine
  • If createDirectory(parent) fails, then we call fileExists again. If that returns true we’re in the above case again, so we’re fine. If fileExists is false, then it doesn’t really matter why createDirectory failed, the end result is that we don’t have directory as requested and we should thus fail.

Alternatively, what do you think of the following approach:
We call createDirectory. If that fails, we check if we already have a directory (using fileExists(atPath:isDirectory:) and if that’s true, then we return, otherwise we throw the error. The downside of this that we now need run two file system operations if the parent directory already exists, which is probably the common case.

So, I think I prefer to stay with the solution proposed in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The multiple, concurrent mutations scenario I was imagining was something like:

  1. This thread checks fileExists, which indicates it doesn't exist
  2. Another thread concurrently creates the parent
  3. We attempt to create the parent, which fails because it exists
  4. The other thread concurrently deletes the parent
  5. We check if the parent exists, and it doesn't so we throw an error indicating the directory already exists when it doesn't.

This wouldn't be uncommon for situations like creating/using/deleting a temporary directory with a non-unique name. Thinking on it, perhaps it's not the end of the world - either way we're going to be failing here it's just the error thrown which is different, but the introduction for new possibilities of TOCTOU discrepancies just makes me worried that we might not be seeing another issue in here.

One more question just to try to avoid the TOCTOU discrepancies: do we need to perform this new check, or should we just check the error code of the thrown error? If the error thrown from createDirectory on the parent is fileWriteFileExists we know there's no way to determine if it was a directory or not, but does it matter? What if we just ignored it either way? If it was a directory, we create the child and it should work (unless the other thread created it with different, less permissive attributes). If it wasn't, well the next call to create the child will fail anyways. In the end it doesn't necessarily change a whole lot of behavior from the client, but it saves an extra file system check and reduces the possibility that this somehow turns into a security issue in the future due to the TOCTOU (we're all too familiar with those on Foundation). @ahoppen @compnerd is there anything I'm missing there, or does that seem reasonable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don’t have the second check, then we could get into the following scenario

  1. This thread checks fileExists, which indicates it doesn't exist
  2. Another thread creates a file (not directory) at the same path
  3. We attempt to create the parent, which fails because an item at this path already exists
  4. The error is fileWriteFileExists so we return success
  5. The client tries to create a file within the directory but that fails because there’s a file at the path, not a directory.

I think with any solution we will either have this scenario or the one you described in the last comment. I’ll leave it up to you to decide which one you think is worse.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we might be able to circumvent the issue altogether by using SHCreateDirectoryExW instead of CreateDirectoryW, which creates the intermediate directories as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #1033 as an alternative to this PR, which uses SHCreateDirectoryW to create intermediate directories as well and thus doesn’t need the manual parent directory creation.

// `createDirectory` failed but we have a directory now. This might happen if the directory
// is created by another process between the check above and the call to `createDirectory`.
// Since we have the expected end result, this is fine.
} else {
throw error
}
}
}
}

Expand Down