-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ReadStream/WriteStream data race #62
Conversation
…d from ReadStream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please add a test which fails on current master and passes on your branch :)
// By calling os.Remove, we unlink the existing file, but since the process | ||
// still has a handle on the file, the existing readers can continue to | ||
// read from it. Meanwhile we're free to update what's actually on-disk. | ||
os.Remove(d.completeFilename(pathKey)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if Remove returns an error? Is it safe to ignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the time, if Remove returns an error, it's because the file doesn't exist. If some other error comes up, I think the best thing to do is just try and go ahead with the OpenFile, see if that works or not.
// read from it. Meanwhile we're free to update what's actually on-disk. | ||
os.Remove(d.completeFilename(pathKey)) | ||
|
||
mode := os.O_WRONLY | os.O_CREATE | os.O_TRUNC // overwrite if exists somehow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a minimum, this comment should be updated; might also want to change the mode itself, given we know (assume) the file isn't there? Is that a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned above, if Remove somehow fails and the file still exists, we definitely want O_TRUNC, even though the most likely case is that OpenFile will also then fail (I've never heard of a situation where you can open a file to write but cannot remove it)
os.Remove(d.completeFilename(pathKey)) | ||
|
||
mode := os.O_WRONLY | os.O_CREATE | os.O_TRUNC // overwrite if exists somehow | ||
f, err := os.OpenFile(d.completeFilename(pathKey), mode, d.FilePerm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... what do you think about creating the new file as d.completeFilename(pathKey) plus some random extension, and then relying on the existing behavior in writeStreamWithLock to atomically move it to the proper filename? That would accomplish the same thing, I think, and avoid this intermediate phase...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This brings up an excellent point: the TempDir flag already fixes this problem by doing basically that.
The problem is that if you instantiate a diskv without setting the TempDir option, you cannot safely use ReadStream, which seems bad.
What's your opinion on making createKeyFileWithLock always write to d.completeFilename(pathKey)+extension? We would leave TempDir in the Options struct, but note that it is deprecated since all writes are now atomic.
If you have many slow readers using ReadStream, coupled with frequent writes, you will end up consuming additional disk space because the OS is keeping the old files around until the readers complete... but the alternative is handing back invalid data to the readers, which seems much worse to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's your opinion on making createKeyFileWithLock always write to d.completeFilename(pathKey)+extension?
Yeah, I think I like this better. (Assuming the +extension bit is the random extension I mentioned.)
If you have many slow readers using ReadStream, coupled with frequent writes, you will end up consuming additional disk space because the OS is keeping the old files around until the readers complete...
I think this is basically unavoidable. If it became a problem, you could maybe interrupt readers.
You OK with this change?
I created #63 to track this more easily. |
Test has been pushed. I've verified that it works with my new code and fails on master. |
See #64 |
Handle possible race where WriteStream can mess up the data being read from ReadStream.