Make write safer #1416
require False = matches `\.\..*` path
else failWithError "Attempt to write outside of the workspace"
This probably needs to be run through a relativizer of some sort (I don't think it is yet, but I might have missed something) to catch `subdir/../..`. It's also filtering out things like `..inRoot`, which are most definitely bad names but which are still in the workspace.
All paths leading into this function are simplified before being passed to the implementation.
Great, thanks!
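To make the two cases from this thread concrete, here is a minimal C++ sketch of a workspace-escape check that normalizes the path first. It is not wake's implementation: `stays_in_workspace` is a hypothetical helper, and it uses `std::filesystem::path::lexically_normal` as a stand-in for whatever simplification wake applies before the check. After normalization, an escape can only appear as a leading `..` component, so `subdir/../..` is rejected while `..inRoot` is accepted.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of wake): returns true when `raw`, after
// purely lexical normalization, still names a location inside the workspace.
bool stays_in_workspace(const std::string &raw) {
  fs::path p = fs::path(raw).lexically_normal();

  // Absolute paths are rejected outright.
  if (p.is_absolute()) return false;

  // Once "a/.." pairs have been collapsed, any remaining escape shows up
  // as a leading ".." component. A name such as "..inRoot" is a single
  // ordinary component and is not affected.
  auto first = p.begin();
  return first == p.end() || first->string() != "..";
}
```

Comparing whole path components rather than matching a string prefix is what separates `..` from `..inRoot`. The check is lexical only and does not resolve symlinks.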
require Pair (Pass inFile) _ =
write specFilePath (prettyJSON json)
| rmap getPathName
else Pair (Fail (makeError "Failed to 'write {specFilePath}'.")) ""
| addErrorContext "Failed to 'write {specFilePath}: '"
| (Pair _ "")
What's the `Pair` for? It seems to be unconditionally added with no meaningful data.
The return type of this function is `Pair (Result ...) String` because it's a runner pre function. You can see that previously the same `Pair` was being constructed in the `else` branch. I just made the error message nicer while working around the return type being a `Pair`.
Ah, now that I'm seeing the mixed lines, it looks like something to fit the else case.
else failWithError "Attempt to write to an absolute path"

# Source files should never be deleted so we check for this case
def scan dir regexp = prim "sources"
Calling `prim "sources"` for every call to `write` is expensive, no? Do we cache the result somewhere?
Wake reads the full list at the start and then does a linear regex scan through it. The linear scan could be replaced with a `Tree` lookup in this case, since we don't need a regex, but this is the same cost we incur when we `source` a file today, so I think it's fine? I'll test it on a larger build to see what the effect is.
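As an illustration of the tree-based alternative mentioned above, the following C++ sketch replaces a linear scan with an exact-match lookup in an ordered set. `SourceIndex` is a hypothetical name and not part of wake. It assumes the source list is available as a vector of workspace-relative paths and that the check only needs exact matches, as the comment suggests.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the source list that is read once at startup.
class SourceIndex {
 public:
  // Build the ordered set once; std::set is a balanced tree, so this
  // costs O(n log n) up front.
  explicit SourceIndex(const std::vector<std::string> &sources)
      : sorted_(sources.begin(), sources.end()) {}

  // Exact-match lookup in O(log n), with no regex evaluation per entry.
  bool contains(const std::string &path) const {
    return sorted_.count(path) > 0;
  }

 private:
  std::set<std::string> sorted_;
};
```

With an index like this, each `write` would cost one logarithmic lookup instead of a pass over every source. Whether that matters in practice depends on the measurement on a larger build that the comment proposes.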
We should try to get this into a wake release.
1. write will no longer allow "." or "" as a write location.
2. write will only unlink files, and not do a deep unlink. This means that two different build options might conflict in wake, but I think that's bad style anyway.
3. write will not write outside of the root workspace. This will break some rare use cases, but they can be replaced with a bespoke job instead.
4. write will not overwrite a source file.
error +=
" is a directory and cannot be overwritten. If this is intentional please manually "
"delete this directory";
size_t len = std::min(error.size(), max_error);
String *out = String::claim(runtime.heap, error.c_str(), len);
This is running into `max_error` problems, and really doesn't look good when it does: `Fail (Error "src is a directory and cannot be overwritten. If this is intentional please manually delete this direct" Nil)`
Oh crap, thanks for finding and debugging that! We should increase `max_error`.
I think there's always going to be some problem if we simply bump the value. Using an unrealistically low `max_error` to stand in for a longer path than I want to type: `Fail (Error "very/long/path/to/some/fi" Nil)`. We need to trim the filepath (with some indication that it's been trimmed) without risking the error message itself.
I probably won't be prioritizing that level of quality fix, but I'd love to see that sort of thing go in. I think we can add in the size of the path instead. Paths on Linux are not allowed to be any longer than 4096 bytes, so we can set it to something like 100 + std::max(path.size(), 4096).
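The trimming approach suggested earlier in this thread (shorten the path, mark the cut, never touch the message) could look roughly like the C++ sketch below. `format_path_error` is a hypothetical helper, not code from this PR. Keeping the tail of the path rather than the head is an assumption made here because the file name is usually the most useful part.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds "<path><suffix>" in at most max_len bytes by
// shortening only the path and marking the cut with "...".
std::string format_path_error(const std::string &path,
                              const std::string &suffix,
                              std::size_t max_len) {
  const std::string marker = "...";

  // Precondition: the fixed message text plus the marker must always fit.
  assert(suffix.size() + marker.size() <= max_len);

  // Common case: everything fits, so nothing is trimmed.
  if (path.size() + suffix.size() <= max_len) return path + suffix;

  // Otherwise keep as much of the path's tail as the budget allows.
  std::size_t keep = max_len - suffix.size() - marker.size();
  return marker + path.substr(path.size() - keep) + suffix;
}
```

Because the budget for the message text is reserved first, a long path can no longer cut off the explanation, unlike truncating the concatenated string with `std::min(error.size(), max_error)`.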
Recently we had a very small mistake (swapping the arguments of `write`) cause total workspace data loss. This is unacceptable for wake. This change makes `write` safer via the following measures:
1. `write` will no longer allow "." or "" as a write location.
2. `write` will only unlink files, and not do a deep unlink. This means that two different build options might conflict in wake, but I think that's bad style anyway.
3. `write` will not write outside of the root workspace. This will break some rare use cases, but they can be replaced with a bespoke job instead.
4. `write` will not overwrite a source file.

It would be nice to add an additional check that `write` does not overwrite anything previously hashed, but this is a bit tricky to do. Thanks to the fact that we don't deep unlink anything now, I think this should be fine for now, as data loss is quite limited and it would be unusual for the content to 1) be a valid file path and 2) point to something the user intended to keep.