-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up core WASI FS code #2247
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -784,7 +784,7 @@ impl WasiFs { | |
let file_type = metadata.file_type(); | ||
// we want to insert newly opened dirs and files, but not transient symlinks | ||
// TODO: explain why (think about this deeply when well rested) | ||
let mut should_insert = false; | ||
let should_insert; | ||
|
||
let kind = if file_type.is_dir() { | ||
should_insert = true; | ||
|
@@ -803,6 +803,7 @@ impl WasiFs { | |
fd: None, | ||
} | ||
} else if file_type.is_symlink() { | ||
should_insert = false; | ||
let link_value = file.read_link().ok().ok_or(__WASI_EIO)?; | ||
debug!("attempting to decompose path {:?}", link_value); | ||
|
||
|
@@ -957,18 +958,15 @@ impl WasiFs { | |
Ok(cur_inode) | ||
} | ||
|
||
/// Splits a path into the first preopened directory that is a parent of it, | ||
/// if such a preopened directory exists, and the rest of the path. | ||
/// | ||
/// NOTE: this behavior seems to be not the same as what libpreopen is | ||
/// doing in WASI. | ||
/// | ||
/// TODO: evaluate users of this function and explain why this behavior is | ||
/// not the same as libpreopen or update its behavior to be the same. | ||
/// Splits a path into the preopened directory that has the largest prefix of | ||
/// the given path, if such a preopened directory exists, and the rest of the path. | ||
fn path_into_pre_open_and_relative_path( | ||
&self, | ||
path: &Path, | ||
) -> Result<(__wasi_fd_t, PathBuf), __wasi_errno_t> { | ||
let mut max_prefix_len = 0; | ||
let mut max_stripped_path = None; | ||
let mut max_po_fd = None; | ||
// for each preopened directory | ||
for po_fd in &self.preopen_fds { | ||
let po_inode = self.fd_map[po_fd].inode; | ||
|
@@ -978,46 +976,23 @@ impl WasiFs { | |
_ => unreachable!("Preopened FD that's not a directory or the root"), | ||
}; | ||
// stem path based on it | ||
if let Ok(rest) = path.strip_prefix(po_path) { | ||
// if any path meets this criteria | ||
// (verify that all remaining components are not symlinks except for maybe last? (or do the more complex logic of resolving intermediary symlinks)) | ||
// return preopened dir and the rest of the path | ||
|
||
return Ok((*po_fd, rest.to_owned())); | ||
} | ||
} | ||
Err(__WASI_EINVAL) // this may not make sense | ||
} | ||
|
||
// if this is still dead code and the year is 2020 or later, please delete this function | ||
#[allow(dead_code)] | ||
pub(crate) fn path_relative_to_fd( | ||
&self, | ||
fd: __wasi_fd_t, | ||
inode: Inode, | ||
) -> Result<PathBuf, __wasi_errno_t> { | ||
let mut stack = vec![]; | ||
let base_fd = self.get_fd(fd)?; | ||
let base_inode = base_fd.inode; | ||
let mut cur_inode = inode; | ||
|
||
while cur_inode != base_inode { | ||
stack.push(self.inodes[cur_inode].name.clone()); | ||
match &self.inodes[cur_inode].kind { | ||
Kind::Dir { parent, .. } => { | ||
if let Some(p) = parent { | ||
cur_inode = *p; | ||
} | ||
if let Ok(stripped_path) = path.strip_prefix(po_path) { | ||
// find the max | ||
let new_prefix_len = po_path.as_os_str().len(); | ||
// we use >= to favor later preopens because we iterate in order | ||
// whereas WASI libc iterates in reverse to get this behavior. | ||
if new_prefix_len >= max_prefix_len { | ||
max_prefix_len = new_prefix_len; | ||
max_stripped_path = Some(stripped_path); | ||
max_po_fd = Some(*po_fd); | ||
} | ||
_ => return Err(__WASI_EINVAL), | ||
} | ||
} | ||
|
||
let mut out = PathBuf::new(); | ||
for p in stack.iter().rev() { | ||
out.push(p); | ||
if max_prefix_len == 0 { | ||
Err(__WASI_EINVAL) // this may not make sense depending on where it's called | ||
} else { | ||
Ok((max_po_fd.unwrap(), max_stripped_path.unwrap().to_owned())) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (I am very new, but I figure I should introduce myself by bein useful with a code review!) I wonder if it is possible to structure this so the unwraps are not required. It seems like this function pivots once a candidate is found, would it be possible to fold the three parameters into a single Result value, and have the last step here be over the existing result methods instead of using unwrap? The unwrap was a red flag for me, and I almost posted about the empty-list case, however these unwraps are safe because the max_prefix_len check and that if it's not null, these two are populated. Recognizing that requires me to mull over the state machine to figure out that the three are correlated, where as I feel the code could directly tell me that. Either way, still works the same, just a clarity for me! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, this is a good observation! I agree, but it's a bit tricky here. Sorry by the way -- I missed these comments before -- reviewing them now, I'll make a new PR with any updates I make. Thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In theory, we can solve this problem with a typed state machine, I'm just not sure it's really worth it given the size of this function. If it were larger, I'd definitely be in favor though. The thing is it feels like I'll at least double the amount of code to do it that way... Like enum BaseFdAndRelPath {
None,
BestMatch {
fd: __wasi_fd_t,
rel_path: PathBuf,
max_seen: usize,
}
}
Impl BaseFdAndRelPath {
fn max_seen(&self) -> usize {
match self {
Self::None => 0,
Self::BestMatch { max_seen, .. } => max_seen,
}
}
} Actually, that's not too bad with the method... I'll try it out There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something like https://doc.rust-lang.org/std/cmp/fn.max_by.html could also work here, unfortunately it's not stable. That would lead to simplified code too. I'm pretty happy with the result I have, I do think the code is more clear now! Thanks! The above Edit: but really for this code if we wanted to do it properly we'd probably just use a proper prefix-search data structure like a trie of the preopen FD paths to upgrade the complexity from M * N (where M is the length of the user given path and N is the number of preopens) to just M. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Pushed it up to #2325
Nice to meet you and thanks for the suggestions! We have a community slack as well where you can ping us if we ever miss something like this. If you're interested in hacking on some Wasm / Wasmer stuff, feel free to reach out! |
||
} | ||
Ok(out) | ||
} | ||
|
||
/// finds the number of directories between the fd and the inode if they're connected | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wording of this comment confused me for a bit, this is just my perspective and additional data points should be gathered if this actually matters.
Either way it's a massive nitpick, feel free to ignore this.
With this though, my mind interperted the second sentence fragment "if such a preopened directory exists" to be the final part of the sentence, and then the "and the rest of the path" was confusing and didn't correctly link in my mind that it was talking about the second tuple value.
Instead, can I request that the topics of the "path of the preopened directory" and the "rest of the path" be located near eachother in the comment, with the error case being near the end of the line? That way it mirrors the structure of the result value, so reading the return value and comment is a similar read.