Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up core WASI FS code #2247

Merged
merged 2 commits into from
May 19, 2021
Merged

Clean up core WASI FS code #2247

merged 2 commits into from
May 19, 2021

Conversation

MarkMcCaskey
Copy link
Contributor

@MarkMcCaskey MarkMcCaskey commented Apr 21, 2021

Started looking into #2243 and noticed a few things that could be cleaned up and fixed.

The first fix just makes the code more explicit and removes a mutable variable. This is good because it also makes it more obvious that the final branch can never exit, it always returns.

The second fix aligns our path stripping behavior to that of WASI libc (we take the max, not the first, and favor later preopens over newer ones).

Review

  • Add a short description of the the change to the CHANGELOG.md file

if max_prefix_len == 0 {
Err(__WASI_EINVAL) // this may not make sense depending on where it's called
} else {
Ok((max_po_fd.unwrap(), max_stripped_path.unwrap().to_owned()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am very new, but I figure I should introduce myself by bein useful with a code review!)

I wonder if it is possible to structure this so the unwraps are not required. It seems like this function pivots once a candidate is found, would it be possible to fold the three parameters into a single Result value, and have the last step here be over the existing result methods instead of using unwrap?

The unwrap was a red flag for me, and I almost posted about the empty-list case, however these unwraps are safe because the max_prefix_len check and that if it's not null, these two are populated. Recognizing that requires me to mull over the state machine to figure out that the three are correlated, where as I feel the code could directly tell me that.

Either way, still works the same, just a clarity for me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a good observation! I agree, but it's a bit tricky here.

Sorry by the way -- I missed these comments before -- reviewing them now, I'll make a new PR with any updates I make. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, we can solve this problem with a typed state machine, I'm just not sure it's really worth it given the size of this function. If it were larger, I'd definitely be in favor though. The thing is it feels like I'll at least double the amount of code to do it that way...

Like

enum BaseFdAndRelPath {
  None,
  BestMatch  {
    fd: __wasi_fd_t,
    rel_path: PathBuf,
    max_seen: usize,
  }
}

Impl  BaseFdAndRelPath {
  fn max_seen(&self) -> usize {
     match self {
       Self::None => 0,
       Self::BestMatch { max_seen, .. } => max_seen,
     }
  }
}

Actually, that's not too bad with the method... I'll try it out

Copy link
Contributor Author

@MarkMcCaskey MarkMcCaskey May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like https://doc.rust-lang.org/std/cmp/fn.max_by.html could also work here, unfortunately it's not stable. That would lead to simplified code too. I'm pretty happy with the result I have, I do think the code is more clear now! Thanks!

The above Enum that I posted could definitely be made generic and reused with generic data for finding the most X of a collection of things, could probably even move control flow into a method on this type.

Edit: but really for this code if we wanted to do it properly we'd probably just use a proper prefix-search data structure like a trie of the preopen FD paths to upgrade the complexity from M * N (where M is the length of the user given path and N is the number of preopens) to just M.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed it up to #2325

(I am very new, but I figure I should introduce myself by bein useful with a code review!)

Nice to meet you and thanks for the suggestions! We have a community slack as well where you can ping us if we ever miss something like this. If you're interested in hacking on some Wasm / Wasmer stuff, feel free to reach out!

/// TODO: evaluate users of this function and explain why this behavior is
/// not the same as libpreopen or update its behavior to be the same.
/// Splits a path into the preopened directory that has the largest prefix of
/// the given path, if such a preopened directory exists, and the rest of the path.
Copy link
Contributor

@Earthmark Earthmark Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording of this comment confused me for a bit, this is just my perspective and additional data points should be gathered if this actually matters.

Either way it's a massive nitpick, feel free to ignore this.

With this though, my mind interperted the second sentence fragment "if such a preopened directory exists" to be the final part of the sentence, and then the "and the rest of the path" was confusing and didn't correctly link in my mind that it was talking about the second tuple value.

Instead, can I request that the topics of the "path of the preopened directory" and the "rest of the path" be located near eachother in the comment, with the error case being near the end of the line? That way it mirrors the structure of the result value, so reading the return value and comment is a similar read.

@syrusakbary
Copy link
Member

bors r+

@bors
Copy link
Contributor

bors bot commented May 19, 2021

@bors bors bot merged commit 7e22191 into master May 19, 2021
@bors bors bot deleted the feature/clean-up-core-wasi-fs-code branch May 19, 2021 03:32
bors bot added a commit that referenced this pull request May 20, 2021
2325: Clean up some internal WASI code r=Hywan a=MarkMcCaskey

Shout out to @Earthmark on GitHub for the suggestions

Comments left on #2247, I completely forgot about the PR and didn't see any of the comments until this morning when I saw that the PR had been shipped.

-----

This PR addresses some community feedback by rewording a doc comment and updating the internals of a function using 3 mutable variables to find the best fit to only use 1 mutable variable. This change increases readability of the code by making it obvious that all 3 variables are conveying the same bit of information: that we found a match.

Co-authored-by: Mark McCaskey <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants