Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to load the Python WEBC because of WasiEnvBuilder preopening behaviour #3686

Open
Michael-F-Bryan opened this issue Mar 16, 2023 · 0 comments
Labels
bug Something isn't working 📦 lib-vfs About wasmer-vfs 📦 lib-wasi About wasmer-wasi priority-medium Medium priority issue
Milestone

Comments

@Michael-F-Bryan
Copy link
Contributor

Michael-F-Bryan commented Mar 16, 2023

Describe the bug

There seems to be an inconsistency between the way wasmer_wasi::WasiEnvBuilder's preopened directories work and the paths actually used by the WebcFileSystem (it might also happen for other FileSystems - I haven't checked).

The issue is:

  • The WebcFileSystem gives you absolute paths when traversing directory trees with FileSystem::read_dir() and DirEntry's path field
  • The WasiEnvBuilder seems to expect you to preopen directories from a WebcFileSystem using paths that don't have the leading slash
  • You often don't have access to the concrete filesystem type when configuring a WasiEnvBuilder, especially when filesystem composition comes into play (e.g. mounting and overlays)

Steps to reproduce

For consistency, everything is done using the python-0.1.0.wasmer binary and the latest master (1fe1e51).

First, we start off with some boilerplate to run our "experiment".

Boilerplate
use std::{
    path::{Path, PathBuf},
    sync::Arc,
};

use virtual_fs::{webc_fs::WebcFileSystem, FileSystem};
use wasmer::{Module, Store};
use wasmer_wasix::{WasiEnvBuilder, WasiRuntimeError};
use webc::v1::{ParseOptions, WebCOwned};

fn experiment<F>(get_preopen_dirs: F) -> Result<(), WasiRuntimeError>
where
    F: FnOnce(&WebcFileSystem<WebCOwned>) -> Vec<PathBuf>,
{
    let store = Store::default();
    let webc: &[u8] = include_bytes!("../../c-api/examples/assets/python-0.1.0.wasmer");
    let options = ParseOptions::default();
    let webc = Arc::new(WebCOwned::parse(webc.to_vec(), &options).unwrap());

    let package_name = webc.get_package_name();
    let wasm = webc.get_atom(&package_name, "python").unwrap();

    let module = Module::new(&store, wasm).unwrap();

    let fs = WebcFileSystem::init_all(Arc::clone(&webc));
    let preopen_dirs = get_preopen_dirs(&fs);
    println!("{:#?}", &preopen_dirs[..5]);

    let mut builder = WasiEnvBuilder::default().fs(Box::new(fs));

    for dir in preopen_dirs {
        builder.add_preopen_dir(dir).unwrap();
    }

    builder.run(module)
}

We can use the WebcFileSystem::top_level_dirs() function to get the list of top-level directories it thinks we should pre-open.

fn main() -> Result<(), anyhow::Error> {
    experiment(|fs| fs.top_level_dirs().iter().map(PathBuf::from).collect())?;
    Ok(())
}

Running the code will show the Python prompt as expected. You can also see it print paths without the leading /.

$ cargo run --example webc_fs --package wasmer-wasix
[
    "lib",
    "lib/Parser",
    "lib/python3.6",
    "lib/python3.6/asyncio",
    "lib/python3.6/collections",
]
Python 3.6.7 (default, Feb 14 2020, 03:17:48)
[Wasm WASI vClang 9.0.0 (https://github.com/llvm/llvm-project 0399d5a9682b3cef7 on generic
Type "help", "copyright", "credits" or "license" for more information.
>>>

Now, let's switch to the more generic method based on FileSystem::read_dir().

fn main() -> Result<(), anyhow::Error> {
    experiment(|fs| {
        let mut preopen_dirs = Vec::new();
        filesystem_directories(fs, "/".as_ref(), &mut preopen_dirs);
        preopen_dirs
    })?;

    Ok(())
}

fn filesystem_directories(fs: &dyn FileSystem, path: &Path, directories: &mut Vec<PathBuf>) {
    for entry in fs.read_dir(path).unwrap().flatten() {
        let file_type = entry.file_type().unwrap();

        if file_type.is_dir() {
            filesystem_directories(fs, &entry.path, directories);
            directories.push(entry.path);
        }
    }
}

You'll see that we now print absolute paths and Python will crash because it can't find its standard library.

$ cargo run --example webc_fs --package wasmer-wasix
[
    "/lib/Parser",
    "/lib/python3.6/asyncio",
    "/lib/python3.6/collections",
    "/lib/python3.6/concurrent/futures",
    "/lib/python3.6/concurrent",
]
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00000000 (most recent call first):
Error: Runtime error

Caused by:
    0: RuntimeError: unreachable
           at <unnamed> (<module>[4661]:0x224edf)
           at <unnamed> (<module>[2033]:0xce8da)
           at <unnamed> (<module>[2030]:0xcde7a)
           at <unnamed> (<module>[2028]:0xcd9d0)
           at <unnamed> (<module>[2037]:0xcef38)
           at <unnamed> (<module>[43]:0x4743)
           at <unnamed> (<module>[42]:0x3fec)
           at <unnamed> (<module>[4709]:0x22985e)
           at <unnamed> (<module>[30]:0x3e58)
    1: unreachable

To make things worse, if you use a wasmer_vfs::mem_fs::FileSystem that has directories from a wasmer_vfs::host_fs::FileSystem mounted, using absolute paths to preopen those directories will give the instance access to those directories as expected.

Mounting a WebcFileSystem onto a wasmer_vfs::mem_fs::FileSystem seems to be broken here. When you use relative paths from top_level_dirs() you'll get a wasi filesystem creation error: Could not get metadata for file "lib": fd not a directory error. However, turning them into absolute paths (e.g. by doing Path::new("/").join(path) for each path) won't do anything and you'll get the now-familiar fatal Python error 🙃

Expected behavior

I'm not really sure.

I think I would expect WasiEnvBuilder::add_preopen_dir() to require absolute paths when the underlying filesystem does operations with absolute paths (e.g. WebcFileSystem), and allow relative paths (e.g. ./blah) when the underlying filesystem operations can work with relative paths (e.g. wasmer_vfs::host_fs::FileSystem).

Actual behavior

See above.

Additional context

This was one of the many rabbit-holes I went down as part of #3650 while trying to troubleshoot complex filesystem configurations.

This bug interacts poorly with #3685 because the FileSystem::read_dir() method won't even give you valid paths to preopen.

@Michael-F-Bryan Michael-F-Bryan added bug Something isn't working 📦 lib-wasi About wasmer-wasi 📦 lib-vfs About wasmer-vfs labels Mar 16, 2023
@Michael-F-Bryan Michael-F-Bryan added this to the v3.2 milestone Mar 21, 2023
@Michael-F-Bryan Michael-F-Bryan added the priority-medium Medium priority issue label Mar 21, 2023
@Michael-F-Bryan Michael-F-Bryan self-assigned this Mar 21, 2023
@ptitSeb ptitSeb modified the milestones: v3.2, v3.3 Apr 13, 2023
@ptitSeb ptitSeb modified the milestones: v3.3, v4.0 Apr 20, 2023
@ptitSeb ptitSeb modified the milestones: v4.0, v4.x May 3, 2023
@syrusakbary syrusakbary modified the milestones: v4.x, v4.4-pending Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 📦 lib-vfs About wasmer-vfs 📦 lib-wasi About wasmer-wasi priority-medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

3 participants