
Document usage with MinIO #159


Closed
machichima opened this issue Jan 18, 2025 · 10 comments · Fixed by #241

@machichima
Contributor

I’ve been trying to read a file from a local MinIO instance using obstore, but I encountered a GenericError when specifying a timeout for my S3Store. If I remove the timeout configuration, the file is read successfully. Additionally, this issue does not occur when reading from AWS S3 (with or without the timeout setting).

Below is the code snippet I used:

import obstore
from obstore.store import S3Store

store = S3Store.from_env(
    "my-s3-bucket",
    config={
        "aws_endpoint": "http://localhost:30002",
        "access_key_id": "minio",
        "secret_access_key": "miniostorage",
        "aws_allow_http": "true",
        "aws_virtual_hosted_style_request": "false",
    },
    client_options={"timeout": "200s"},
)

res = obstore.get(store, "test/test.txt")

with open("./test_obstore.txt", "wb") as f:
    for i, chunk in enumerate(res.stream()):  
        print(chunk)
        print(f"chunk {i}")
        f.write(chunk)

When the timeout is set, I get this error:

pyo3_object_store.GenericError: Generic {
    store: "S3",
    source: Reqwest {
        retries: 0,
        max_retries: 10,
        elapsed: 4.262µs,
        retry_timeout: 180s,
        source: reqwest::Error {
            kind: Builder,
            url: "http://localhost:30002/my-s3-bucket/test/test.txt",
            source: BadScheme,
        },
    },
}

Does anyone know why this issue happens? Any insights or guidance would be greatly appreciated!

Thank you!

@kylebarron
Member

Hmm. I have never used MinIO, but this looks like it may be a bug in the underlying Rust object_store crate. It might be best to ask about this in the Arrow Rust Discord channel (https://discord.gg/Qw5gKqHxUM) to see if anyone else has encountered a similar error in their Rust code.

@machichima
Contributor Author

Thanks for the reply! I will go and ask in the channel. Appreciate your help!

@machichima
Contributor Author

Hi @kylebarron,

I tried to reproduce this in Rust object_store, adding the timeout via with_client_options() (see code here). It uploads to and downloads from MinIO successfully, so I think the issue might be in obstore.

P.S. Sorry for the messy code; I have no prior experience with Rust.

@kylebarron
Member

I tried to reproduce this in Rust object_store, adding the timeout via with_client_options() (see code here). It uploads to and downloads from MinIO successfully, so I think the issue might be in obstore.

Wow awesome! Thanks for diving into Rust to reproduce this!

It's certainly possible there's a bug in obstore. Perhaps it's in how we're passing in the client options? Can you try using ClientOptions::with_config? You can see how we convert Python options into the Rust ClientOptions here:

let s = ob.extract::<PyBackedStr>()?.to_lowercase();
let key = ClientConfigKey::from_str(&s).map_err(PyObjectStoreError::ObjectStoreError)?;
Ok(Self(key))

let py_input = ob.extract::<HashMap<PyClientConfigKey, PyConfigValue>>()?;
let mut options = ClientOptions::new();
for (key, value) in py_input.into_iter() {
    options = options.with_config(key.0, value.0);
}
Ok(Self(options))

@ion-elgreco

ion-elgreco commented Jan 23, 2025

I guess aws_allow_http never gets passed into the client options because client_options is already set explicitly. The following might work:

store = S3Store.from_env(
    "my-s3-bucket",
    config={
        "aws_endpoint": "http://localhost:30002",
        "access_key_id": "minio",
        "secret_access_key": "miniostorage",
        "aws_virtual_hosted_style_request": "false",  
    },
    client_options = {"timeout": "200s", "allow_http": "true"},
)

@machichima
Contributor Author

Setting client_options = {"timeout": "200s", "allow_http": "true"} works!! Thank you so much for the help!
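For anyone landing here later, a minimal sketch of the fix above: when the endpoint is plain http://, put allow_http in client_options next to the timeout rather than in config. The helper name minio_store_options is hypothetical (it is not part of obstore); the keys and values are the ones used in this thread.

```python
from urllib.parse import urlparse


def minio_store_options(endpoint, access_key_id, secret_access_key, timeout="200s"):
    """Build (config, client_options) dicts for a MinIO endpoint.

    Hypothetical helper for illustration only; not part of obstore.
    """
    scheme = urlparse(endpoint).scheme
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported endpoint scheme: {scheme!r}")

    config = {
        "aws_endpoint": endpoint,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        # Path-style requests, as in the snippets in this thread.
        "aws_virtual_hosted_style_request": "false",
    }

    client_options = {"timeout": timeout}
    if scheme == "http":
        # Per this thread: once client_options is given explicitly,
        # allow_http has to be set there for plain-HTTP endpoints.
        client_options["allow_http"] = "true"

    return config, client_options


config, client_options = minio_store_options(
    "http://localhost:30002", "minio", "miniostorage"
)
print(client_options)  # {'timeout': '200s', 'allow_http': 'true'}
```

The two dicts can then be passed as S3Store.from_env("my-s3-bucket", config=config, client_options=client_options), matching the call that worked above.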

@kylebarron
Member

Ah thanks for finding that @ion-elgreco!

With the improved typing in the latest release (0.3), your original code doesn't type check:

Image

However this does:

Image

@kylebarron
Member

Also note that as of 0.3 you can pass the config options directly:

Image
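Since the screenshot is not preserved here, the following is only a rough sketch of what passing config options directly might look like, not a copy of the original image. It assumes that S3Store.from_env in obstore 0.3 accepts the config keys as keyword arguments; check the signature in your installed version. The helper names minio_config_kwargs and open_minio_store are hypothetical.

```python
def minio_config_kwargs(endpoint, access_key_id, secret_access_key):
    """Hypothetical helper: the config keys from this thread as keyword arguments."""
    return {
        "aws_endpoint": endpoint,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "aws_virtual_hosted_style_request": "false",
    }


def open_minio_store(bucket, **config_kwargs):
    """Hypothetical wrapper around S3Store.from_env for a plain-HTTP MinIO."""
    # Imported lazily so the helper above can be used without obstore installed.
    from obstore.store import S3Store

    # ASSUMPTION: from_env accepts config keys as keyword arguments (obstore >= 0.3).
    return S3Store.from_env(
        bucket,
        # allow_http stays in client_options, as established earlier in the thread.
        client_options={"timeout": "200s", "allow_http": "true"},
        **config_kwargs,
    )


kwargs = minio_config_kwargs("http://localhost:30002", "minio", "miniostorage")
# Requires obstore and a running MinIO instance:
# store = open_minio_store("my-s3-bucket", **kwargs)
```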

@kylebarron
Member

It might be helpful to add this to the docs as an example of how to use obstore with MinIO.

@machichima
Contributor Author

Sure! I can help with that when I have time

@kylebarron added this to the 0.4.0 milestone Jan 28, 2025
@kylebarron changed the title from "GenericError When Reading from Local MinIO" to "Document usage with MinIO" Jan 28, 2025