Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding S3 client #93

Open
kubo6472 opened this issue Jan 16, 2024 · 6 comments
Open

Adding S3 client #93

kubo6472 opened this issue Jan 16, 2024 · 6 comments
Labels
enhancement New feature or request low-priority

Comments

@kubo6472
Copy link
Contributor

No idea what the rewrite mentioned here was about, but I wanted to ask if there are any plans at implementing an option to easily specify S3, B2, R2... bucket to store all the media in. Ideally with configurable options to replicate both locally and remotely, or to only save the files remotely.

On an unrelated note, specifically for Backblaze B2, there seems to be a pricing waive of eggress fees when using cloudflare workers to access the data, e.g. with something like this. Maybe this will result in an easier implementation, if it just loaded URLs for the media. However I doubt it has any security and not everyone uses Cloudflare.

You can take advantage of Cloudflare's performance and the free data transfer between Backblaze B2 and Cloudflare: ref.

@girlbossceo
Copy link
Owner

No idea what the rewrite mentioned here was about,

As usual this never happened and I could not find any kind of progress anywhere, which was why I merged in the SHA256 media key stuff (the supposed rewrite was mentioned there too). It definitely could use improvements though.

but I wanted to ask if there are any plans at implementing an option to easily specify S3, B2, R2... bucket to store all the media in. Ideally with configurable options to replicate both locally and remotely, or to only save the files remotely.

I don't use things like S3 but I'm willing to accept an implementation of it with whatever features are necessary. Perhaps that MR with some changes/improvements.

I would be willing to accept that MR you linked if:

  • It had a form of in-memory / tmpfs cache, or even a local storage cache if anything. See the MR discussion about the delay of fetching files and bandwidth concerns. Someone provided a general idea of how it should be implemented, but as is I think a cache is very important here.
  • Considered looking into MSC3860 (MSC3860: Media Download Redirects matrix-org/matrix-spec-proposals#3860) which supports media redirection, tailored to S3-related setups (see Beeper's matrix-media-repo implementation which doesn't seem complex), using presigned URLs
  • And general second look on code-quality

It would be nice to split up uploads, media, and thumbnails as well instead of the current situation where everything is thrown into one media folder, but I won't make that a blocker. It might be relevant to have separate buckets for uploads and downloads/thumbnails though.

If the author of that is still around (seems like it was last rebased 5 months ago), feel free to ping them about my fork, get them to see these concerns I listed, help get them implemented, test it by someone who uses S3, and I'll merge it.

@girlbossceo girlbossceo added the enhancement New feature or request label Jan 16, 2024
@kubo6472
Copy link
Contributor Author

Thanks for the very structured response. I've looked at the (now archived) beeper's media repo, which is a fork of this and that has quite a big disclaimer for small servers specifically for S3 (and compatible) to use this instead.

I was hoping that it could be integrated directly with conduit as the config options, however I myself have zero knowledge of Rust and even less with how existing code's written.

That being said I really hope the original author will re-appear and possibly help contribute this.

@girlbossceo
Copy link
Owner

You are right it mostly appears to be just a bunch of config options. I can consider merging this into my fork, but I will only feel comfortable doing it if I add a clear and explicit warning that this (currently) has no cache and someone malicious can likely incur high-costs to you just by spamming media requests, and performance of media retrieval may be worse when using S3 due to high round trip times. And I will keep this issue open because the existing implementation requires these improvements, and likely can be improved with just supporting media redirection from MSC3860.

@girlbossceo
Copy link
Owner

Will merge https://gitlab.com/famedly/conduit/-/merge_requests/384 in with the warnings I want, and do a check on the code, but this issue will stay open with regards to what I said. Someone malicious can incur high costs to you by spamming requests (could be mitigated with #98) and performance may or may not be significantly worse if roundtrip times are high with no cache.

@zoujiaqing
Copy link

Use minio instead of aws s3.

@AndSDev
Copy link
Contributor

AndSDev commented Apr 22, 2024

Someone malicious can incur high costs to you by spamming requests

The cache will not save you from an attacker (will help, but won't save).
Matrix does not have protection against spam requests (except of rate-limits for requests).
So a warning is needed anyway.

P.S. This MR was designed for use with minio. Ideally, conduit should be a stateless service.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request low-priority
Projects
None yet
Development

No branches or pull requests

4 participants