Initial file system #1043
💵 To receive payouts, sign up on Algora, link your Github account and connect with Stripe. |
@MartinKavik Congratulations on submitting your solution for our Bounty-to-Hire challenge (#1004)! We have reviewed your pull request and believe that it is very promising. While not yet satisfying all of the requirements of #1004, we believe there is a short and well-defined path to get there, and so we are delighted to announce that you are a finalist in our Bounty-to-Hire event! Next steps:
If you no longer wish to work on this pull request, please close it, and we will reach out about the interview. On the other hand, if you wish to continue pushing this pull request forward, then please let us know here and continue your work. Congratulations again on being a finalist for Bounty-to-Hire! |
@jdegoes Nice! Let's go! |
Update: API endpoint Ideal worker file flow in the system:
The flow leverages native file system APIs as much as possible to lower the complexity caused by managing extra permission lists, file manifests or database records. The flow is almost fully implemented, but the blockers are symlink permissions and Wasmtime/Wasi. I wasn't able to make symlinks work inside the worker because of a "not permitted" error, even though they were relative, not crossing sandbox borders and pointing to a preopened folder. I would need to investigate it more. In the meantime, the code relies on preloaded paths, which is not ideal for various reasons. Notes:
Project for manual testing: https://github.com/MartinKavik/golem_async |
Are you talking about working on the |
We have reached November 4th, which is the day we must announce the winner of the Bounty-to-Hire program! Given the scope of the challenge, it's incredibly impressive that we have had not one, but THREE finalists, who each demonstrated high competence and skill in tackling a problem that would stump most engineers on the planet. However, ultimately, there can be only ONE WINNER to the Bounty-to-Hire challenge. So after MUCH analysis by Golem engineers, thoughtful consideration, and spirited discussion, we have decided to pick a winner. That winner will be announced on our LIVE STREAM event TODAY at 12 NOON ET: https://www.youtube.com/watch?v=at8EPqLWIRE Without question, all three finalists are highly qualified Rust engineers and we look forward to interviewing and getting to know each of them, with the intention of making at least one job offer to fill our recently opened position. While you await the announcement of the WINNER, please shoot an email to [email protected] to schedule your interview! Stay tuned for more. |
@MartinKavik If you plan to join for the Live stream today, please do send me a message on Discord (you can find me on the Golem Discord), or an email at [email protected]. Thank you! |
@MartinKavik Thanks for your work on this feature! We remain excited to interview you, should you be interested in doing this sort of work full time as part of the Golem Cloud engineering team. Since announcing that Maxim Schuwalow has won this bounty, we are now closing this pull request, and look forward to posting many more bounties on smaller-sized issues in the near future. |
/claim #1004
Closes #1004, Fixes #843
Initial file system
Specification: #1004 (comment)
Currently implemented flow
NOTE: You can run the described flow by yourself with this testing example: https://github.com/MartinKavik/golem_async
`golem.yaml` files are collected from the project and validated. Then component `files` are downloaded and packed into a zip archive and uploaded to Golem together with the component data from Golem CLI.
`files` are extracted to a new directory inside the worker executor `data` folder. Each durable worker has one such directory representing its root directory. The code running inside the worker has access to all files in the directory and all files created by the worker are stored there as well.
Files compression & transport
Files are compressed with the default / standard method `Deflated` and packed into a Zip archive. Permissions `read-only` or `read-write` are linked to every file as Unix permissions. If a file is larger than 4GB, then Zip readers need ZIP64 support to open the archive. The attribute `modification time` is automatically set to the current time without a time zone. Comments are not used. Related code:
I've chosen the `Deflated` method and the Zip archive format because they are the most widely used ones, so we can read or even create such file archives on different platforms, and the related tools and libraries should be reliable enough. Zip doesn't support compression across files, but that is a tradeoff for potential parallel packing and simple partial extraction, which could be useful for more efficient worker file updates.
If needed, we can optimize packing by choosing a better method like `Zstd` because it should be both faster and produce smaller files. Or we can pack files into a compressed Tar archive or into a second inner Zip archive without any compression to achieve compression across files. It would depend on user files - for example, compression across files would be suitable for many text files but not so much for already compressed images and videos.
I've decided to include permissions directly in the archive to make the archive work as a self-contained unit. This way:
`comment` attribute or custom ZIP fields.
`read-only` files will have the native read-only flag set as well.
I tried to use the crate async_zip but the created archives were unreadable (as surprisingly described in their docs). Then I tried to use the sync zip crate directly together with Tokio's `spawn_blocking`, but the code became unreliable when an early return was triggered by an error. So I decided to "isolate" sync Zip calls with channels in `golem-cli/src/async_zip_writer.rs`.
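To illustrate the channel-based isolation idea in general terms, here is a minimal sketch that uses only the Rust standard library. It is not the code from `golem-cli/src/async_zip_writer.rs`: the type `IsolatedWriter` and its methods are hypothetical names, a plain `std::io::Write` sink stands in for the sync Zip writer, and a regular thread stands in for the async runtime. The point it shows is that the blocking writer lives on its own thread and stops cleanly when the sending side is dropped, for example on an early return.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Hypothetical handle to a blocking writer that runs on its own thread.
/// Callers only send byte chunks through a channel.
struct IsolatedWriter<W: Write + Send + 'static> {
    tx: mpsc::Sender<Vec<u8>>,
    handle: thread::JoinHandle<io::Result<W>>,
}

impl<W: Write + Send + 'static> IsolatedWriter<W> {
    /// Moves the sink to a dedicated thread that writes every received chunk.
    fn spawn(mut sink: W) -> Self {
        let (tx, rx) = mpsc::channel::<Vec<u8>>();
        let handle = thread::spawn(move || -> io::Result<W> {
            // The loop ends as soon as every sender has been dropped,
            // so an early return on the caller side cannot leave it hanging.
            for chunk in rx {
                sink.write_all(&chunk)?;
            }
            sink.flush()?;
            Ok(sink)
        });
        Self { tx, handle }
    }

    /// Queues one chunk; fails if the writer thread has already stopped.
    fn write(&self, bytes: &[u8]) -> io::Result<()> {
        self.tx
            .send(bytes.to_vec())
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "writer thread stopped"))
    }

    /// Closes the channel, waits for the thread and returns the sink.
    fn finish(self) -> io::Result<W> {
        let IsolatedWriter { tx, handle } = self;
        drop(tx);
        handle
            .join()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "writer thread panicked"))?
    }
}
```

A caller would create the writer with `IsolatedWriter::spawn(sink)`, call `write` for each chunk and call `finish` to get the sink back together with any I/O error that occurred on the writer thread.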
OpenAPI path parameters
OpenAPI currently (v3) doesn't support optional path parameters. See e.g. swagger.io/docs/specification/v3_0/describing-parameters/:
It means that urls like:
aren't officially possible to make even though the generator allows it. It should be somehow possible in OpenAPI v4, see OAI/OpenAPI-Specification#93 (comment).
I've started to implement the `.../workers/{worker_name}/files/<path>` endpoint with the url `/workers/{worker_name}/files?path=/my_file.txt` as the OAS-compatible alternative. But it has some benefits - you don't need to urlencode it if you are confident that the path doesn't contain chars `=` and `&` (and maybe others?). And in the future we can add array support when a user wants to define more paths to download more files at once. But I don't know the "business reason" for this API / how real users want to download the files so it depends.
Nested preopened dirs / files, Copy files to workers
Consider this `Golem.yaml` snippet:
`read-write` and `read-only` files in one directory. Should it be possible? If I'm not mistaken, Wasi / Wasmtime doesn't support it, you can only preopen directories.
`read-write` files and folders (to prepare for a big component files update; to remove all user data if the worker represents a user that wants to be forgotten; etc.). You cannot just call a function `remove_dir_all` because it would probably fail if there is a nested `read-only` dir entry. As a worker developer I would probably just try to get permissions from dir entry metadata while walking from the root dir and pray that the standard file permissions match the ones set by the Wasm engine. Then also host file permissions and guest/standard file permissions may become a bit confusing.
`read-write` files should be copied to the new worker. What if I don't want it as a worker developer? Let's say I have a component `User`. It represents all user data and works as a server session as well (see e.g. turso.tech/multi-tenancy or plane.dev/concepts/session-backends). I want to start with an almost empty worker state and then lazily attach/copy initial files and databases when the user/customer signs up to my app. I don't want to do that for every web visitor or bot. One idea would be to introduce special directories in the worker root like `/static`, `/public`, `/shared` and `/private`, where current `read-only` files and dirs will be inside `/static`; current `read-write` files inside `/private`. `/public` could be a special folder accessible from the outside world. `/shared` could be for passing/sharing big files among workers. Current `permissions` in the manifest would define classic file permissions. Then we can make `/static` / `read-only` files public cheaply just by creating a symlink inside `/public`.
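For the cleanup case raised above (removing `read-write` files while `read-only` entries stay in place, where a plain `remove_dir_all` is expected to fail), a minimal sketch with the Rust standard library could look like this. The function name `remove_writable_files` is hypothetical. The sketch assumes that the permissions reported by file metadata match what the Wasm engine enforces, which is exactly the open question here, and it also assumes that nothing below a read-only directory should be touched.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: recursively removes writable files under `dir`,
/// keeps read-only files and skips read-only directories entirely.
/// Returns the number of removed files.
fn remove_writable_files(dir: &Path) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // `symlink_metadata` does not follow symlinks, so a link is treated
        // as a plain entry and its target is never traversed.
        let meta = fs::symlink_metadata(&path)?;
        if meta.is_dir() {
            if meta.permissions().readonly() {
                // Assumption: nothing below a read-only directory is removable.
                continue;
            }
            removed += remove_writable_files(&path)?;
            // Drop the directory itself only when nothing was left inside.
            if fs::read_dir(&path)?.next().is_none() {
                fs::remove_dir(&path)?;
            }
        } else if !meta.permissions().readonly() {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

Calling `remove_writable_files` on the worker root would leave every read-only file and every directory that still contains one, so the remaining tree can be inspected or updated afterwards.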
Windows support
I've tried to work on Golem on Windows first - I wasn't really successful but I've made it at least compilable by these two lines:
Then the `cargo make run` script would need to be rewritten to make it cross-platform. Redis doesn't support Windows, but I can imagine Redis + Postgres + Nginx will be replaced, or an alternative will be introduced in the future, to make the entire architecture and development / deployment simpler. I haven't found a reasonable replacement for lnav yet. I've tried to compile it with MSYS2 but I've found out there are too many incompatibilities to make it work on Windows in a reasonable time.
Next steps
I've opened some topics to discuss in that wall of text above and the PR is not mergeable yet. I can add comments directly to PR code to make the review easier or continue implementation if you want to.
I would finish that `..{component_id}/workers/{worker_name}/files/` endpoint and then update/fix other parts of the code according to feedback. Also, I think this PR is already pretty big, so I would recommend splitting it and creating another PR for the `API Definition API` and potential extra work.