[Storage] Fix Azure mount install cmd to not reinstall fuse3 when fusermount-shim is used #7818
/smoke-test -k test_docker_storage_mounts --kubernetes
It's failing with:
I think it could be specific to our test infra's k8s setup, because the test runs fine when run against a real k8s cluster.
4c7e982 to c643ac6
This reverts commit 1568d7f.
/smoke-test -k storage --aws
Looks good, glad we are also adding logs, which will help a lot if there are future issues! Thanks @kevinmingtarja!
    MOUNT_EXIT_CODE=$?
    set -e
    if [ $MOUNT_EXIT_CODE -ne 0 ]; then
        echo "Mount failed with exit code $MOUNT_EXIT_CODE."
        if [ "$MOUNT_BINARY" = "goofys" ]; then
            echo "Looking for goofys log files..."
            # Find goofys log files in /tmp (created by mktemp -t goofys.XXXX.log)
            # Note: if /dev/log exists, goofys logs to syslog instead of a file
            GOOFYS_LOGS=$(ls -t /tmp/goofys.*.log 2>/dev/null | head -1)
            if [ -n "$GOOFYS_LOGS" ]; then
                echo "=== Goofys log file contents ==="
                cat "$GOOFYS_LOGS"
                echo "=== End of goofys log file ==="
            else
                echo "No goofys log file found in /tmp"
            fi
        fi
        # TODO(kevin): Print logs from rclone, etc too for observability.
        exit $MOUNT_EXIT_CODE
    fi
Note: before this, our logs did not give any useful debugging info; they only mentioned the exit code.
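The snippet above starts right after the mount command has run, so the `set +e` half of the pattern is not visible. As a minimal sketch of how an exit code can be captured while `set -e` is otherwise in effect: `run_mount` and `capture_exit_code` are hypothetical names used only for illustration, and `run_mount` is a stand-in that always fails rather than the real mount command.

```shell
set -e

# Hypothetical stand-in for the real mount command; always fails with code 3.
run_mount() {
  return 3
}

# Run a command with errexit temporarily disabled and store its exit code
# in MOUNT_EXIT_CODE, so the caller can print diagnostics before exiting.
capture_exit_code() {
  set +e
  "$@"
  MOUNT_EXIT_CODE=$?
  set -e
}

capture_exit_code run_mount
echo "Mount exited with code $MOUNT_EXIT_CODE."
```

Without the `set +e`, the failing command would terminate the script immediately and the log-dumping branch would never run.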
Failing on master: https://buildkite.com/skypilot-1/smoke-tests/builds/5076/steps/canvas?sid=019a3b52-536d-4eeb-affb-8b0f0ec808df
Related to #7531
In `get_az_mount_install_cmd`, we install `fuse3`, `libfuse3-3`, and `libfuse3-dev`. This is a problem when running in k8s, because we symlink fusermount and fusermount3 to our fusermount-shim (see https://github.com/skypilot-org/skypilot/tree/master/addons/fuse-proxy for details):

skypilot/sky/templates/kubernetes-ray.yml.j2, lines 814 to 828 in c2ddb3e

Re-installing `fuse3` will overwrite this symlink. To illustrate what happens before and after `get_az_mount_install_cmd`:

This will lead to an error the next time we try to call fusermount:

Because our skypilot pods by design do not have root privileges, they have to rely on the shim to talk to the fusermount-server (which is the one that has the privileges).
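One way to picture the guard described above is the sketch below. It is not the code in this PR: the function names, the default path `/usr/bin/fusermount3`, and the idea that `libfuse3-3` and `libfuse3-dev` are still installed when the shim is present are all assumptions made for illustration. It only prints a package list and does not call the package manager.

```shell
# Return 0 if the given path is a symlink that resolves to a file named
# "fusermount-shim" (the name is taken from the PR description).
is_fusermount_shim() {
  local path="$1"
  [ -L "$path" ] || return 1
  local target
  target="$(readlink -f "$path")" || return 1
  [ "$(basename "$target")" = "fusermount-shim" ]
}

# Print the packages to install. Assumption: when the shim is in place,
# only fuse3 is skipped, because re-installing it overwrites the symlink.
fuse_packages_to_install() {
  local fusermount_path="${1:-/usr/bin/fusermount3}"  # assumed default path
  if is_fusermount_shim "$fusermount_path"; then
    echo "libfuse3-3 libfuse3-dev"
  else
    echo "fuse3 libfuse3-3 libfuse3-dev"
  fi
}
```

For example, `fuse_packages_to_install /usr/bin/fusermount3` could feed an install command, so that a pod with the shim in place keeps its symlink intact.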
Tested (run the relevant ones):
- `bash format.sh`
- `/smoke-test` (CI) or `pytest tests/test_smoke.py` (local)
- `/smoke-test -k test_name` (CI) or `pytest tests/test_smoke.py::test_name` (local)
- `/quicktest-core` (CI) or `pytest tests/smoke_tests/test_backward_compat.py` (local)