
@PSU3D0 PSU3D0 commented Nov 3, 2025

I found that SkyPilot can fail with timeout errors on slower connections (Starlink, high-contention networks). This PR adds a global environment variable that preserves the existing defaults, so users don't need to resort to per-cloud ssh_proxy_command: 'none' workarounds.

  • Allow overriding the raw SSH socket probe timeout via SKYPILOT_SSH_SOCKET_CONNECT_TIMEOUT, keeping the 1 s default, and surface misconfigurations with clear errors.
  • Document the new environment variable alongside the existing provision.ssh_timeout option.
  • Add focused unit tests covering default, override, and invalid values for the new timeout path.
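
For reviewers who want the behaviour at a glance, here is a minimal sketch of the override logic described above. It is not the code in this diff: the helper names `get_socket_connect_timeout` and `probe_ssh_port` are hypothetical, and so are the error wording and the choice to reject non-finite values. Only the environment variable name and the 1 s default come from this PR.

```python
import math
import os
import socket

ENV_VAR = 'SKYPILOT_SSH_SOCKET_CONNECT_TIMEOUT'
DEFAULT_TIMEOUT_SECONDS = 1.0  # default stated in this PR


def get_socket_connect_timeout() -> float:
    """Hypothetical helper: return the override if set, else the default."""
    raw = os.environ.get(ENV_VAR)
    if raw is None:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        timeout = float(raw)
    except ValueError:
        raise ValueError(
            f'{ENV_VAR} must be a positive number of seconds, '
            f'got {raw!r}.') from None
    # Assumption: zero, negative, NaN and infinite values are all rejected.
    if not math.isfinite(timeout) or timeout <= 0:
        raise ValueError(
            f'{ENV_VAR} must be a positive number of seconds, '
            f'got {raw!r}.')
    return timeout


def probe_ssh_port(host: str, port: int = 22) -> bool:
    """Hypothetical probe: True if a raw TCP connection succeeds in time."""
    try:
        with socket.create_connection(
                (host, port), timeout=get_socket_connect_timeout()):
            return True
    except OSError:  # covers timeouts and refused connections
        return False
```

On a slow link a user would export something like `SKYPILOT_SSH_SOCKET_CONNECT_TIMEOUT=5` before launching. Leaving the variable unset keeps today's behaviour.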

Tested (run the relevant ones):

  • Code formatting: install pre-commit (auto-check on commit) or bash format.sh
  • Any manual or new tests for this PR (please specify below)
    • PYTHONPATH=. uv run --no-project pytest tests/unit_tests/test_provisioner.py
  • All smoke tests: /smoke-test (CI) or pytest tests/test_smoke.py (local)
  • Relevant individual tests: /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
  • Backward compatibility: /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)
