[ray_client]: Add more retry logic #13478
python/ray/util/client/worker.py
Do we need three different retry loops for the different stages? Why not have just a single one that tests this last condition?
ericl left a comment
Comment on simplifying
Because it could fail at any one of these stages. Now, one big retry loop would look like a finite state machine of [...]. But we can't "test this last condition" alone. If this is what you'd prefer, I'll go do that.
Why can't we test the last condition alone? C implies {A, B} are fine, right? I think a loop that just tests C would be sufficient. If it's for better error messages, we could also check {A, B} prior to C in the loop.
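To make the suggestion concrete, here is a minimal sketch of a single retry loop that only tests the last condition (C). It is not the code from this PR: `connect_with_retry`, `check_ready`, and `make_flaky_check` are hypothetical names, and the sketch assumes a failed check raises `ConnectionError`. Since C can only pass once A and B are fine, one loop covers every stage.

```python
import time
from typing import Callable, Optional


def connect_with_retry(
    check_ready: Callable[[], None],
    retries: int = 3,
    delay_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry one end-to-end readiness check (hypothetical helper).

    `check_ready` exercises the final condition (C) and raises
    ConnectionError on failure. Returns the number of attempts used.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, retries + 1):
        try:
            check_ready()
            return attempt
        except ConnectionError as e:
            last_error = e
            # Only wait if another attempt is coming.
            if attempt < retries:
                sleep(delay_s)
    raise ConnectionError(f"not ready after {retries} attempts") from last_error


def make_flaky_check(failures: int) -> Callable[[], None]:
    """Build a stand-in check that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def check() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("condition C not ready")

    return check


# A check that fails twice succeeds on the third attempt.
attempts = connect_with_retry(make_flaky_check(2), retries=3)
```

For better error messages, `check_ready` could test A and B before C and raise a stage-specific message, while the loop itself stays a single loop.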
Btw, it would be good to have a test here
Change-Id: I4e96d1c4dd7252b754482892597280504a4ba63c
Change-Id: I54c890e7e5d6452bcb58312d54c746e5c8e556bc
Change-Id: I23dffe65feb4c778985be724f3ab213a61eb2da8
Change-Id: I000bb467e61ca5f2f4e0345d07ca01acf143956f
Change-Id: I746c4dda014b8b5b7752c07299f9b42951fe6cb8
44c0b0e to 01b29fa
Change-Id: I9102433a212846af77ac7a23322e35b48d8464b8
Change-Id: I443e911ca6e8e6c5fcc91d2ec5a4990d81215c1e
Sure. Removed state machine and added test.
Change-Id: I3669ea2734de9e7aad8522a898d3248aa560892a
Change-Id: Iae12027d2340e81832305a60a297f85deb8f2919
Tests / lint still failing
Change-Id: Ia3e8df94ea689b38d87f8bbafb5a1142edd7b3e1
I'm starting to run out of ideas, but we'll see if this goes
I think I figured it out
Change-Id: Ie691ed9ac082639c583eec4f6af7578b0841c744
This time for sure -- I got the reproduction locally and tracked it down
This reverts commit bc386dd.
Related issue number
Closes #13446
Checks
scripts/format.sh to lint the changes in this PR.