Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix get_result(wait=True) #1949

Draft
wants to merge 1 commit into
base: develop
Choose a base branch
from
Draft

Fix get_result(wait=True) #1949

wants to merge 1 commit into from

Conversation

cjao
Copy link
Contributor

@cjao cjao commented May 21, 2024

get_result(wait=True) will still error out after some time. Thi PR proposes a fix using a combination of API and SDK improvements.

The previous GET /dispatches/{dispatch_id} endpoint was trying to do too much. Its responsibilities are now separated into two endpoints:

  • GET /dispatches: bulk query dispatch summaries (including status) with options to filter by dispatch_id, sort chronologically, and also limit the output to status only.

  • GET /dispatches/{dispatch_id}: download manifest

To achieve the desired behavior of get_result(id, wait=True), the client

  1. Polls the dispatch status by querying the first endpoint.

  2. Downloads the manifest after the dispatch has reached a final status.

The server no longer returns 503 errors when the dispatch is not yet "ready". A 503 status code is not appropriate here because it is intended to convey temporary service unavailablity resulting from server overload or rate limiting. However, the fact that the workflow is still running does not indicate any fault of the server.

These changes will allow get_result(dispatch_id, wait=True) to wait as long as required instead of erroring out after some time.

Fixes #1539

  • I have added the tests to cover my changes.
  • I have updated the documentation and CHANGELOG accordingly.
  • I have read the CONTRIBUTING document.

@cjao cjao force-pushed the improve-result-api branch 2 times, most recently from 0a3b19c to 5f4adfb Compare May 21, 2024 21:48
@cjao cjao marked this pull request as ready for review May 21, 2024 21:52
@cjao cjao requested a review from a team as a code owner May 21, 2024 21:52
@cjao cjao changed the title API: redesign implementation of get_result(wait=True) Redesign implementation of get_result(wait=True) May 21, 2024
@cjao cjao force-pushed the improve-result-api branch 2 times, most recently from 40b509d to f98d1eb Compare May 29, 2024 18:42
@cjao cjao changed the title Redesign implementation of get_result(wait=True) Fix get_result(wait=True) May 29, 2024
@cjao cjao marked this pull request as draft May 29, 2024 18:49
@cjao cjao force-pushed the improve-result-api branch 2 times, most recently from 9138a30 to dd00785 Compare June 14, 2024 23:01
@santoshkumarradha
Copy link
Member

santoshkumarradha commented Jun 14, 2024

~~@cjao can you see if you make this true in a way for other waits as well like env and deployment ? ~~

Please ignore

The previous `GET /dispatches/{dispatch_id}` endpoint was trying to do
too much. Its responsibilities are now separated into two endpoints:

* `GET /dispatches`: bulk query dispatch summaries (including status)
with options to filter by `dispatch_id`, sort chronologically, and
also limit the output to status only.

* `GET /dispatches/{dispatch_id}`: download manifest

To achieve the desired behavior of `get_result(id, wait=True)`, the
client

1. Polls the dispatch status by querying the first endpoint.

2. Downloads the manifest after the dispatch has reached a final
status.

The server no longer returns 503 errors when the dispatch is not yet
"ready". A 503 status code is not entirely accurate here because it is
intended to convey temporary service unavailablity resulting from
server overload or rate limiting. However, the fact that the workflow
is still running does not indicate any fault of the server.

These changes will allow `get_result(dispatch_id, wait=True)` to wait
as long as required instead of erroring out after some time.

Supporting improvements:

DAL: Add sorting and pagination to Controller

DAL: improve bulk get when retrieving only some columns

Directly select the specified columns instead of retrieving the whole
ORM entities and deferring column loading using load_only
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inevitable recursion error from inside get_result
2 participants