Skip to content

Conversation

@electrum
Copy link
Member

This works around broken clients that don't fetch the final link.

Release notes

(x) This is not user-visible and no release notes are required.

@cla-bot cla-bot bot added the cla-signed label Sep 14, 2022
@martint
Copy link
Member

martint commented Sep 14, 2022

What does "when client has consumed results" mean, concretely (i.e., in terms of protocol interactions with the server). If I recall correctly, following the last link was the way for the client to signal to the server that it had received and seen the results successfully.

@hashhar
Copy link
Member

hashhar commented Sep 14, 2022

I'd argue the clients are buggy and had not been following the protocol as defined - the recent change #13055 only exposed the fact that the clients are buggy.

@electrum
Copy link
Member Author

@martint It means that all the results have been fetched from the exchange client and the client has (potentially) seen the final page of data. Meaning that we returned a response with the final page.

@hashhar Yes, they definitely are buggy, no argument there. That’s why the description says “This works around broken clients”.

Ideally, all clients would be immediately fixed and users would instantly update to the fixed versions. In practice, this is a serious regression on the server for anyone using such clients, as it results in queries hanging around (for five minutes by default) and tying up the max concurrency slots.

Related, we should probably decrease the default client query timeout, since five minutes seems way too long, and no client retires that long anyway.

@findepi
Copy link
Member

findepi commented Sep 14, 2022

(x) This is not user-visible and no release notes are required.

Ideally, all clients would be immediately fixed and users would instantly update to the fixed versions. In practice, this is a serious regression on the server for anyone using such clients

if this change is important for users, let's be sure to mention this in release notes.

@arhimondr
Copy link
Contributor

The fix looks good, though there are many test failures. I wonder if in some cases the query is getting closed prematurely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that might break the logic. I wonder if this boolean has to be set here or is it sufficient to call queryManager.resultsConsumed to let the query "finish"?

@arhimondr arhimondr self-requested a review September 14, 2022 15:54
@martint
Copy link
Member

martint commented Sep 14, 2022

It means that all the results have been fetched from the exchange client and the client has (potentially) seen the final page of data. Meaning that we returned a response with the final page.

One possible problem to consider is that this will also potentially means the query is marked as finished before the client thinks it's finished -- e.g., if there are any delays in a properly working client to when fetching the final link, or if the client is having problems fetching the final result set and is in the process of retrying.

This works around broken clients that don't fetch the final link.
@findepi
Copy link
Member

findepi commented Sep 15, 2022

One possible problem to consider is that this will also potentially means the query is marked as finished before the client thinks it's finished

is it a problem?

server is executing the query, so it's done when it's done
client is observing the results only

@martint
Copy link
Member

martint commented Sep 15, 2022

is it a problem?

Some possible problems:

  • Incorrect reporting of query execution time
  • Early pruning of query state / results in clusters with large query volume (based on query.max-history)
  • Possibly other future weird behavior if the coordinator gets stricter about removing certain state for queries that have already finished.

@electrum electrum merged commit 4ad7177 into trinodb:master Sep 15, 2022
@electrum electrum deleted the queryfinish branch September 15, 2022 18:42
@github-actions github-actions bot added this to the 396 milestone Sep 15, 2022
@electrum
Copy link
Member Author

Incorrect reporting of query execution time

I'm not sure it's incorrect. At that point, the client has requested the final page of data. In extremely rare cases, the client might not receive the response and need to retry, but the query has already finished executing on the server.

@martint
Copy link
Member

martint commented Sep 15, 2022

This works around broken clients that don't fetch the final link.

I still don't understand what problem this is solving. How are those clients deciding they don't want to follow the final link? What if they stop two steps from the final link, or three, or N? How do they know there's no more data to be returned?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants