
TSO Request Parallelizing #54960

Closed
2 tasks done
MyonKeminta opened this issue Jul 26, 2024 · 2 comments · Fixed by #56292
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@MyonKeminta
Contributor

MyonKeminta commented Jul 26, 2024

Enhancement

We have found that in some OLTP workloads where the QPS is high and the queries are simple, the TSO Wait duration often becomes a significant portion of the total query duration. In TiDB, TSO loading is already performed concurrently with other work such as compiling. When the queries are simple, it is hard to further optimize by overlapping TSO loading with more phases of SQL execution. However, we found a practical way to optimize it from the TSO client side.

Currently, a TSO client object has a goroutine that collects GetTS (and GetTSAsync) calls (tsoRequests) into a batch, sends the batch to PD, waits for the response, and dispatches the results to those tsoRequests, all serially. As a result, each GetTS call may need to spend up to 1x the TSO RPC time just waiting to be collected into the next batch.
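To make the serial flow concrete, here is a minimal sketch of such a batch loop. All names (`tsoRequest`, `tsoBatcher`, `runOneBatch`) are illustrative, not the actual pd client code, and the "RPC" is simulated by a local counter standing in for PD's allocator:

```go
package main

import "fmt"

// tsoRequest represents one pending GetTS call (illustrative, not the
// real pd client type).
type tsoRequest struct {
	done chan int64 // receives the allocated timestamp
}

// tsoBatcher holds the state of the hypothetical serial batching loop.
type tsoBatcher struct {
	reqCh chan *tsoRequest
	next  int64 // stands in for PD's TSO allocator
}

// runOneBatch collects everything currently queued, performs one
// "RPC" (here just a counter bump), and dispatches the results.
// Nothing else can be batched until this whole round trip finishes.
func (b *tsoBatcher) runOneBatch() int {
	var batch []*tsoRequest
	for {
		select {
		case r := <-b.reqCh:
			batch = append(batch, r)
			continue
		default:
		}
		break
	}
	first := b.next // in reality: send the RPC and block for the response
	b.next += int64(len(batch))
	for i, r := range batch {
		r.done <- first + int64(i)
	}
	return len(batch)
}

func main() {
	b := &tsoBatcher{reqCh: make(chan *tsoRequest, 16), next: 1}
	r1 := &tsoRequest{done: make(chan int64, 1)}
	r2 := &tsoRequest{done: make(chan int64, 1)}
	b.reqCh <- r1
	b.reqCh <- r2
	b.runOneBatch()
	fmt.Println(<-r1.done, <-r2.done) // 1 2
}
```

A GetTS call that arrives just after `runOneBatch` drains the queue has to sit out the entire round trip before the next batch even starts forming, which is where the up-to-1x-RPC wait comes from.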

Consider the case where PD's TSO allocator is not the bottleneck and can handle more TSO requests (so that the majority of the TSO RPC's time cost is on the network). In that case, it is possible to start collecting the next batch and send it before receiving the response of the previous batch, so that each GetTS call waits less time to be batched and gets a shorter total duration.
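The parallel dispatch described above could be sketched as follows, assuming PD can absorb the extra RPCs. Up to `maxInflight` batches may be awaiting responses at once; the names, the semaphore-based limit, and the simulated RPC are all illustrative, not the actual implementation:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// maxInflight bounds how many batches may be in flight concurrently;
// the real option and its default would need tuning against PD.
const maxInflight = 4

var tsoCounter int64 // stands in for PD's TSO allocator

// sendBatchRPC simulates one TSO RPC that allocates n consecutive
// timestamps and returns the first one.
func sendBatchRPC(n int) int64 {
	return atomic.AddInt64(&tsoCounter, int64(n)) - int64(n) + 1
}

// dispatchParallel sends each batch without waiting for the previous
// batch's response; a buffered channel acts as a semaphore so at most
// maxInflight RPCs are outstanding.
func dispatchParallel(batches [][]chan int64) {
	sem := make(chan struct{}, maxInflight)
	var wg sync.WaitGroup
	for _, batch := range batches {
		sem <- struct{}{} // acquire an in-flight slot
		wg.Add(1)
		go func(batch []chan int64) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			first := sendBatchRPC(len(batch))
			for i, done := range batch {
				done <- first + int64(i)
			}
		}(batch)
	}
	wg.Wait()
}

func main() {
	mkBatch := func(n int) []chan int64 {
		b := make([]chan int64, n)
		for i := range b {
			b[i] = make(chan int64, 1)
		}
		return b
	}
	batches := [][]chan int64{mkBatch(2), mkBatch(3)}
	dispatchParallel(batches)
	total := 0
	for _, b := range batches {
		total += len(b)
	}
	fmt.Println("served", total, "requests") // served 5 requests
}
```

Because multiple batches are in flight at once, a newly arrived GetTS call only waits for the current batch to fill (or a short collection window), not for the previous round trip to complete.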

So this is an approach that reduces the duration of GetTS & GetTSAsync - Wait at the expense of higher TSO RPC OPS and higher pressure on PD. It is not suitable to be enabled by default, but we can provide it as an option for when the TSO Wait duration becomes a problem.

@MyonKeminta MyonKeminta added the type/enhancement The issue or PR belongs to an enhancement. label Jul 26, 2024
@ngaut
Member

ngaut commented Jul 26, 2024

Another idea (no parallelizing needed, easy to implement) is to predict (it doesn't need to be precise) and preallocate more timestamps. For example, if the current batch count is 100, we may fetch more timestamps based on the current TS consumption rate; say it's 200 requests/sec, then we can predict the next batch as: 100 + 200 + (some buffer or scaling factor).
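The prediction in this suggestion might look like the following sketch. The function name, the one-second horizon, and the 20% slack factor are all arbitrary illustrative choices, not a proposed design:

```go
package main

import "fmt"

// predictNextBatch estimates how many timestamps to preallocate:
// the currently queued requests, plus the arrivals expected during one
// RPC round trip, plus some slack. Entirely hypothetical.
func predictNextBatch(currentBatch int, ratePerSec, rpcSeconds float64) int {
	expected := int(ratePerSec * rpcSeconds)
	buffer := (currentBatch + expected) / 5 // 20% slack, an arbitrary choice
	return currentBatch + expected + buffer
}

func main() {
	// 100 queued requests, 200 req/sec, 1s horizon (the numbers from the comment)
	fmt.Println(predictNextBatch(100, 200, 1)) // 360
}
```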

@MyonKeminta
Contributor Author

> Another idea (no parallelizing needed, easy to implement) is to predict (it doesn't need to be precise) and preallocate more timestamps. For example, if the current batch count is 100, we may fetch more timestamps based on the current TS consumption rate; say it's 200 requests/sec, then we can predict the next batch as: 100 + 200 + (some buffer or scaling factor).

Thanks for commenting. It should be pointed out that in our centralized TSO architecture, it is crucial that every timestamp-fetching operation passes through a single allocator to guarantee monotonicity, i.e., if GetTS operation A finishes before GetTS operation B starts, then A must get an earlier timestamp than B. We would lose that monotonicity by preallocating.
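A toy illustration of the violation, assuming (hypothetically) that two clients each preallocate a disjoint range from the allocator:

```go
package main

import "fmt"

// prealloc models a client that was handed a private timestamp range
// [next, limit) by the allocator. Entirely hypothetical.
type prealloc struct{ next, limit int64 }

func (p *prealloc) getTS() int64 {
	ts := p.next
	p.next++
	return ts
}

func main() {
	// The allocator hands range [100,200) to client A, then [200,300) to B.
	a := &prealloc{next: 100, limit: 200}
	b := &prealloc{next: 200, limit: 300}

	tsB := b.getTS() // an operation on B finishes first, with ts=200
	tsA := a.getTS() // an operation on A starts afterwards but gets ts=100

	// The later operation observed the smaller timestamp: monotonicity is broken.
	fmt.Println(tsB, tsA, tsA > tsB) // 200 100 false
}
```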
