[YSQL] Implement Async Flush for COPY command. #11628
Labels: area/ysql (Yugabyte SQL (YSQL))

Comments
nathanhjli added a commit that referenced this issue on Mar 10, 2022
Summary: Currently, as part of any statement, YSQL does some processing and buffers writes. The write buffer is flushed once either of the following conditions is hit:

(1) the write buffer is full (i.e., it hits the ysql_session_max_batch_size limit)
(2) a read op is required

On a flush, YSQL sends the writes to the required tablet servers in separate RPCs (all issued in parallel). The YSQL backend makes further progress only once responses to all RPCs are received. This waiting behaviour hurts the performance of bulk loading with COPY FROM, because YSQL spends a lot of time waiting for responses. Ideally, that wait time would be spent reading further tuples from the input source and performing the necessary processing.

This diff adds some asynchrony to the flush so that YSQL's COPY FROM can read more tuples after sending a set of RPCs to the tablet servers, without waiting for the responses. This is done by storing the flush future and not waiting for its result immediately. Only when YSQL refills its write buffer does it wait for the earlier flush's result, just before performing the next flush call.

Note that the right choice of ysql_session_max_batch_size is needed to mask almost all of the wait time. The optimal batch size is one for which both of the following tasks (which run simultaneously after this diff) take about the same time:

(1) YSQL fetching and buffering ysql_session_max_batch_size rows
(2) sending the RPCs for the previous ysql_session_max_batch_size rows and receiving the responses from the tservers

Note also that there might not be any value of ysql_session_max_batch_size for which both tasks complete at roughly the same time, due to the inherently different speeds of disk reads and of the tablet servers.

Test Plan: Tested manually, locally and on portal clusters. Experiments show a 20-25% increase in speed when using async flush versus regular flushing.

Reviewers: kannan, smishra, pjain

Reviewed By: pjain

Subscribers: mtakahara, zyu, lnguyen, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15757
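For readers skimming the thread, here is a minimal sketch of the deferred-wait pattern the summary describes, written as standalone C++ rather than the actual pggate code. The class and method names (`BufferedWriter`, `SendRpcs`, and so on) are illustrative assumptions; only the idea of keeping the flush future and waiting on it just before the next flush comes from the commit message.

```cpp
#include <cstddef>
#include <future>
#include <optional>
#include <string>
#include <vector>

// Sketch of "async flush": instead of blocking on every flush, remember the
// in-flight flush's future and wait for it only right before issuing the next
// flush. All names here are illustrative, not the real YSQL/pggate API.
class BufferedWriter {
 public:
  explicit BufferedWriter(std::size_t max_batch_size)
      : max_batch_size_(max_batch_size) {}

  void Write(std::string row) {
    buffer_.push_back(std::move(row));
    if (buffer_.size() >= max_batch_size_) {
      Flush();
    }
  }

  void Flush() {
    // Before sending the next batch, wait for the previous batch's RPCs to
    // finish. This is the only point where the writer blocks.
    if (pending_flush_) {
      pending_flush_->get();
      pending_flush_.reset();
    }
    // Send the current batch asynchronously and keep the future instead of
    // waiting on it, so the caller can go back to reading more input rows.
    pending_flush_ = std::async(std::launch::async, SendRpcs, std::move(buffer_));
    buffer_.clear();
  }

  void Finish() {
    if (!buffer_.empty()) Flush();
    if (pending_flush_) pending_flush_->get();  // drain the last in-flight batch
  }

 private:
  // Stand-in for issuing the parallel write RPCs to the tablet servers and
  // waiting for all of their responses.
  static void SendRpcs(std::vector<std::string> /*batch*/) { /* ... */ }

  std::size_t max_batch_size_;
  std::vector<std::string> buffer_;
  std::optional<std::future<void>> pending_flush_;
};
```

In this shape, reading and buffering the next batch of rows overlaps with the RPCs for the previous batch, which is exactly the overlap the batch-size discussion above is trying to balance.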
nathanhjli added a commit that referenced this issue on Mar 15, 2022
Summary: This reverts commit 1a3a344. Reverting the current implementation of the async flush changes so that we can refactor, fix potential bugs, and improve implementation details.

Test Plan: Jenkins: urgent. Built and ran a COPY locally to verify it worked.

Reviewers: pjain

Reviewed By: pjain

Subscribers: dmitry, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15975
nathanhjli added a commit to nathanhjli/yugabyte-db that referenced this issue on Mar 16, 2022
…r pg_session, indexes supported

Summary: Still working on flakiness (Aborted: backfill connection to DB failed), but getting this diff out for initial reviews and opinions. One thing to note is that the initial pipeline where we passed async flush seems to be less flaky, since we can control exactly when we want to use async flush.

Test Plan: Built locally and tested by creating indexes and performing COPY FROM. Also added a Java test: ./yb_build.sh --java-test org.yb.pgsql.TestAsyncFlush

Reviewers: pjain, dmitry

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D16005
d-uspenskiy added a commit that referenced this issue on Mar 25, 2022
Summary: To simplify the code of the `PgSession` class and to ease further improvements to the write-operation buffering subsystem (issue #11628), the code related to buffering is moved into a separate class, `PgOperationBuffer`. The current functionality of the buffering subsystem is preserved (in general).

Test Plan: Jenkins

Reviewers: nli, pjain

Reviewed By: pjain

Differential Revision: https://phabricator.dev.yugabyte.com/D16083
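A rough sketch of the separation this refactor describes, with buffering state pulled out of the session into its own class. Only the names `PgSession` and `PgOperationBuffer` come from the commit message; the methods, helper types, and the batch-size constant below are assumptions for illustration only.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Illustrative stand-ins; not the real YugabyteDB types.
struct PgsqlOp {};
struct Status { static Status OK() { return {}; } };

// Buffering concerns (accumulating write ops, deciding when to flush) live in
// their own class instead of being spread across the session code.
class PgOperationBuffer {
 public:
  Status Add(std::shared_ptr<PgsqlOp> op) {
    ops_.push_back(std::move(op));
    return ops_.size() >= kMaxBatchSize ? Flush() : Status::OK();
  }

  Status Flush() {
    // Send the buffered ops to the tablet servers (details omitted in this sketch).
    ops_.clear();
    return Status::OK();
  }

 private:
  // Placeholder; the real limit comes from the ysql_session_max_batch_size flag.
  static constexpr std::size_t kMaxBatchSize = 3072;
  std::vector<std::shared_ptr<PgsqlOp>> ops_;
};

// The session now delegates buffering instead of implementing it inline.
class PgSession {
 public:
  Status BufferOperation(std::shared_ptr<PgsqlOp> op) { return buffer_.Add(std::move(op)); }
  Status FlushBufferedOperations() { return buffer_.Flush(); }

 private:
  PgOperationBuffer buffer_;
};
```

Keeping the flush future (from the async flush work above) inside the dedicated buffer class rather than in the session is the kind of implementation detail this split is meant to make easier to change.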
nathanhjli added a commit that referenced this issue on Apr 11, 2022
Summary: Currently, as part of any statement, YSQL does some processing and buffers writes. The write buffer is flushed once either of the following conditions is hit:

(1) the write buffer is full (i.e., it hits the ysql_session_max_batch_size limit)
(2) a read op is required

On a flush, YSQL sends the writes to the required tablet servers in separate RPCs (all issued in parallel). The YSQL backend makes further progress only once responses to all RPCs are received. This waiting behaviour hurts the performance of bulk loading with COPY FROM, because YSQL spends a lot of time waiting for responses. Ideally, that wait time would be spent reading further tuples from the input source and performing the necessary processing.

This diff adds some asynchrony to the flush so that YSQL's COPY FROM can read more tuples after sending a set of RPCs to the tablet servers, without waiting for the responses. This is done by storing the flush future and not waiting for its result immediately. Only when YSQL refills its write buffer does it wait for the earlier flush's result, just before performing the next flush call.

Note that the right choice of ysql_session_max_batch_size is needed to mask almost all of the wait time. The optimal batch size is one for which both of the following tasks (which run simultaneously after this diff) take about the same time:

(1) YSQL fetching and buffering ysql_session_max_batch_size rows
(2) sending the RPCs for the previous ysql_session_max_batch_size rows and receiving the responses from the tservers

Note also that there might not be any value of ysql_session_max_batch_size for which both tasks complete at roughly the same time, due to the inherently different speeds of disk reads and of the tablet servers.

Test Plan: Built locally and tested by creating indexes and performing COPY FROM. Previous experiments on portal clusters show generally a 30% increase in speed when using async flush versus regular flushing. Also ran Jenkins tests, since this is a general enhancement that is used everywhere.

Reviewers: dmitry, pjain

Reviewed By: dmitry, pjain

Subscribers: jason, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D16005
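The batch-size discussion in the summary can be illustrated with a back-of-the-envelope timing model. This is not a measurement: the per-batch read and RPC times below are made-up inputs chosen to land near the reported ~30% figure, and the model simply contrasts paying for both stages per batch (synchronous flush) with paying only for the slower stage once the stages overlap (async flush).

```cpp
#include <algorithm>
#include <cstdio>

// Toy model of the overlap described above. All timings are illustrative
// inputs, not measurements from the issue.
int main() {
  const double read_ms_per_batch = 40.0;  // fetch and buffer one batch of rows
  const double rpc_ms_per_batch  = 12.0;  // write RPCs for that batch to complete
  const int    num_batches       = 100;

  // Synchronous flush: each batch pays for reading and then waiting on RPCs.
  const double sync_ms = num_batches * (read_ms_per_batch + rpc_ms_per_batch);

  // Async flush: reading batch N overlaps with the RPCs for batch N-1, so each
  // batch costs roughly the slower of the two stages.
  const double async_ms = num_batches * std::max(read_ms_per_batch, rpc_ms_per_batch);

  std::printf("sync: %.0f ms, async: %.0f ms, throughput gain: %.0f%%\n",
              sync_ms, async_ms, 100.0 * (sync_ms - async_ms) / async_ms);
  return 0;
}
```

In this simplified model the relative gain equals the ratio of the faster stage to the slower one, so it is largest when the two stages are balanced; as the summary notes, the stages may never be perfectly balanced in practice, which is consistent with gains in the 20-30% range rather than the theoretical 2x.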
Description
Currently, we synchronously wait for a flush response every time we flush. We want to make this asynchronous to reduce the time spent waiting and improve the performance of COPY.