[YSQL] Implement Async Flush for COPY command. #11628

nathanhjli · 2022-03-02T16:22:42Z

Description

Currently, we synchronously wait for a flush response every time we flush. We want to make this asynchronous to reduce the time spent waiting and improve the performance of COPY.

Summary: Currently, as part of any statement, YSQL does some processing and buffers writes. The write buffer is flushed once either of the below conditions is hit -
(1) the write buffer is full (i.e., hits ysql_session_max_batch_size limit)
(2) a read op is required
On a flush, YSQL directs the writes to required tablet servers in different RPCs (all issued in parallel). Only once responses to all RPCs are received, the YSQL backend makes further progress. This waiting behaviour affects performance of bulk loading using COPY FROM because YSQL spends a lot of time waiting for responses. It would be ideal to use that wait time for reading further tuples from the input source and perform necessary processing.
In this diff, we are adding some asynchrony to the flush to allow YSQL's COPY FROM to read more tuples after sending a set of RPCs to tablet servers (without waiting for the responses). This is done by storing the flush future and not waiting for its result immediately. Only when YSQL refills its write buffer, it will wait for the earlier flush's result just before performing the next flush call.
Note that the right choice of ysql_session_max_batch_size is required to help us mask almost all of the wait time. The optimal batch size is one in which both of the following tasks (which will run simultaneously after this diff) take almost the same time -
(1) YSQL fetching and buffering ysql_session_max_batch_size rows
(2) Sending RPCs for the previous ysql_session_max_batch_size rows and arrival of responses from the tserver
Note also that there might not be any value of ysql_session_max_batch_size for which both tasks complete at roughly the same time. This could be due to the inherently different speeds of disk reading and tablet servers' performance.
Test Plan: Tested manually locally and on portal clusters. Experiments show that there is generally a 20-25% increase in speed when using async flush versus using regular flushing.
Reviewers: kannan, smishra, pjain
Reviewed By: pjain
Subscribers: mtakahara, zyu, lnguyen, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15757
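
The store-the-future-and-wait-later flow described in this summary can be sketched roughly as follows. This is a minimal illustration in Python, not the actual YSQL code: the class `AsyncFlushBuffer`, the callback `send_batch` (a stand-in for issuing the parallel RPCs and collecting their responses), and the helper `copy_from` are all hypothetical names, and `max_batch_size` only plays the role of ysql_session_max_batch_size.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


class AsyncFlushBuffer:
    """Buffers rows and overlaps each flush with filling the next batch."""

    def __init__(self, send_batch: Callable[[List[str]], int], max_batch_size: int):
        self._send_batch = send_batch          # hypothetical stand-in for the tablet server RPCs
        self._max_batch_size = max_batch_size  # plays the role of ysql_session_max_batch_size
        self._rows: List[str] = []
        self._pending: Optional[Future] = None  # the stored "flush future"
        self._executor = ThreadPoolExecutor(max_workers=1)
        self.rows_acked = 0

    def add(self, row: str) -> None:
        self._rows.append(row)
        if len(self._rows) >= self._max_batch_size:
            self._flush()

    def _wait_for_pending(self) -> None:
        if self._pending is not None:
            # result() blocks until the earlier flush finishes and
            # re-raises any error it hit.
            self.rows_acked += self._pending.result()
            self._pending = None

    def _flush(self) -> None:
        # Wait for the earlier flush just before starting the next one.
        self._wait_for_pending()
        if self._rows:
            batch, self._rows = self._rows, []
            self._pending = self._executor.submit(self._send_batch, batch)

    def finish(self) -> None:
        # Flush the partial last batch and wait for its result too.
        self._flush()
        self._wait_for_pending()
        self._executor.shutdown()


def copy_from(rows: Iterable[str], buffer: AsyncFlushBuffer) -> None:
    """Reads rows from the input source while the previous batch is in flight."""
    for row in rows:
        buffer.add(row)
    buffer.finish()
```

In this sketch at most one flush is in flight at a time, so the reader thread only blocks when the buffer refills before the earlier responses have arrived.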

Summary: This reverts commit 1a3a344. Reverting the current implementation of the async flush changes so that we can refactor, fix potential bugs, and improve implementation details.
Test Plan: Jenkins: urgent
Built and ran a COPY locally to verify it worked.
Reviewers: pjain
Reviewed By: pjain
Subscribers: dmitry, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15975

…r pg_session, indexes supported
Summary: Still working on flakiness (Aborted: backfill connection to DB failed), but getting this diff out for initial reviews and opinions. One thing to note is that the initial pipeline, where we passed async flush, seems to be less flaky since we can control exactly when we want to use async flush.
Test Plan: Built locally and tested by creating indexes and performing COPY FROM. Also added a Java test: ./yb_build.sh --java-test org.yb.pgsql.TestAsyncFlush
Reviewers: pjain, dmitry
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D16005

Summary: To simplify the code of the `PgSession` class and to simplify further improvements to the write operation buffering subsystem (issue #11628), code related to buffering is moved into a separate class, `PgOperationBuffer`. The current functionality of the buffering subsystem is preserved (in general).
Test Plan: Jenkins
Reviewers: nli, pjain
Reviewed By: pjain
Differential Revision: https://phabricator.dev.yugabyte.com/D16083

Summary: Currently, as part of any statement, YSQL does some processing and buffers writes. The write buffer is flushed once either of the below conditions is hit -
(1) the write buffer is full (i.e., hits ysql_session_max_batch_size limit)
(2) a read op is required
On a flush, YSQL directs the writes to required tablet servers in different RPCs (all issued in parallel). Only once responses to all RPCs are received, the YSQL backend makes further progress. This waiting behaviour affects performance of bulk loading using COPY FROM because YSQL spends a lot of time waiting for responses. It would be ideal to use that wait time for reading further tuples from the input source and perform necessary processing.
In this diff, we are adding some asynchrony to the flush to allow YSQL's COPY FROM to read more tuples after sending a set of RPCs to tablet servers (without waiting for the responses). This is done by storing the flush future and not waiting for its result immediately. Only when YSQL refills its write buffer, it will wait for the earlier flush's result just before performing the next flush call.
Note that the right choice of ysql_session_max_batch_size is required to help us mask almost all of the wait time. The optimal batch size is one in which both of the following tasks (which will run simultaneously after this diff) take almost the same time -
(1) YSQL fetching and buffering ysql_session_max_batch_size rows
(2) Sending RPCs for the previous ysql_session_max_batch_size rows and arrival of responses from the tserver
Note also that there might not be any value of ysql_session_max_batch_size for which both tasks complete at roughly the same time. This could be due to the inherently different speeds of disk reading and tablet servers' performance.
Test Plan: Built locally and tested by creating indexes and performing COPY FROM. Previous experiments on portal clusters show that there is generally a 30% increase in speed when using async flush versus using regular flushing. Also Jenkins tests since this is a general enhancement that is used everywhere.
Reviewers: dmitry, pjain
Reviewed By: dmitry, pjain
Subscribers: jason, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D16005
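
The batch-size reasoning in this summary (wait time is best hidden when filling a batch and flushing the previous one take about the same time) can be illustrated with a back-of-the-envelope model. This is an idealized sketch, not taken from the diff: it assumes every batch takes a constant `fill_time` to read and buffer and a constant `flush_time` for the RPC round trip, and the function names are made up for illustration. It is not meant to reproduce the measured 20-30% improvement.

```python
def sync_copy_time(num_batches: int, fill_time: float, flush_time: float) -> float:
    """Synchronous flush: every batch is filled, then fully waited on."""
    return num_batches * (fill_time + flush_time)


def async_copy_time(num_batches: int, fill_time: float, flush_time: float) -> float:
    """Async flush with one flush in flight: filling batch k overlaps
    with flushing batch k-1, so each middle step costs the slower of the two."""
    if num_batches == 0:
        return 0.0
    overlapped_steps = (num_batches - 1) * max(fill_time, flush_time)
    # The first fill and the last flush have nothing to overlap with.
    return fill_time + overlapped_steps + flush_time


# Balanced case: fill and flush take the same time, so almost all waiting is hidden.
balanced_sync = sync_copy_time(10, 1.0, 1.0)
balanced_async = async_copy_time(10, 1.0, 1.0)

# Unbalanced case: the flush is twice as fast as the fill, so less is gained.
unbalanced_sync = sync_copy_time(10, 1.0, 0.5)
unbalanced_async = async_copy_time(10, 1.0, 0.5)
```

Under these assumptions the total time is bounded by whichever task is slower, which is why a batch size that balances the two tasks masks the most wait time, and why no such batch size may exist when disk reads and tablet servers run at inherently different speeds.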

nathanhjli added the area/ysql Yugabyte SQL (YSQL) label Mar 2, 2022

nathanhjli self-assigned this Mar 2, 2022

nathanhjli closed this as completed Mar 11, 2022

sushantrmishra reopened this Mar 15, 2022

ymahajan mentioned this issue Mar 15, 2022

[New Feature] Faster Bulk-Data Loading in YugabyteDB #11765

Open

pkj415 closed this as completed Apr 27, 2022

This was referenced May 30, 2022

[YSQL][Bulk load] Cancelled non-transactional Copy and shutdown: missing rows #12684

Open

[YSQL][LST] ERROR: Illegal state: Used read time is not set #12464

Open
