[YSQL][Bulk load] Cancelled non-transactional Copy and shutdown: missing rows #12684

Open
def- opened this issue May 27, 2022 · 3 comments
Labels: area/ysql (Yugabyte SQL (YSQL)), kind/bug (This issue is a bug), priority/medium (Medium priority issue), qa_automation (Bugs identified via itest-system, LST, Stress automation or causing automation failures)

Contributor

def- commented May 27, 2022

Jira Link: DB-578

Description

Not sure if this is important enough to warrant an issue, but this kind of missing flush could cause missing data elsewhere too. I'm running the current yugabyte-db master (55c2d15). I created a large CSV file:

#!/usr/bin/env python3
print('a,b,c,d')
for i in range(100000000): # 3.6 GB
    print(f'{i*4},{i*4+1},{i*4+2},{i*4+3}')

And tried reading it into a local RF3 database (macOS, M1):

set yb_disable_transactional_writes=true;
create table t (a integer, b serial, c varchar, d int);
copy t from '/Users/deen/foo.csv' with (format csv, header);

After ~10 minutes I canceled the copy:

yugabyte=# copy t from '/Users/deen/foo.csv' with (format csv, header);
^CCancel request sent
ERROR:  canceling statement due to user request
CONTEXT:  COPY t, line 24622499: "98489988,98489989,98489990,98489991"

I would now expect 24622498 rows in t. Running select count(*) from t; immediately afterwards returns a lower count, and it takes about a minute to reach this number. But if I restart the database before the count gets there, the last 498 rows seem to be lost permanently:

$ bin/yb-ctl --replication_factor 3 stop
$ bin/yb-ctl --replication_factor 3 start
$ bin/ysqlsh
yugabyte=# select count(*) from t;
  count
----------
 24622000
(1 row)

I couldn't always reproduce this; it probably depends on timing and on how much has to be flushed, but I hit it on two separate occasions.
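As a sanity check on the numbers in this report, here is a minimal sketch that recomputes them. It assumes the CSV has exactly one header line and that every data line starts with i*4, as in the generator script above; the function names are made up for illustration. Whether the line named in the CONTEXT message was itself inserted is not something this arithmetic can tell.

```python
# Values copied from the session output in this report.
CANCEL_LINE = 24622499  # file line named in the COPY CONTEXT message
CANCEL_LINE_TEXT = "98489988,98489989,98489990,98489991"
COUNT_AFTER_RESTART = 24622000  # select count(*) after the restart


def rows_through_line(line_no, header_lines=1):
    """Number of data rows in the file up to and including line_no."""
    return line_no - header_lines


def generator_index(line_text):
    """Recover the loop index i from a generated line (first field is i*4)."""
    first_field = int(line_text.split(",")[0])
    if first_field % 4 != 0:
        raise ValueError("line was not produced by the generator script")
    return first_field // 4


expected_rows = rows_through_line(CANCEL_LINE)
missing_rows = expected_rows - COUNT_AFTER_RESTART

# The generator index is zero-based, so index i is data row i + 1.
print("expected rows:", expected_rows)
print("generator index of cancelled line:", generator_index(CANCEL_LINE_TEXT))
print("rows missing after restart:", missing_rows)
```

The file line number and the content of the line agree with each other, so the 498-row gap is not an off-by-one in reading the CONTEXT message.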

def- added the area/docdb and status/awaiting-triage labels May 27, 2022
yugabyte-ci added the kind/bug and priority/medium labels May 27, 2022
Contributor Author

def- commented May 30, 2022

@pkj415 Could this be related to #11628?

yugabyte-ci added the area/ysql label and removed the area/docdb label May 31, 2022
yugabyte-ci changed the title from "[DocDB][Bulk load] Cancelled non-transactional Copy and shutdown: missing rows" to "[YSQL][Bulk load] Cancelled non-transactional Copy and shutdown: missing rows" May 31, 2022
yugabyte-ci removed the status/awaiting-triage label Jul 27, 2022
kripasreenivasan added the qa_automation label Sep 13, 2022
@sushantrmishra

@def- This can happen when transactions are disabled.

With transactions enabled, the transaction controls how many rows are persisted.

With transactions disabled, each row is inserted as its own single-row transaction. If the copy is cancelled abruptly, in-flight writes that have already been sent to DocDB will still be persisted.
There is also buffering in the YSQL layer: if a buffer has not yet been sent to DocDB when the copy is cancelled, it may be discarded, and those rows will show up as lost.
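The buffering explanation can be illustrated with a toy model. This is only a sketch of the idea described in this comment, not YugabyteDB code: the class names, the cancel behaviour, and the buffer size of 1000 are all assumptions made up for the example, and it does not model the restart from the original report.

```python
class ToyStore:
    """Stands in for the storage layer: rows that reach it stay persisted."""

    def __init__(self):
        self.rows = []

    def write_batch(self, batch):
        self.rows.extend(batch)


class ToyBufferedCopy:
    """Hypothetical client-side buffer that sends rows in fixed-size batches."""

    def __init__(self, store, buffer_size):
        self.store = store
        self.buffer_size = buffer_size
        self.buffer = []

    def insert(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        self.store.write_batch(self.buffer)
        self.buffer = []

    def cancel(self):
        """Discard rows that were buffered but never sent; return how many."""
        dropped = len(self.buffer)
        self.buffer = []
        return dropped


def run_cancelled_copy(total_rows, buffer_size):
    """Insert total_rows rows without transactions, then cancel."""
    store = ToyStore()
    copy = ToyBufferedCopy(store, buffer_size)
    for i in range(total_rows):
        copy.insert(i)
    dropped = copy.cancel()
    return len(store.rows), dropped


# Illustrative sizes only; the real buffer size is not stated in this thread.
persisted, dropped = run_cancelled_copy(total_rows=2498, buffer_size=1000)
print("persisted:", persisted, "dropped:", dropped)
```

In this model, every batch that was already sent survives the cancel, and only the partially filled last buffer is lost, so the persisted count lands on a multiple of the buffer size.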

Contributor Author

def- commented Sep 15, 2022

@sushantrmishra Alright, so this is expected behavior? Can close the bug in that case.
