Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Sort state client cache keys before flush #1856

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

askurydzin
Copy link

Summary

in some concurrent scenarios several cache keys can be updated by several routines, thus, in order to avoid the deadlocks (in postgresql for example) we should update keys in sorted order.

in some concurrent scenarios several cache keys can be updated by
several routines, thus, in order to avoid the deadlocks (in postgresql
for example) we should update keys in sorted order.
@askurydzin askurydzin requested a review from a team as a code owner August 2, 2024 19:39
@github-actions github-actions bot added the feat label Aug 2, 2024
@askurydzin askurydzin changed the title feat: sort state client cache keys before flush feat: Sort state client cache keys before flush Aug 2, 2024
@github-actions github-actions bot added feat and removed feat labels Aug 2, 2024
@erezrokah
Copy link
Member

Hi @askurydzin 👋 Thanks for the PR. Do you mind opening an issue reproducing the scenario you're experiencing? That would help us understand the fix better. You can open the issue in our main repository.

Alternatively can you add a unit test that's failing without this fix (e.g. one that uses the state client and has a deadlock in Postgres)?

@askurydzin
Copy link
Author

askurydzin commented Aug 5, 2024

Hello, @erezrokah! It would be quite hard to reproduce it as it requires writing some demo plugin from scratch which uses state client. Would it be fine if I just briefly explain why primary keys for db needed to be sorted during insert/update?

When it goes about two concurrent transaction T1 with set of keys { K1, K2, K3 } and T2 with set of keys T2 { K2, K3, K4 }, and we don't sort keys, following issue happens in postgres destination:

T1 BEGIN T2 BEGIN
INSERT K1 (exclusive-lock for K1) INSERT K2 (exclusive-lock for K2)
INSERT K2 (wait share-lock for K1 / locked by K2) INSERT K1 (wait share-lock for K1 / locked by K1)

In the end both transactions appear to be in a dead-lock state.

So, sorting state table primary keys before inserting allows to avoid such situations.

@erezrokah
Copy link
Member

It would be quite hard to reproduce it as it requires writing some demo plugin from scratch which uses state client. Would it be fine if I just briefly explain why primary keys for db needed to be sorted during insert/update?

Is the issue not reproducible on any of our existing plugins? I think it's better to have a reproduction otherwise we can't confirm the fix is working.

@askurydzin
Copy link
Author

@erezrokah no, I think it's not reproducible in any of existing plugins as I'm not sure if any of existing plugins (that I may have an access to) use state client. I stumbled across the issue when writing my own plugin (it's private). The issue is quite straight-forward, but if existence of the plugin vulnerable to the issue is the case for this SDK, I cannot write some demo plugin because it wouldn't be an existing plugin per se, and it may seem too artificial example.

Hence, closing the PR. Sorry for wasting your time.

@askurydzin askurydzin closed this Aug 5, 2024
@erezrokah
Copy link
Member

erezrokah commented Aug 5, 2024

Hence, closing the PR. Sorry for wasting your time.

Not wasting anyone's time that's for sure. We appreciate the feedback.

There's a public plugin with a state client here https://github.com/cloudquery/cloudquery/blob/e94dda546631c5ef801852bbae997ad11f615f2e/plugins/source/hackernews/resources/services/items/items_fetch.go#L34

To clarify the request is not for one of our own plugins to reproduce this, but to have a test that fails without the fix, and passes with the fix, or a minimal reproduction scenario that shows the bug (maybe a very trimmed down version of your own plugin without any private code).

@askurydzin
Copy link
Author

askurydzin commented Aug 5, 2024

@erezrokah Hm, if the run of the mentioned plugin consistently emits the set of keys that may overlap, I think running it in parallel may reproduce the thing, then give me some time and I'll check if it work.

@askurydzin askurydzin reopened this Aug 5, 2024
@erezrokah
Copy link
Member

Hi @askurydzin, where you able to write a unit test that reproduces the issue?

@askurydzin
Copy link
Author

@erezrokah sorry, not yet, will provide steps to reproduce this week.

@askurydzin
Copy link
Author

@erezrokah I've prepared a repo with demo plugin with somewhat ridiculous logic, but it's aimed to demonstrate the deadlocks occurring with state-table overlapping keys:
https://github.com/askurydzin/cq-source-state-deadlock

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants