Concurrent write of nodes referencing the same node fails with Dgraph v1.1 #4079
Comments
This is a very interesting issue, thanks for the report.
Temporary workaround
For now, you can work around this by simply retrying, so your for loop would look like this:
errChan := make(chan error)
for i := range in {
	go func(n node) {
		js, err := json.Marshal(n)
		if err != nil {
			// Report the error over the channel: require.NoError must only be
			// called from the goroutine that runs the test function.
			errChan <- err
			return
		}
		for {
			// Retry for as long as the transaction is aborted.
			_, err = dg.NewTxn().Mutate(context.Background(), &api.Mutation{SetJson: js, CommitNow: true})
			if err != dgo.ErrAborted {
				break
			}
		}
		errChan <- err
	}(in[i])
}
This, unfortunately, requires the next version of dgo, which has not been released yet. Instead of
Is this a bug?
Regardless of the workaround, it is worth talking about whether one of the transactions should indeed fail. I tested the same code using
But still, the question remains: should this transaction be aborted because it refers to the same ID, or is this something that should be accepted? While they both refer to the same uid (the one you create first in
I'm not 100% sure, so I think @manishrjain is the one who should answer this one. |
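The retry loop above spins forever if the conflict never clears. A bounded variant with a small backoff might look like the sketch below. Note that `retryOnAbort` and `errAborted` are hypothetical names introduced here for illustration: `errAborted` is a local stand-in for the client's abort error (`dgo.ErrAborted` in the comment above), and the attempt limit and delay are arbitrary assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errAborted is a local stand-in (assumption) for the abort error returned by
// the client; in real code you would compare against the client's own error.
var errAborted = errors.New("transaction has been aborted, please retry")

// retryOnAbort is a hypothetical helper. It calls fn until fn succeeds, fails
// with an error other than errAborted, or maxAttempts calls have been made.
func retryOnAbort(maxAttempts int, baseDelay time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = fn()
		if !errors.Is(err, errAborted) {
			return err // nil on success, or a non-retryable error
		}
		if attempt < maxAttempts-1 {
			// Exponential backoff: baseDelay, 2*baseDelay, 4*baseDelay, ...
			time.Sleep(baseDelay << uint(attempt))
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}
```

Inside the goroutine, the mutation call would then be wrapped in the function passed to `retryOnAbort`, which creates a fresh transaction on every attempt. This only works for self-contained `CommitNow` mutations; a transaction shared across several requests has to be retried as a whole.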
I think concurrent HTTP requests that end up in mutations referencing the same node are very likely, hence I have to disagree that this is correct behavior. |
Also, your workaround isn't really feasible for me, as some of my mutations share a transaction with other requests. Implementing this workaround would require a huge effort on my side, as I built an abstraction layer around Dgraph that builds up these transactions. |
I’m trying to understand the cause of txn abort, but I don’t see the schema here. Can you please paste the schema you’re using? But, in general, I think any transactional system can result in aborts. Even if we were to figure out that somehow our txn system is a bit too strict and can be made lenient, aborts won’t go away. So, a way to retry is unavoidable. Perhaps, a mechanism could be built by Dgraph doing the retry internally if a user asks for it, if CommitNow is set to true. |
I updated the code to also write the schema. |
I'm having similar issues,
I think that would be a great feature. |
So, looking at the conflict generation code in v1.0.16, for how it was treating uids: And with my changes: 693e7db24, which consolidates the logic for conflict detection in list types: The behavior is the same. If the (sub, pred, obj) triple, where obj is a UID, is the same, there would be a conflict. That hasn't changed between the two versions. We do this to ensure that if a user has facets on that edge, they don't get overwritten. Or, if one op sets an object while the other op deletes the object, we only apply one of those and reject the other. Seems like a logical approach. I have three potential solutions here:
|
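The rule described above (two writes conflict when subject, predicate and UID object are all the same) can be illustrated with a minimal sketch. This is not Dgraph's conflict detection code; the `edge` type and the `conflicts` function are hypothetical and only model the rule as stated in the comment.

```go
package main

// edge is a hypothetical (subject, predicate, object) triple whose object is
// a UID. All fields are comparable, so the struct can be used as a map key.
type edge struct {
	Subject   uint64
	Predicate string
	Object    uint64
}

// conflicts reports whether two write sets touch the same
// (subject, predicate, object) triple. Under the rule sketched here, only one
// of two such concurrent transactions would be allowed to commit.
func conflicts(a, b []edge) bool {
	seen := make(map[edge]struct{}, len(a))
	for _, e := range a {
		seen[e] = struct{}{}
	}
	for _, e := range b {
		if _, ok := seen[e]; ok {
			return true
		}
	}
	return false
}
```

Under this model, two transactions that both write `edge{1, "friend", 10}` conflict, while writes from different subjects to the same object do not.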
@manishrjain, I can't speak for @jostillmanns but all three of these approaches would work in my use case. Options 1 & 2 would be my preference, but I think option 3 would also solve my problem. |
@manishrjain, thanks for sharing the deeper insights here. I would agree and say that all 3 of those options sound extremely appealing to me 😄. |
What is the status of this issue? If this is low priority for you guys I'd appreciate if you can point me in the direction to fix it myself. This particular issue is a huge priority for my team. |
Hi @manishrjain. Sorry it took me so long to write back on this issue. It took another incident caused by this issue for us to look at it again 😉. We discussed this again and would like to advocate for solution 1, as we know in which cases overwrites would be okay. Example: we have scheduled processes that write data in parallel, and we can make sure at the client level that there are no conflicts. On the other hand, we accept HTTP requests, where the current behavior (NO_ABORT = false) is desirable. |
What version of Dgraph are you using?
v1.1
Have you tried reproducing the issue with the latest release?
yes
What is the hardware spec (RAM, OS)?
16 GB, Linux 4.19.67-1-lts
Steps to reproduce the issue (command/config used to run Dgraph).
run test
Expected behaviour and actual result.
Expected behaviour: no error and two nodes are created
Actual behaviour: One mutate request returns
Transaction has been aborted. Please retry
Problem description
The same test passes with Dgraph 1.0.16. If the nodes that are written concurrently don't reference the same child node, everything works. Hence I guess the issue is with the child node reference.