documentation: unique constraint #3152

Closed
pjebs opened this issue Mar 17, 2019 · 7 comments
@pjebs
Contributor

pjebs commented Mar 17, 2019

There should be more details on how to artificially create a unique constraint.

In the docs, the closest I can see is in the upsert section under conflicts.

In my use case, I don't want to upsert. I just want to insert if the email doesn't exist. Otherwise, the operation should fail.

From the upsert section this is what I infer for my use-case:

  1. Create transaction
  2. Query to see if a node exists with email
  3. If not => create node + Commit
  4. If already exists => Fail

It would be good if the docs explained how this is meant to work when 2 requests occur at the same time.

Both requests do Step 1.
Both requests do Step 2 -> No node with email is found so both proceed to Step 3.
Both requests do Step 3 => Now there are 2 identical nodes, violating the "unique constraint".
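
The interleaving described above can be reproduced with a small toy model. This is only a sketch of the race itself, assuming a store that applies writes without any check at commit time; `NaiveStore` and `interleaved_signup` are hypothetical names for illustration and are not Dgraph APIs.

```python
# Toy model only: a plain list stands in for the database, and nothing
# is checked at commit time. None of these names are Dgraph APIs.

class NaiveStore:
    def __init__(self):
        self.nodes = []  # committed nodes, each a dict with an "email" key

    def email_exists(self, email):
        return any(node["email"] == email for node in self.nodes)

    def commit(self, pending):
        # Blindly applies the writes; nothing verifies what changed since the read.
        self.nodes.extend(pending)


def interleaved_signup(store, email):
    # Step 2 for both requests happens before either request reaches Step 3.
    a_found = store.email_exists(email)
    b_found = store.email_exists(email)

    # Step 3: each request decides based on its own (now stale) read.
    if not a_found:
        store.commit([{"email": email}])
    if not b_found:
        store.commit([{"email": email}])
    return a_found, b_found


store = NaiveStore()
print(interleaved_signup(store, "a@example.com"))  # (False, False)
print(len(store.nodes))                            # 2
```

Both requests see "not found" and both commit, so the store ends up with two nodes for the same email. Preventing this needs a check at commit time, not only at query time.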

I don't understand from the "upsert" docs how Dgraph with transactions is meant to prevent that from happening. Could the docs explain what is going on?

@pjebs changed the title from "unique constraint" to "documentation: unique constraint" Mar 17, 2019
@srfrog srfrog self-assigned this Mar 18, 2019
@srfrog
Contributor

srfrog commented Mar 18, 2019

Duplicate #3059

@srfrog srfrog closed this as completed Mar 18, 2019
@martinmr
Contributor

Not sure how this issue is a duplicate of #3059, but to answer OP's question: the unique constraint is preserved and enforced because commits do not really happen at the same time. Each commit sends a request to the Zero server, and Zero uses the Raft algorithm (https://raft.github.io) to reach consensus on which values should be stored. So once the first commit has been agreed on, the second transaction fails because there is already consensus that another value exists.
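
One way to picture "commits do not really happen at the same time" is a single decision point that orders commits and rejects a transaction whose data was written by someone else after it started. The sketch below is a toy model under that assumption, not Dgraph's actual implementation: `ToyOracle`, `Aborted`, and the conflict-key tuples are hypothetical, and the consensus layer is not modeled at all (one Python object stands in for the agreed-upon order).

```python
import itertools


class Aborted(Exception):
    """Raised when a commit loses to an earlier commit on the same key."""


class ToyOracle:
    """Hands out timestamps and decides commits strictly one at a time."""

    def __init__(self):
        self._clock = itertools.count(1)
        self.last_commit = {}  # conflict key -> commit timestamp of last writer

    def start(self):
        return next(self._clock)

    def commit(self, start_ts, conflict_keys):
        # Abort if any key was committed after this transaction started.
        for key in conflict_keys:
            if self.last_commit.get(key, 0) > start_ts:
                raise Aborted(key)
        commit_ts = next(self._clock)
        for key in conflict_keys:
            self.last_commit[key] = commit_ts
        return commit_ts


oracle = ToyOracle()
key = ("email", "a@example.com")

a_start = oracle.start()  # both transactions start before either commits
b_start = oracle.start()

print(oracle.commit(a_start, [key]))  # 3: first commit is accepted
try:
    oracle.commit(b_start, [key])
except Aborted as exc:
    print("second commit aborted on", exc.args[0])
```

In this model the second transaction fails at commit even though its earlier query found nothing, which is the behaviour described above. The model only works if both transactions report the same conflict key.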

I don't think the documentation is a good place for this kind of explanation so I won't add it there.

@pjebs
Contributor Author

pjebs commented Mar 18, 2019

Correct me if I am wrong. You are saying:

  1. My example illustrating the race condition is theoretical. Dgraph is not designed to cater for this race condition, but it doesn't matter because in practice it will never happen. => Understood and accepted.

2a. Dgraph can handle only 1 mutation at a time in the ENTIRE cluster, irrespective of how big the data set is and how many servers are in the cluster.

OR
2b. Dgraph is a bit smarter: if 2 or more transactions touch the same node or edge, then only 1 succeeds and the others fail.

I can see how 2a prevents unique constraint violations (albeit inefficiently). If 2b is correct, I can't see how it enforces a unique constraint. If I attempt to create 2 nodes with a unique predicate, neither of which is connected to another node, how does Dgraph recognise that I intend the predicate to be unique?
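
The concern in 2b comes down to what counts as "the same thing" when two transactions are compared. A minimal sketch, assuming transactions are compared by a set of conflict keys: if keys are derived from the node, two brand-new nodes never collide; if keys are derived from the predicate and its value, two inserts of the same email do. The function names and the write format are hypothetical. Whether Dgraph derives keys the second way for a given predicate (for example through its `@upsert` schema directive) is an assumption that should be checked against the Dgraph docs.

```python
def uid_conflict_keys(writes):
    # One key per (node, predicate): two brand-new nodes never share a key.
    return {(w["uid"], w["predicate"]) for w in writes}


def value_conflict_keys(writes):
    # One key per (predicate, value): equal emails share a key.
    return {(w["predicate"], w["value"]) for w in writes}


# Two transactions, each creating its own new node with the same email.
txn_a = [{"uid": "_:a", "predicate": "email", "value": "a@example.com"}]
txn_b = [{"uid": "_:b", "predicate": "email", "value": "a@example.com"}]

print(uid_conflict_keys(txn_a) & uid_conflict_keys(txn_b))      # set()
print(value_conflict_keys(txn_a) & value_conflict_keys(txn_b))  # {('email', 'a@example.com')}
```

With node-based keys the overlap is empty, so nothing would tell the two transactions apart from unrelated writes. With value-based keys the overlap is non-empty, which gives a commit-time check something to reject.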

@pjebs
Contributor Author

pjebs commented Mar 18, 2019

> I don't think the documentation is a good place for this kind of explanation so I won't add it there.

I think unique keys are important. There should be a separate section below the "upsert" section to explain how to artificially create a unique key.

@srfrog srfrog reopened this Mar 18, 2019
@srfrog
Contributor

srfrog commented Mar 18, 2019

> > I don't think the documentation is a good place for this kind of explanation so I won't add it there.

> I think unique keys are important. There should be a separate section below the "upsert" section to explain how to artificially create a unique key.

This is why I called it dupe and closed. This is a new feature. Please check #3059 and let me know if that's what you're hoping to do. I would call it merge in the traditional (relational) sense.

@pjebs
Contributor Author

pjebs commented Mar 18, 2019

In #3059:

It definitely seems to be on the right track - and perhaps can be used for my use case.
The example does the mutation if the email exists.
If there were a way for the @if condition to trigger the mutation only when the email doesn't exist, it would suit my use case.

Either way, with regard to 2a or 2b, which is correct?

@martinmr
Contributor

I checked the code and it seems the proposals are processed linearly. However, this means Dgraph processes one proposal at a time, not one mutation at a time. Since a transaction can contain multiple mutations, Dgraph's effective throughput is much higher than this limitation would suggest.
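
The difference between "one proposal at a time" and "one mutation at a time" can be shown with a small counting sketch. It assumes, purely for illustration, that a proposal is a list of mutation strings. `apply_proposals` is a hypothetical helper and does not mirror Dgraph's internal types.

```python
from collections import deque


def apply_proposals(proposals):
    """Apply proposals strictly one at a time; return (rounds, mutations applied)."""
    queue = deque(proposals)
    applied = []
    rounds = 0
    while queue:
        proposal = queue.popleft()  # only one proposal is in flight at any moment
        applied.extend(proposal)    # all mutations inside it are applied in that round
        rounds += 1
    return rounds, applied


proposals = [
    ["set name", "set email", "set age"],  # transaction 1: three mutations
    ["set name", "set email"],             # transaction 2: two mutations
]
rounds, applied = apply_proposals(proposals)
print(rounds, len(applied))  # 2 5
```

Five mutations are applied in two sequential rounds, so the serial step is paid per proposal and not per mutation.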
