docs: repare the transaction conflicts resolution flow #5133
Changes from 3 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -183,18 +183,19 @@ See the `transaction.proto` file for its definition. | |
| The commit process is as follows: | ||
|
|
||
| 1. The writer finishes writing all data files. | ||
| 2. The writer creates a transaction file in the `_transactions` directory. | ||
| This file describes the operations that were performed, which is used for two purposes: | ||
| (1) to detect conflicts, and (2) to re-build the manifest during retries. | ||
| 2. Build a transaction struct, based on the current version, that describes how the manifest changes. Different operations use different structures to describe their metadata changes. | |
| 3. Look for any new commits since the writer started writing. | ||
| If there are any, read their transaction files and check for conflicts. | ||
| If there are any conflicts, abort the commit. Otherwise, continue. | ||
| 4. Build a manifest and attempt to commit it to the next version. | ||
| If there are any, begin the rebase process: read their transaction structures and check for conflicts. | ||
| If there are any conflicts and the conflicts are not retriable, abort the commit. | ||
| If the conflicts are retriable, the writer should go back to step 2 and rebuild the transaction based on the newest version to resolve the conflicts. | ||
|
|
||
| 4. Create a transaction file in the `_transactions` directory. This file describes the operations that were performed, which serves two purposes: | ||
|
Contributor
We might want to update this soon with #4774.
||
| (1) to detect conflicts, and (2) to re-build the manifest during retries. | ||
| 5. Build a manifest and attempt to commit it to the next version. | ||
| If the commit fails because another writer has already committed, go back to step 3. | ||
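The revised steps above can be sketched as a retry loop. This is only an illustration under stated assumptions, not Lance's implementation: `Table`, `put_if_absent`, `commit`, and the three-way `Conflict` result are hypothetical names, the version log is an in-memory dict, and the atomic manifest write is modeled as a put-if-absent on the version number.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Conflict(Enum):
    NONE = auto()
    RETRYABLE = auto()
    FATAL = auto()


@dataclass
class Table:
    """Hypothetical in-memory stand-in for a dataset's version log."""
    transactions: dict = field(default_factory=dict)  # version -> operation

    def latest_version(self) -> int:
        return max(self.transactions, default=0)

    def put_if_absent(self, version: int, txn: str) -> bool:
        """Models the atomic manifest write: fails if the version exists."""
        if version in self.transactions:
            return False
        self.transactions[version] = txn
        return True


def commit(table, read_version, operation, check_conflict, max_retries=5):
    base = read_version
    for _ in range(max_retries):
        # Step 2: build the transaction against the version we are based on.
        txn = operation
        # Step 3: compare against every commit made since that version.
        latest = table.latest_version()
        results = [
            check_conflict(txn, table.transactions.get(v))
            for v in range(base + 1, latest + 1)
        ]
        if Conflict.FATAL in results:
            raise RuntimeError("non-retriable conflict; commit aborted")
        if Conflict.RETRYABLE in results:
            base = latest  # back to step 2: rebuild on the newest version
            continue
        # Steps 4-5: record the transaction and try to claim the next version.
        if table.put_if_absent(latest + 1, txn):
            return latest + 1
        # Another writer won the race: loop again and re-check (step 3).
    raise RuntimeError("too many retries")
```

A caller would pass the version it started reading from as `read_version`; the function returns the version number it managed to claim, or raises if the conflict cannot be resolved.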
|
|
||
| When checking whether two transactions conflict, be conservative. | ||
| If the transaction file is missing, assume it conflicts. | ||
| If the transaction file has an unknown operation, assume it conflicts. | ||
| If the transaction structure is missing, assume it conflicts. | ||
| If the transaction structure has an unknown operation, assume it conflicts. | ||
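The conservative rule above can be expressed as a check that defaults to "conflict" whenever it lacks information. The sketch below is an assumption-laden illustration: `KNOWN_OPERATIONS` and `COMPATIBLE` are made-up tables, not Lance's actual compatibility matrix, and operations are reduced to plain strings.

```python
from typing import Optional

# Hypothetical set of operations this writer understands.
KNOWN_OPERATIONS = {"append", "delete", "overwrite"}

# Hypothetical pairs (mine, theirs) that are assumed safe to commit together.
COMPATIBLE = {("append", "append"), ("append", "delete"), ("delete", "append")}


def conflicts(mine: str, theirs: Optional[str]) -> bool:
    """Return True unless the pair is positively known to be compatible."""
    if theirs is None:
        # The other commit's transaction is missing: assume a conflict.
        return True
    if theirs not in KNOWN_OPERATIONS:
        # Unknown operation (e.g. written by a newer version): assume a conflict.
        return True
    return (mine, theirs) not in COMPATIBLE
```

The design choice is that compatibility is an allow-list: anything not explicitly listed, including operations added by future writers, is treated as conflicting.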
|
|
||
| ### External Manifest Store | ||
|
|
||
|
|
||
|
Contributor
I think we could actually just make this a mermaid diagram. Then it's more understandable by AIs and can just be edited as text. Here's how I would write it:

flowchart LR
A[Write data files] --> C[Check for concurrent commits]
A -.-> DF{{data/31a7060e-4898-4ecd-a428-afbff3539fa6.lance}}
C --> D{Are there conflicts?}
D -->|None| E[Write transaction file]
E -.-> Txn{{_transactions/42-76019405-8d5a-43c3-a7a2-324ed49a9d75.txn}}
D-->|Resolvable| G[Resolve conflicts] --> E
G -.->|merged| Deletions{{_deletions/31a7060e-4898-4ecd-a428-afbff3539fa6.lance}}
D -->|Retryable| H[Retry operation 🔄]
D -->|Non-retryable| F[Abort ✗]
E --> I[Atomically write manifest]
I -.-> Manifest{{_versions/43.manifest}}
I --> J{Success?}
J -->|Yes| K[Complete ✓]
J -->|No| C
style A fill:#e1f5fe
style K fill:#c8e6c9
style F fill:#ffcdd2
style H fill:#fff3e0
style DF fill:#ddd
style Txn fill:#ddd
style Manifest fill:#ddd
style Deletions fill:#ddd
This basically describes four outcomes of checking for conflicts:
Contributor
Sorry, I'm a bit late to this conversation; I did not notice this PR. I just published #5209, which refreshes a lot of content for the table format and the overall format intro. I also updated the conflict resolution section with a mermaid diagram, but it is a bit different from this one. I will check tomorrow to see if I can reuse the content here.
Contributor
Author
OK, I guess this PR would be useless.
Thanks for correcting me, @wjones127. I was misunderstanding about Is this good to close? @jackye1995