-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix OOM: Only keep deltas in memory for a pending txn #3349
Conversation
…eltas. This significantly reduces the memory usage, in fact, make it negligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how are you testing this? is this change something the existing tests would cover?
Reviewed 8 of 8 files at r1.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @manishrjain)
dgraph/cmd/counter/increment.go, line 131 at r1 (raw file):
} mu := newMutation(counter, pred)
mu is only being used in the next line. you could inline it like you are doing below.
dgraph/cmd/counter/increment.go, line 142 at r1 (raw file):
return Counter{}, err } _, err = txn.Mutate(context.Background(), newMutation(counter, pred))
I don't understand why there are now two calls to txn.Mutuate
when there was only one before.
posting/lists.go, line 192 at r1 (raw file):
Quoted 9 lines of code…
// deltas keep track of the updates made by txn. These must be kept around until written to disk // during commit. deltas map[string][]byte // max committed timestamp of the read posting lists. maxVersions map[string]uint64 // plists are posting lists in memory. They can be discarded to reclaim space. plists map[string]*List
I think it would help to mention what the key of these three maps represents.
posting/mvcc.go, line 106 at r1 (raw file):
Quoted 6 lines of code…
// txn.Lock() // // TODO: We can remove the deltas here. Now that we're using txn local cache. // for key := range txn.deltas { // keys = append(keys, key) // } // txn.Unlock()
Did you mean to keep this code?
posting/mvcc.go, line 120 at r1 (raw file):
continue } if ts, ok := txn.cache.maxVersions[key]; ok {
ok && ts >= commitTs
to avoid the nested ifs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The increment change helps test that the update can be read back correctly. And yeah, all existing tests including Jepsen tests should be able to ensure that updates are being applied correctly.
I'll work on your review tomorrow. Still stuck with blockade.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @manishrjain and @martinmr)
dgraph/cmd/counter/increment.go, line 142 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
I don't understand why there are now two calls to
txn.Mutuate
when there was only one before.
I'm testing that the query is able to read the pending mutation correctly.
posting/oracle.go, line 105 at r1 (raw file):
o.pendingTxns = make(map[uint64]*Txn) go func() {
Need to remove this goroutine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 8 files reviewed, 5 unresolved discussions (waiting on @manishrjain and @martinmr)
dgraph/cmd/counter/increment.go, line 131 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
mu is only being used in the next line. you could inline it like you are doing below.
Done.
posting/lists.go, line 192 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
// deltas keep track of the updates made by txn. These must be kept around until written to disk // during commit. deltas map[string][]byte // max committed timestamp of the read posting lists. maxVersions map[string]uint64 // plists are posting lists in memory. They can be discarded to reclaim space. plists map[string]*List
I think it would help to mention what the key of these three maps represents.
Done.
posting/mvcc.go, line 106 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
// txn.Lock() // // TODO: We can remove the deltas here. Now that we're using txn local cache. // for key := range txn.deltas { // keys = append(keys, key) // } // txn.Unlock()
Did you mean to keep this code?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r2.
Reviewable status: complete! all files reviewed, all discussions resolved
posting/mvcc.go, line 120 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
ok && ts >= commitTs
to avoid the nested ifs.
Done.
Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads. This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as mutation application is done. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write. This PR dramatically reduces the memory usage when mutations are going on, avoiding OOMs. Changes: * Instead of keeping the entire posting list in memory, only keep the deltas. This significantly reduces the memory usage, in fact, make it negligible. * Keep track of max version per posting list and use that to avoid repeat commits. * Revert changes to increment tool. They cause two counters to get created. * Add txn.Update in the right places, so any PLs in cache get converted to diffs. * Remove CommitToMemory
) Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads. This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as mutation application is done. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write. This PR dramatically reduces the memory usage when mutations are going on, avoiding OOMs. Changes: * Instead of keeping the entire posting list in memory, only keep the deltas. This significantly reduces the memory usage, in fact, make it negligible. * Keep track of max version per posting list and use that to avoid repeat commits. * Revert changes to increment tool. They cause two counters to get created. * Add txn.Update in the right places, so any PLs in cache get converted to diffs. * Remove CommitToMemory
Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads.
This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as it done applying mutations. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write.
This change is