Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OOM: Only keep deltas in memory for a pending txn #3349

Merged
merged 11 commits into from
May 2, 2019

Conversation

manishrjain
Copy link
Contributor

@manishrjain manishrjain commented May 1, 2019

Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads.

This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as it done applying mutations. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write.


This change is Reviewable

@manishrjain manishrjain requested review from martinmr and a team as code owners May 1, 2019 00:01
Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how are you testing this? is this change something the existing tests would cover?

Reviewed 8 of 8 files at r1.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @manishrjain)


dgraph/cmd/counter/increment.go, line 131 at r1 (raw file):

	}

	mu := newMutation(counter, pred)

mu is only being used in the next line. you could inline it like you are doing below.


dgraph/cmd/counter/increment.go, line 142 at r1 (raw file):

		return Counter{}, err
	}
	_, err = txn.Mutate(context.Background(), newMutation(counter, pred))

I don't understand why there are now two calls to txn.Mutuate when there was only one before.


posting/lists.go, line 192 at r1 (raw file):

Quoted 9 lines of code…
	// deltas keep track of the updates made by txn. These must be kept around until written to disk
	// during commit.
	deltas map[string][]byte

	// max committed timestamp of the read posting lists.
	maxVersions map[string]uint64

	// plists are posting lists in memory. They can be discarded to reclaim space.
	plists map[string]*List

I think it would help to mention what the key of these three maps represents.


posting/mvcc.go, line 106 at r1 (raw file):

Quoted 6 lines of code…
	// txn.Lock()
	// // TODO: We can remove the deltas here. Now that we're using txn local cache.
	// for key := range txn.deltas {
	// 	keys = append(keys, key)
	// }
	// txn.Unlock()

Did you mean to keep this code?


posting/mvcc.go, line 120 at r1 (raw file):

					continue
				}
				if ts, ok := txn.cache.maxVersions[key]; ok {

ok && ts >= commitTs to avoid the nested ifs.

Copy link
Contributor Author

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The increment change helps test that the update can be read back correctly. And yeah, all existing tests including Jepsen tests should be able to ensure that updates are being applied correctly.

I'll work on your review tomorrow. Still stuck with blockade.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @manishrjain and @martinmr)


dgraph/cmd/counter/increment.go, line 142 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I don't understand why there are now two calls to txn.Mutuate when there was only one before.

I'm testing that the query is able to read the pending mutation correctly.


posting/oracle.go, line 105 at r1 (raw file):

	o.pendingTxns = make(map[uint64]*Txn)

	go func() {

Need to remove this goroutine.

Copy link
Contributor Author

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 8 files reviewed, 5 unresolved discussions (waiting on @manishrjain and @martinmr)


dgraph/cmd/counter/increment.go, line 131 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

mu is only being used in the next line. you could inline it like you are doing below.

Done.


posting/lists.go, line 192 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
	// deltas keep track of the updates made by txn. These must be kept around until written to disk
	// during commit.
	deltas map[string][]byte

	// max committed timestamp of the read posting lists.
	maxVersions map[string]uint64

	// plists are posting lists in memory. They can be discarded to reclaim space.
	plists map[string]*List

I think it would help to mention what the key of these three maps represents.

Done.


posting/mvcc.go, line 106 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
	// txn.Lock()
	// // TODO: We can remove the deltas here. Now that we're using txn local cache.
	// for key := range txn.deltas {
	// 	keys = append(keys, key)
	// }
	// txn.Unlock()

Did you mean to keep this code?

Done.

Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r2.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved


posting/mvcc.go, line 120 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

ok && ts >= commitTs to avoid the nested ifs.

Done.

@manishrjain manishrjain merged commit fe07e3e into master May 2, 2019
@manishrjain manishrjain deleted the mrjn/only-keep-diffs-in-memory branch May 2, 2019 00:03
manishrjain added a commit that referenced this pull request May 2, 2019
Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads.

This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as mutation application is done. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write.

This PR dramatically reduces the memory usage when mutations are going on, avoiding OOMs.

Changes:
* Instead of keeping the entire posting list in memory, only keep the deltas. This significantly reduces the memory usage, in fact, make it negligible.
* Keep track of max version per posting list and use that to avoid repeat commits.
* Revert changes to increment tool. They cause two counters to get created.
* Add txn.Update in the right places, so any PLs in cache get converted to diffs.
* Remove CommitToMemory
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
)

Currently, we bring and keep the entire posting list in memory for each pending txn, which remains there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices) which gets very expensive in terms of memory usage, causing OOMs during data loads.

This PR fixes that issue by only keeping the deltas that need to be applied to the lists and discards the lists as soon as mutation application is done. On a commit, these deltas are then directly written to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own write.

This PR dramatically reduces the memory usage when mutations are going on, avoiding OOMs.

Changes:
* Instead of keeping the entire posting list in memory, only keep the deltas. This significantly reduces the memory usage, in fact, make it negligible.
* Keep track of max version per posting list and use that to avoid repeat commits.
* Revert changes to increment tool. They cause two counters to get created.
* Add txn.Update in the right places, so any PLs in cache get converted to diffs.
* Remove CommitToMemory
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants