Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace TxnWriter with WriteBatch #5007

Merged
merged 56 commits into from
May 6, 2020
Merged

Conversation

all-seeing-code
Copy link
Contributor

@all-seeing-code all-seeing-code commented Mar 23, 2020

Replace txnWrite with batchWrite to improve performance. txnWrite commits one transaction at a time whereas batchWrite can commit multiple transactions in a batch which reduces I/O and hence results in performance improvements.

Fixes DGRAPH-945 https://dgraph.atlassian.net/browse/DGRAPH-945

Present below are the bench-marking done for different workloads as part of the review process.

Testing production usage.
Re-indexing is triggered on full-text index on name predicate in 21 million movie data set and corresponding time is noted for three different implementation of writer.

  1. TxnWriter (Relevant Code fragment)
Rebuilding index for predicate name: building temp index took: 37.006795029s
Rebuilding index for predicate name: writing index took: 5.052920693s
  1. Writebatch (Relevant Code fragment)
Rebuilding index for predicate name: building temp index took: 22.328473389s
Rebuilding index for predicate name: writing index took: 3.997808833s
  1. Writebatch with diff writer for each batch. Suggested by @gja (Relevant Code fragment)
Rebuilding index for predicate name: building temp index took: 23.359693661s
Rebuilding index for predicate name: writing index took: 4.111893385s

Without any reasonable performance benefit, I go ahead with 2. Writebatch in this PR.

Testing ad-hoc data loads and compare two Write APIs in badger
Use writer_test.go to reproduce the benchmark numbers. The two Write APIs I compare are one in which lock is acquired for every SetEntry and the second in which one lock is acquired for one write-load.

Write API (Locks for each Set entry)
name                                   time/op
Writer/TxnWriter-8                      492ms ±33%
Writer/WriteBatch1-8                   87.3ms ± 7%
Writer/WriteBatchMultThreadDiffWB-8    58.6ms ± 2%
Writer/WriteBatchMultThreadSameWB-8    92.4ms ± 1%
Writer/WriteBatchSingleThreadDiffWB-8  72.6ms ± 4%
Writer/WriteBatchSingleThreadSameWB-8  87.5ms ± 5%

Write API (Single Lock for one Write)
name                                   time/op
Writer/TxnWriter-8                      448ms ±21%
Writer/WriteBatch1-8                   92.2ms ±19%
Writer/WriteBatchMultThreadDiffWB-8    66.1ms ±11%
Writer/WriteBatchMultThreadSameWB-8    90.9ms ±14%
Writer/WriteBatchSingleThreadDiffWB-8  72.2ms ± 4%
Writer/WriteBatchSingleThreadSameWB-8  88.7ms ± 5%

Although for the data-load I didn't see any significant performance difference between two Write APIs, I propose to go ahead with Single lock for one Write after discussing with @jarifibrahim and @ashish-goswami

To test hypothesis that using same Writer and Flushing it at the end should perform better than initializing multiple Writers and Flushing them all independently, I tested with heavier workloads.

KV List size 500000

name                                   time/op
Writer/WriteBatchMultThreadDiffWB-8    1.77s ±28%
Writer/WriteBatchMultThreadSameWB-8    1.64s ± 5%
Writer/WriteBatchSingleThreadDiffWB-8  1.86s ±10%
Writer/WriteBatchSingleThreadSameWB-8  1.65s ± 7%

KV List size 5000000

name                                   time/op
Writer/WriteBatchMultThreadDiffWB-8    4.32s ± 1%
Writer/WriteBatchMultThreadSameWB-8    4.30s ± 7%
Writer/WriteBatchSingleThreadDiffWB-8  4.57s ± 5%
Writer/WriteBatchSingleThreadSameWB-8  4.33s ± 6%

As expected for heavier workloads one Writer works better than multiple writers because WriteBatch can parallelize the writes in the background and doesn't wait for individual flushes.


This change is Reviewable

Docs Preview: Dgraph Preview

@CLAassistant
Copy link

CLAassistant commented Mar 23, 2020

CLA assistant check
All committers have signed the CLA.

posting/index.go Outdated Show resolved Hide resolved
@mangalaman93 mangalaman93 changed the title anurags92/replace txnwrite Replace TxnWriter with WriteBatch Mar 24, 2020
Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r2, 1 of 2 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @anurags92 and @manishrjain)

@all-seeing-code

This comment has been minimized.

posting/writer_test.go Outdated Show resolved Hide resolved
posting/writer_test.go Outdated Show resolved Hide resolved
posting/writer_test.go Outdated Show resolved Hide resolved
posting/writer_test.go Outdated Show resolved Hide resolved
Copy link
Member

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 6 unresolved discussions (waiting on @anurags92 and @manishrjain)


posting/writer_test.go, line 2 at r6 (raw file):

/*
 * Copyright 2019 Dgraph Labs, Inc. and Contributors

2020

@all-seeing-code

This comment has been minimized.

@all-seeing-code

This comment has been minimized.

@all-seeing-code all-seeing-code requested a review from martinmr May 4, 2020 09:33
Copy link
Contributor

@gja gja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 4 files at r13, 2 of 2 files at r14, 2 of 3 files at r15.
Reviewable status: 5 of 6 files reviewed, 14 unresolved discussions (waiting on @anurags92, @manishrjain, and @martinmr)


go.mod, line 72 at r15 (raw file):

)

//replace github.com/dgraph-io/badger/v2 => /home/alvis/go/src/github.com/dgraph-io/badger

remove comment


posting/index.go, line 576 at r15 (raw file):

	var counter uint64 = 1

	// TODO(Aman): Replace TxnWriter with WriteBatch. While we do that we should ensure that

remove comments


posting/index.go, line 629 at r15 (raw file):

	}
	stream.Send = func(kvList *bpb.KVList) error {
		for _, kv := range kvList.Kv {

could we please benchmark creating tmpWriter within Send() then flushing at the end of Send(), v/s creating one tmpWriter and calling SetEntryAt within the transaction. (since SetEntryAt requires locks, is it better to lock when flushing instead)


posting/index.go, line 631 at r15 (raw file):

		for _, kv := range kvList.Kv {
			e := &badger.Entry{Key: kv.Key, Value: kv.Value}
			if len(kv.UserMeta) > 0 {

add a comment on what this is doing


posting/index.go, line 634 at r15 (raw file):

				e.UserMeta = kv.UserMeta[0]
			}
			if err := tmpWriter.SetEntryAt(e, kv.Version); err != nil {

double check that SetEntryAt is thread safe


posting/index.go, line 638 at r15 (raw file):

			}
		}
		// if err := tmpWriter.Write(kvList); err != nil {

remove this comment


posting/index.go, line 691 at r15 (raw file):

			// which occurred before this schema mutation.
			//err := writer.SetAt(kv.Key, kv.Value, BitCompletePosting, r.startTs)

same logic as above. Is it faster to create the writer withing the send()


posting/writer_test.go, line 1 at r15 (raw file):

/*

Can we move this file somewhere else? Or can we create a separate directory for experimental benchmarks or something? This isn't testing production code


posting/writer_test.go, line 90 at r15 (raw file):

		b.ResetTimer()

		for i := 0; i < b.N; i++ {

for i from 1 to 5
txnWriter = newWriter
for entry from 1 to 10000
...
commit
end


worker/draft.go, line 714 at r15 (raw file):

}

// TODO(Anurag): Are we using pkey? Remove if unused.

also on a fixme, please add date.


worker/snapshot.go, line 41 at r15 (raw file):

	Flush() error
}
type newwriteBatch struct {

writeBatchWriter


worker/snapshot.go, line 49 at r15 (raw file):

}

func (nwb *newwriteBatch) Write(kvs *bpb.KVList) error {

i see this twice, can we move this to

bulkAddKvsToBatch(nwb.wb, kvs.Kv)

Copy link
Contributor

@jarifibrahim jarifibrahim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the benchmark numbers to the PR description. Mostly it looks good to me.

Reviewable status: 1 of 7 files reviewed, 20 unresolved discussions (waiting on @anurags92, @gja, @manishrjain, and @martinmr)


posting/writer_test.go, line 46 at r16 (raw file):

	}

	writeInBagder := func(db *badger.DB, KVList []kv, wg *sync.WaitGroup) {

Typo: writeInBagder => writeInBadger


posting/writer_test.go, line 51 at r16 (raw file):

		for _, typ := range KVList {
			e := &badger.Entry{Key: typ.key, Value: typ.value}
			wb.SetEntryAt(e, 1)

Check error, please


posting/writer_test.go, line 57 at r16 (raw file):

	}

	writeInBagder2 := func(wb *badger.WriteBatch, KVList []kv, wg *sync.WaitGroup) {

Typo here as well


posting/writer_test.go, line 83 at r16 (raw file):

		dbOpts.Dir = tmpIndexDir
		dbOpts.ValueDir = tmpIndexDir
		var db, err2 = badger.OpenManaged(dbOpts)

Ni: err2 => err


posting/writer_test.go, line 93 at r16 (raw file):

				k := typ.key
				v := typ.value
				w.SetAt(k, v, BitSchemaPosting, 1)

I think SetAt returns an error. We should check the error here.


posting/writer_test.go, line 118 at r16 (raw file):

			for _, typ := range KVList {
				e := &badger.Entry{Key: typ.key, Value: typ.value}
				wb.SetEntryAt(e, 1)

SetEntryAt also returns error. We should check the error.


x/x.go, line 854 at r16 (raw file):

//WriteBatchWriter exposes Write API batch writer.
func WriteBatchWriter(writer *badger.WriteBatch, kvList *bpb.KVList) error {

We already have a WriteBatchWriter defined in snapshot.go. Please call it something else. You could use the name suggested by @gja in the comments above.

Copy link
Contributor Author

@all-seeing-code all-seeing-code left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 7 files reviewed, 20 unresolved discussions (waiting on @anurags92, @gja, @jarifibrahim, @manishrjain, and @martinmr)


go.mod, line 72 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

remove comment

Done.


posting/index.go, line 576 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

remove comments

Done.


posting/index.go, line 629 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

could we please benchmark creating tmpWriter within Send() then flushing at the end of Send(), v/s creating one tmpWriter and calling SetEntryAt within the transaction. (since SetEntryAt requires locks, is it better to lock when flushing instead)

Updated writer_test.go to run in multi-thread way.


posting/index.go, line 631 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

add a comment on what this is doing

Would have to figure out what is meta used for. Have wrapped this in one function in x package. Would update that once I have more information.


posting/index.go, line 634 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

double check that SetEntryAt is thread safe

It internally calls SetEntry which is thread safe. (badger/batch.go Line#103)


posting/index.go, line 638 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

remove this comment

Done.


posting/index.go, line 691 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

same logic as above. Is it faster to create the writer withing the send()

Replied above. Now that I think of I feel we can actually benchmark re-indexing and compare the two approaches on 21mil dataset. Working on it.


posting/writer_test.go, line 1 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

Can we move this file somewhere else? Or can we create a separate directory for experimental benchmarks or something? This isn't testing production code

Sure, let me find out what's the best place. Although I think @manishrjain prefers to keep the benchmark files in the test suite itself. Please note that this is does not run any unit tests.


posting/writer_test.go, line 90 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

for i from 1 to 5
txnWriter = newWriter
for entry from 1 to 10000
...
commit
end

Done.


posting/writer_test.go, line 46 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Typo: writeInBagder => writeInBadger

Done.


posting/writer_test.go, line 51 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Check error, please

Done.


posting/writer_test.go, line 57 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Typo here as well

Done.


posting/writer_test.go, line 93 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

I think SetAt returns an error. We should check the error here.

Done.


posting/writer_test.go, line 118 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

SetEntryAt also returns error. We should check the error.

Done.


worker/draft.go, line 714 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

also on a fixme, please add date.

Done.


worker/snapshot.go, line 41 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

writeBatchWriter

Kept the name of the struct as the same. I have changed the name of the function below. I feel that makes more sense. Open to suggestions.


worker/snapshot.go, line 49 at r15 (raw file):

Previously, gja (Tejas Dinkar) wrote…

i see this twice, can we move this to

bulkAddKvsToBatch(nwb.wb, kvs.Kv)

Done.


x/x.go, line 854 at r16 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We already have a WriteBatchWriter defined in snapshot.go. Please call it something else. You could use the name suggested by @gja in the comments above.

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Just a few things.

Reviewed 2 of 6 files at r16, 5 of 5 files at r17.
Reviewable status: all files reviewed, 24 unresolved discussions (waiting on @anurags92, @gja, @jarifibrahim, and @manishrjain)


posting/index.go, line 677 at r17 (raw file):

			// which occurred before this schema mutation.
			if err := writer.SetEntryAt(
				(&badger.Entry{Key: kv.Key,

Create the entry first, and then set it here.


worker/snapshot.go, line 38 at r17 (raw file):

)

type badgerWriter interface {

This can be removed now.


x/x.go, line 854 at r16 (raw file):

Previously, anurags92 (Anurag) wrote…

Done.

This should be moved to Badger, and made a func in WriteBatch.

Write(...), or WriteList.


x/x.go, line 853 at r17 (raw file):

}

//BulkWriteKVsBatchWriter exposes Write API batch writer.

Space after comment.


x/x.go, line 859 at r17 (raw file):

		//
		if len(kv.UserMeta) > 0 {
			e.UserMeta = kv.UserMeta[0]

This is how Badger does it. We have to use a byte slice because protobuf does not have a single byte data type. It only has a repeated bytes.

Copy link
Contributor Author

@all-seeing-code all-seeing-code left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 7 files reviewed, 24 unresolved discussions (waiting on @anurags92, @gja, @jarifibrahim, and @manishrjain)


posting/index.go, line 677 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Create the entry first, and then set it here.

Done.


worker/snapshot.go, line 41 at r15 (raw file):

Previously, anurags92 (Anurag) wrote…

Kept the name of the struct as the same. I have changed the name of the function below. I feel that makes more sense. Open to suggestions.

Not applicable with new Write API in Badger


worker/snapshot.go, line 49 at r15 (raw file):

Previously, anurags92 (Anurag) wrote…

Done.

Not applicable with new Write API in Badger


worker/snapshot.go, line 38 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This can be removed now.

Done.


x/x.go, line 854 at r16 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This should be moved to Badger, and made a func in WriteBatch.

Write(...), or WriteList.

Done. Badger PR: https://github.com/dgraph-io/badger/pull/1321/files


x/x.go, line 853 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Space after comment.

Don't need this function now.


x/x.go, line 859 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This is how Badger does it. We have to use a byte slice because protobuf does not have a single byte data type. It only has a repeated bytes.

Ok.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r18.
Reviewable status: all files reviewed, 25 unresolved discussions (waiting on @anurags92, @gja, @jarifibrahim, and @manishrjain)


posting/writer_test.go, line 1 at r15 (raw file):

Previously, anurags92 (Anurag) wrote…

Sure, let me find out what's the best place. Although I think @manishrjain prefers to keep the benchmark files in the test suite itself. Please note that this is does not run any unit tests.

In Go, you'd put the tests and benchmarks in the same package. As Anurag mentioned, they don't get run unless you specifically run them.


worker/draft.go, line 714 at r15 (raw file):

Previously, anurags92 (Anurag) wrote…

Done.

Git Blame can show you the date. I don't think there's a need to add date explicitly.


x/x.go, line 44 at r18 (raw file):

	"github.com/dgraph-io/dgo/v200"
	"github.com/dgraph-io/dgo/v200/protos/api"
	"github.com/golang/glog"

Avoid any unintentional change.

@all-seeing-code

This comment has been minimized.

@all-seeing-code all-seeing-code merged commit 20494f9 into master May 6, 2020
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020
* Replace txnWriter with WriteBatch
* Add benchmark file writer_test.go

Co-authored-by: Ibrahim Jarif <[email protected]>
Co-authored-by: Ibrahim Jarif <[email protected]>
@all-seeing-code all-seeing-code deleted the anurags92/replace-txnwrite branch September 24, 2020 11:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants