-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved latency in live loader using conflict resolution at client level #4362
Conversation
dgraph/cmd/live/batch.go
Outdated
return true | ||
} | ||
|
||
func (l *loader) removeMap(req api.Mutation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (*loader).removeMap
is unused (from unused
)
dgraph/cmd/live/run.go
Outdated
@@ -190,15 +201,24 @@ func (l *loader) processLoadFile(ctx context.Context, rd *bufio.Reader, ck chunk | |||
if len(nqs) == 0 { | |||
continue | |||
} | |||
remainingNqs := make([]*api.NQuad, 0, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S1019: should use make([]*api.NQuad, 0) instead (from gosimple
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a map[uint64]struct{}
. When a txn starts, it would acquire lock, loop over the keys to check. If they all check out, then do the writes, then release lock. When a txn is done, then delete the keys.
If a req has conflict, block and loop repeatedly until it can proceed.
After further discussion, we decided to have a local buffer of size N in each goroutine. We can then test out N = 1 and N = inf, to see how the live loader performs.
Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @harshil-goel)
dgraph/cmd/live/batch.go, line 78 at r2 (raw file):
threadId uint64 currentUIDS map[string]uint64
Keys can be uint64 and vals can be bool, I think.
dgraph/cmd/live/batch.go, line 170 at r2 (raw file):
func (l *loader) loadOrStore(id string, threadId uint64) bool { if val, ok := l.currentUIDS[id]; !(ok && val != threadId) {
Don't need thread id thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, 11 unresolved discussions (waiting on @golangcibot and @harshil-goel)
dgraph/cmd/live/batch.go, line 212 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
func
(*loader).removeMap
is unused (fromunused
)
removeConflictKeys
dgraph/cmd/live/batch.go, line 78 at r4 (raw file):
start time.Time currentUIDS map[uint64]struct{}
conflicts
dgraph/cmd/live/batch.go, line 139 at r4 (raw file):
atomic.AddUint64(&l.nquads, uint64(len(req.Set))) atomic.AddUint64(&l.txns, 1) l.removeMap(req)
l.deregister(req)
dgraph/cmd/live/batch.go, line 139 at r4 (raw file):
atomic.AddUint64(&l.nquads, uint64(len(req.Set))) atomic.AddUint64(&l.txns, 1) l.removeMap(req)
Do a defer upfront?
dgraph/cmd/live/batch.go, line 169 at r4 (raw file):
} func (l *loader) writeMap(req *api.Mutation) bool {
addConflictKeys
dgraph/cmd/live/batch.go, line 176 at r4 (raw file):
for _, i := range req.Set { objectKey := farm.Fingerprint64([]byte(i.ObjectId))
Conflict keys look at pred, subject. And then other things to generate more keys.
Better to generate the list of conflict keys upfront, before acquiring locks. Then acquire locks, do the checks, etc. Can even make it part of the request.
dgraph/cmd/live/batch.go, line 184 at r4 (raw file):
mSlice = append(mSlice, objectKey) if !reversePreds[i.Predicate] {
No reverse pred stuff. Use conflict keys.
dgraph/cmd/live/run.go, line 145 at r4 (raw file):
for _, schemaLine := range strings.Split(string(b), "\n") { if !strings.Contains(schemaLine, "@reverse") {
Remove all this special handling.
72f4463
to
6cc0cbd
Compare
dgraph/cmd/live/batch.go
Outdated
@@ -159,12 +167,99 @@ func (l *loader) request(req api.Mutation, reqNum uint64) { | |||
go l.infinitelyRetry(req, reqNum) | |||
} | |||
|
|||
func (l *loader) print(req api.Mutation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (*loader).print
is unused (from unused
)
dgraph/cmd/live/run.go
Outdated
|
||
// TLS configuration | ||
x.RegisterClientTLSFlags(flag) | ||
} | ||
|
||
// processSchemaFile process schema for a given gz file. | ||
func processSchemaFile(ctx context.Context, file string, dgraphClient *dgo.Dgraph) error { | ||
func processSchemaFile(ctx context.Context, file string, dgraphClient *dgo.Dgraph) (*schema.ParsedSchema, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line is 114 characters (from lll
)
dgraph/cmd/live/run.go
Outdated
op := &api.Operation{} | ||
op.Schema = string(b) | ||
return dgraphClient.Alter(ctx, op) | ||
sch, err := schema.Parse(op.Schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ineffectual assignment to err
(from ineffassign
)
fmt.Println("here") | ||
} | ||
txn := dgraphClient.NewTxn() | ||
defer txn.Discard(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of txn.Discard
is not checked (from errcheck
)
dgraph/cmd/live/run.go
Outdated
} | ||
|
||
var sch LiveSchema | ||
json.Unmarshal(res.GetJson(), &sch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of json.Unmarshal
is not checked (from errcheck
)
dgraph/cmd/live/run.go
Outdated
bufferSize int | ||
} | ||
|
||
type LivePredicate struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type name will be used as live.LivePredicate by other packages, and that stutters; consider calling this Predicate (from golint
)
dgraph/cmd/live/run.go
Outdated
ValueType types.TypeID | ||
} | ||
|
||
type LiveSchema struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type name will be used as live.LiveSchema by other packages, and that stutters; consider calling this Schema (from golint
)
…-io/dgraph into harshil-goel/remove-conflict-uids
This reverts commit 9b4af50.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r8, 3 of 5 files at r9, 2 of 6 files at r11.
Reviewable status: all files reviewed, 32 unresolved discussions (waiting on @golangcibot, @harshil-goel, and @martinmr)
dgraph/cmd/live/batch.go, line 55 at r11 (raw file):
PrintCounters bool MaxRetries uint32 bufferSize int
Add a comment about how this is different from Size.
dgraph/cmd/live/batch.go, line 175 at r11 (raw file):
} func typeValFrom(val *api.Value) (types.Val, error) {
Just expose this function in the gql package and reuse from there.
dgraph/cmd/live/batch.go, line 255 at r11 (raw file):
} func (l *loader) getConflictKeys(nq *api.NQuad) []uint64 {
Would have liked this function to be shared between server and client but looks like that is not easily possible?
dgraph/cmd/live/batch.go, line 320 at r11 (raw file):
func (l *loader) getConflicts(req *api.Mutation) []uint64 { keys := make([]uint64, 0, len(req.Set)) for _, i := range req.Set {
Call it nquad and not i
, i is for index.
dgraph/cmd/live/batch.go, line 327 at r11 (raw file):
func (l *loader) addConflictKeys(req *api.Mutation) bool { mSlice := l.getConflicts(req)
Whats mSlice
? Call it keys
or conflictKeys
.
dgraph/cmd/live/batch.go, line 332 at r11 (raw file):
defer l.uidsLock.Unlock() for _, val := range mSlice {
for _, key := range conflictKeys {
dgraph/cmd/live/batch.go, line 346 at r11 (raw file):
func (l *loader) deregister(req *api.Mutation) { mSlice := l.getConflicts(req)
This calculation would happen multiple times, could we cache this info somewhere after the first time its calculated?
dgraph/cmd/live/batch.go, line 371 at r11 (raw file):
} for len(buffer) >= l.opts.bufferSize-1 {
Why have -1
when you have >=
?
dgraph/cmd/live/batch.go, line 387 at r11 (raw file):
for len(buffer) > 0 { i := 0
All the code below this is repeat of the code above, please store it in a sub-function within makeRequests
and call that at both these places.
dgraph/cmd/live/run.go, line 151 at r11 (raw file):
md.Append("auth-token", opt.authToken) ctx = metadata.NewOutgoingContext(ctx, md) fmt.Println("here")
remove this
dgraph/cmd/live/run.go, line 259 at r11 (raw file):
buffer = append(buffer, nqs...) if len(buffer) >= opt.bufferSize*opt.batchSize {
if len < ... {
continue
}
// rest of the logic
dgraph/cmd/live/run.go, line 259 at r11 (raw file):
buffer = append(buffer, nqs...) if len(buffer) >= opt.bufferSize*opt.batchSize {
Is the first value supposed to be opt.bufferSize
or the concurrency?
dgraph/cmd/live/run.go, line 265 at r11 (raw file):
t := func(a *predicate) int { if a == nil {
The idea here just seems to be to have the predicates with Count at last. Could we simplified to.
if a != nil && a.Count {
return 1
}
return 0
Also add a comment talking about what you are doing here, that is keeping predicates with count index after those without it. And then also sorting predicates by their name.
dgraph/cmd/live/run.go, line 295 at r11 (raw file):
} for len(buffer) > 0 {
again repeat logic, can be put in a function called drain
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address @pawanrawal ’s comments. I can review this tomorrow in the office.
Reviewable status: all files reviewed, 32 unresolved discussions (waiting on @golangcibot, @harshil-goel, and @martinmr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 32 unresolved discussions (waiting on @golangcibot, @harshil-goel, @manishrjain, @martinmr, and @pawanrawal)
dgraph/cmd/live/batch.go, line 212 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
removeConflictKeys
Done.
dgraph/cmd/live/batch.go, line 78 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Keys can be uint64 and vals can be bool, I think.
Done.
dgraph/cmd/live/batch.go, line 170 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don't need thread id thing.
Done.
dgraph/cmd/live/batch.go, line 78 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
conflicts
Done.
dgraph/cmd/live/batch.go, line 139 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
l.deregister(req)
Done.
dgraph/cmd/live/batch.go, line 139 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Do a defer upfront?
Done.
dgraph/cmd/live/batch.go, line 169 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
addConflictKeys
Done.
dgraph/cmd/live/batch.go, line 176 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Conflict keys look at pred, subject. And then other things to generate more keys.
Better to generate the list of conflict keys upfront, before acquiring locks. Then acquire locks, do the checks, etc. Can even make it part of the request.
Done.
dgraph/cmd/live/batch.go, line 184 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
No reverse pred stuff. Use conflict keys.
Done.
dgraph/cmd/live/batch.go, line 170 at r6 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
func
(*loader).print
is unused (fromunused
)
Done.
dgraph/cmd/live/batch.go, line 55 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Add a comment about how this is different from Size.
Done.
dgraph/cmd/live/batch.go, line 255 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Would have liked this function to be shared between server and client but looks like that is not easily possible?
A lot of the data structures are different and since there are no generics, we will have to rewrite those functions.
dgraph/cmd/live/batch.go, line 320 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Call it nquad and not
i
, i is for index.
Done.
dgraph/cmd/live/batch.go, line 327 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Whats
mSlice
? Call itkeys
orconflictKeys
.
Done.
dgraph/cmd/live/batch.go, line 332 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
for _, key := range conflictKeys {
Done.
dgraph/cmd/live/batch.go, line 371 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Why have
-1
when you have>=
?
Done.
dgraph/cmd/live/batch.go, line 387 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
All the code below this is repeat of the code above, please store it in a sub-function within
makeRequests
and call that at both these places.
Done.
dgraph/cmd/live/run.go, line 204 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
S1019: should use make([]*api.NQuad, 0) instead (from
gosimple
)
Done.
dgraph/cmd/live/run.go, line 145 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Remove all this special handling.
Done.
dgraph/cmd/live/run.go, line 118 at r7 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
line is 114 characters (from
lll
)
Done.
dgraph/cmd/live/run.go, line 155 at r7 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
ineffectual assignment to
err
(fromineffassign
)
Done.
dgraph/cmd/live/run.go, line 70 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
type name will be used as live.LivePredicate by other packages, and that stutters; consider calling this Predicate (from
golint
)
Done.
dgraph/cmd/live/run.go, line 82 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
type name will be used as live.LiveSchema by other packages, and that stutters; consider calling this Schema (from
golint
)
Done.
dgraph/cmd/live/run.go, line 151 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of
txn.Discard
is not checked (fromerrcheck
)
Done.
dgraph/cmd/live/run.go, line 159 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of
json.Unmarshal
is not checked (fromerrcheck
)
Done.
dgraph/cmd/live/run.go, line 151 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
remove this
Done.
dgraph/cmd/live/run.go, line 259 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
if len < ... {
continue
}// rest of the logic
Done.
dgraph/cmd/live/run.go, line 259 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Is the first value supposed to be
opt.bufferSize
or the concurrency?
It is supposed to be opt.bufferSize
. It's just the count of the number of requests that a thread can store in memory.
dgraph/cmd/live/run.go, line 265 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
The idea here just seems to be to have the predicates with Count at last. Could we simplified to.
if a != nil && a.Count { return 1 } return 0
Also add a comment talking about what you are doing here, that is keeping predicates with count index after those without it. And then also sorting predicates by their name.
Done.
dgraph/cmd/live/run.go, line 295 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
again repeat logic, can be put in a function called
drain
or something.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Add a lot of comments. This PR's description should be pretty elaborate about what and why and how. Also, mention the benchmarks that you ran and how things are improved.
Got a bunch of comments to address, so do those. Also, wait for @pawanrawal 's LGTM.
Reviewed 1 of 3 files at r8, 3 of 5 files at r9, 3 of 3 files at r12.
Reviewable status: all files reviewed, 52 unresolved discussions (waiting on @golangcibot, @harshil-goel, @manishrjain, @martinmr, and @pawanrawal)
dgraph/cmd/live/batch.go, line 55 at r11 (raw file):
Previously, harshil-goel (Harshil Goel) wrote…
Done.
could this be a const? If so, can be declared outside.
dgraph/cmd/live/batch.go, line 346 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
This calculation would happen multiple times, could we cache this info somewhere after the first time its calculated?
Don't calculate again. Store the keys in a derived struct. All refs to api.Mutation, can now be a request object or something.
dgraph/cmd/live/batch.go, line 85 at r12 (raw file):
start time.Time conflicts map[uint64]bool
Maybe make it a struct{}
dgraph/cmd/live/batch.go, line 91 at r12 (raw file):
reqs chan api.Mutation zeroconn *grpc.ClientConn sch *schema
schema
dgraph/cmd/live/batch.go, line 216 at r12 (raw file):
Facets: nq.Facets, } val, err := getTypeVal(nq.ObjectValue)
if err != nil { return ... }
// do something.
dgraph/cmd/live/batch.go, line 237 at r12 (raw file):
} func (l *loader) getConflictKeys(nq *api.NQuad) []uint64 {
return error.
dgraph/cmd/live/batch.go, line 237 at r12 (raw file):
} func (l *loader) getConflictKeys(nq *api.NQuad) []uint64 {
conflictKeysForNQuad(nq)
dgraph/cmd/live/batch.go, line 238 at r12 (raw file):
func (l *loader) getConflictKeys(nq *api.NQuad) []uint64 { sid, _ := strconv.ParseUint(nq.Subject, 0, 64)
error? If so, at least capture it and log it.
dgraph/cmd/live/batch.go, line 253 at r12 (raw file):
keys := make([]uint64, 0, 1)
extra vertical space.
dgraph/cmd/live/batch.go, line 267 at r12 (raw file):
if pred.Reverse { oi, _ := strconv.ParseUint(nq.ObjectId, 0, 64)
oi, err := ..
if err == nil { keys = append(...) }
dgraph/cmd/live/batch.go, line 271 at r12 (raw file):
} if nq.ObjectValue == nil || !(pred.Count || pred.Index) {
Put a comment at the top, referring to the code from which this logic is inspired.
dgraph/cmd/live/batch.go, line 275 at r12 (raw file):
} for _, tokerName := range pred.Tokenizer {
tokName
dgraph/cmd/live/batch.go, line 276 at r12 (raw file):
for _, tokerName := range pred.Tokenizer { toker, ok := tok.GetTokenizer(tokerName)
tok
dgraph/cmd/live/batch.go, line 278 at r12 (raw file):
toker, ok := tok.GetTokenizer(tokerName) if !ok { fmt.Printf("unknown tokenizer %q", tokerName)
continue?
Also, are we using fmt.Printfs or are we using glog.Infofs.
dgraph/cmd/live/batch.go, line 287 at r12 (raw file):
schemaVal, err := types.Convert(storageVal, types.TypeID(pred.ValueType)) x.Check(err)
This can trigger. Should not be a check failure.
dgraph/cmd/live/batch.go, line 289 at r12 (raw file):
x.Check(err) toks, err := tok.BuildTokens(schemaVal.Value, tok.GetLangTokenizer(toker, nq.Lang)) x.Check(err)
just return err
dgraph/cmd/live/batch.go, line 300 at r12 (raw file):
} func (l *loader) getConflicts(req *api.Mutation) []uint64 {
conflictKeysForRequest()
dgraph/cmd/live/batch.go, line 304 at r12 (raw file):
for _, nq := range req.Set { keys = append(keys, l.getConflictKeys(nq)...) }
Put a comment saying that live loader only needs to look at sets, not deletes.
dgraph/cmd/live/batch.go, line 321 at r12 (raw file):
for _, key := range keys { l.conflicts[key] = true
= struct{}{}
dgraph/cmd/live/batch.go, line 345 at r12 (raw file):
buffer := make([]api.Mutation, 0, l.opts.bufferSize)
Remove vert space.
dgraph/cmd/live/batch.go, line 347 at r12 (raw file):
drain := func(min int) { for len(buffer) > min {
for len(buffer) > maxSize
maxSize = l.opts.bufferSize
dgraph/cmd/live/batch.go, line 349 at r12 (raw file):
for len(buffer) > min { i := 0 for _, mu := range buffer {
for _, req := range buffer
dgraph/cmd/live/batch.go, line 350 at r12 (raw file):
i := 0 for _, mu := range buffer { if l.addConflictKeys(&mu) {
if !l.addConflictKeys() { buffer[i] = mu; i++; continue }
Execute req. req will no longer be part of the buffer.
dgraph/cmd/live/batch.go, line 351 at r12 (raw file):
for _, mu := range buffer { if l.addConflictKeys(&mu) { reqNum := atomic.AddUint64(&l.reqNum, 1)
Move it to l.request.
dgraph/cmd/live/batch.go, line 355 at r12 (raw file):
continue } buffer[i] = mu
Add a comment here explaining that there's a shift going on.
dgraph/cmd/live/batch.go, line 365 at r12 (raw file):
for req := range l.reqs { if l.addConflictKeys(&req) { reqNum := atomic.AddUint64(&l.reqNum, 1)
Doubt we need to pass reqNum to l.request.
dgraph/cmd/live/run.go, line 151 at r11 (raw file):
Previously, harshil-goel (Harshil Goel) wrote…
Done.
Remove this auth token business. The dgo client can deal with it on its own.
dgraph/cmd/live/run.go, line 263 at r12 (raw file):
return t(iPred) < t(jPred) }
vert space
dgraph/cmd/live/run.go, line 268 at r12 (raw file):
for len(buffer) > 0 {
vert space
dgraph/cmd/live/run.go, line 269 at r12 (raw file):
for len(buffer) > 0 { f := opt.batchSize
sz
dgraph/cmd/live/run.go, line 293 at r12 (raw file):
buffer = append(buffer, nqs...) if len(buffer) < opt.bufferSize*opt.batchSize {
All of this needs a lot of explanation for what is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 52 unresolved discussions (waiting on @golangcibot, @manishrjain, @martinmr, and @pawanrawal)
dgraph/cmd/live/batch.go, line 55 at r11 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
could this be a const? If so, can be declared outside.
This is a user-defined variable, with default = 100
dgraph/cmd/live/batch.go, line 175 at r11 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Just expose this function in the gql package and reuse from there.
Done.
dgraph/cmd/live/batch.go, line 346 at r11 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don't calculate again. Store the keys in a derived struct. All refs to api.Mutation, can now be a request object or something.
Done.
dgraph/cmd/live/batch.go, line 85 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Maybe make it a struct{}
Done.
dgraph/cmd/live/batch.go, line 91 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
schema
Done.
dgraph/cmd/live/batch.go, line 216 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
if err != nil { return ... }
// do something.
Done.
dgraph/cmd/live/batch.go, line 237 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
return error.
Done.
dgraph/cmd/live/batch.go, line 237 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
conflictKeysForNQuad(nq)
Done.
dgraph/cmd/live/batch.go, line 238 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
error? If so, at least capture it and log it.
Done.
dgraph/cmd/live/batch.go, line 253 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
extra vertical space.
Done.
dgraph/cmd/live/batch.go, line 267 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
oi, err := ..
if err == nil { keys = append(...) }
Done.
dgraph/cmd/live/batch.go, line 271 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Put a comment at the top, referring to the code from which this logic is inspired.
Done.
dgraph/cmd/live/batch.go, line 275 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
tokName
Done.
dgraph/cmd/live/batch.go, line 276 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
tok
Done.
dgraph/cmd/live/batch.go, line 278 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
continue?
Also, are we using fmt.Printfs or are we using glog.Infofs.
fmt.Printf
dgraph/cmd/live/batch.go, line 287 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This can trigger. Should not be a check failure.
Done.
dgraph/cmd/live/batch.go, line 289 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
just return err
Done.
dgraph/cmd/live/batch.go, line 300 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
conflictKeysForRequest()
Done.
dgraph/cmd/live/batch.go, line 304 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Put a comment saying that live loader only needs to look at sets, not deletes.
Done.
dgraph/cmd/live/batch.go, line 321 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
= struct{}{}
Done.
dgraph/cmd/live/batch.go, line 345 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Remove vert space.
Done.
dgraph/cmd/live/batch.go, line 347 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
for len(buffer) > maxSize
maxSize = l.opts.bufferSize
Done.
dgraph/cmd/live/batch.go, line 349 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
for _, req := range buffer
Done.
dgraph/cmd/live/batch.go, line 350 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
if !l.addConflictKeys() { buffer[i] = mu; i++; continue }
Execute req. req will no longer be part of the buffer.
Done.
dgraph/cmd/live/batch.go, line 351 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Move it to l.request.
Done.
dgraph/cmd/live/batch.go, line 355 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Add a comment here explaining that there's a shift going on.
Done.
dgraph/cmd/live/batch.go, line 365 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Doubt we need to pass reqNum to l.request.
Done.
dgraph/cmd/live/run.go, line 263 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
vert space
Done.
dgraph/cmd/live/run.go, line 268 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
vert space
Done.
dgraph/cmd/live/run.go, line 269 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
sz
Done.
dgraph/cmd/live/run.go, line 293 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
All of this needs a lot of explanation for what is going on.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 7 files reviewed, 40 unresolved discussions (waiting on @golangcibot, @harshil-goel, @manishrjain, and @martinmr)
dgraph/cmd/live/run.go, line 249 at r14 (raw file):
// We collect opt.bufferSize requests and preprocess them. For the requests // to not confict between themself, we sort them on the basis of their predicates. // Predicates with count index will conflict among themselfs, so we keep them at
themselves
…evel (#4362) The live loader takes about 1 hour for 21 million dataset when run with -c=1. But when we run the same benchmark for c=10, the transactions are conflicting between themselves. This increases the time taken to process the nquads exponentially (~1 day). Benchmarks for master at c=1 ` Number of TXs run : 21240 Number of N-Quads processed : 21239870 Time spent : 1h8m12.999735722s N-Quads processed per second : 5190 ` This PR brings in a conflict detection at the client level, using which we can decide when to send the request. We also implemented a heuristic, that batches conflicting NQuads together so that there are less conflicts to start with. Benchmark for this PR at c = 10 ` Number of TXs run : 21240 Number of N-Quads processed : 21239870 Time spent : 23m42.685914685s N-Quads processed per second : 14936 ` <!-- Reviewable:start --> This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/dgraph-io/dgraph/4362) <!-- Reviewable:end --> (cherry picked from commit b9627f1)
The live loader takes about 1 hour for 21 million dataset when run with -c=1. But when we run the same benchmark for c=10, the transactions are conflicting between themselves. This increases the time taken to process the nquads exponentially (~1 day).
Benchmarks for master at c=1
Number of TXs run : 21240 Number of N-Quads processed : 21239870 Time spent : 1h8m12.999735722s N-Quads processed per second : 5190
This PR brings in a conflict detection at the client level, using which we can decide when to send the request. We also implemented a heuristic, that batches conflicting NQuads together so that there are less conflicts to start with.
Benchmark for this PR at c = 10
Number of TXs run : 21240 Number of N-Quads processed : 21239870 Time spent : 23m42.685914685s N-Quads processed per second : 14936
This change is