-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(multi-tenancy): add support for multi-tenancy in bulk loader #7399
feat(multi-tenancy): add support for multi-tenancy in bulk loader #7399
Conversation
chunker/rdf_parser.go
Outdated
if err != nil { | ||
return rnq, err | ||
} | ||
rnq.Namespace = ns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a separate field in NQuad just for the namespace? We can use the label field?
The label is string
and Namespace is uint64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation, the user cannot pass non-namespace strings in the label.
We can allow it in the following way:
The user tells that his RDFs contain a label and also tell us to load the data into a specific namespace. Then while parsing the RDFs we will not treat the label as namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whole bunch of TODOs, so let's resolve them before merging. Leave it to @jarifibrahim for final approval.
Reviewed 16 of 17 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @NamanJain8, @pawanrawal, and @vvbalaji-dgraph)
chunker/json_parser.go, line 464 at r2 (raw file):
// this int64 case is needed for FastParseJSON, which doesn't use json.Number case int64: namespace = uint64(ns)
Don't make it a number in JSON. Use hex strings. Add a string part. Do strip spaces and use strconv.ParseUint.
chunker/rdf_parser.go, line 231 at r2 (raw file):
// TODO(Naman): Ensure the label contains valid namespace. // Append the namespace to the predicate before returning NQuad. if rnq.Label != "" {
why not rename label to namespace?
chunker/rdf_parser.go, line 232 at r2 (raw file):
// Append the namespace to the predicate before returning NQuad. if rnq.Label != "" { ns, err := strconv.ParseUint(rnq.Label, 0, 64)
do strip spaces.
dgraph/cmd/bulk/merge_shards.go, line 63 at r2 (raw file):
// Put the first map shard in the first reduce shard since it contains all the reserved // predicates. // TODO(Naman): Why do we handle first shard differently?
This is because we want all reserved stuff to be in group 1. You can add this as a comment.
dgraph/cmd/bulk/schema.go, line 92 at r2 (raw file):
// checkAndSetInitialSchema initializes the schema for namespace if it does not already exist. func (s *schemaStore) checkAndSetInitialSchema(namespace uint64) { if _, ok := s.namespaces.Load(namespace); !ok {
if ok; return
lock
if ok return
do stuff.
dgraph/cmd/debug/run.go, line 574 at r2 (raw file):
} ns, attr := x.ParseNamespaceAttr(pk.Attr) x.Check2(buf.WriteString(fmt.Sprintf(" ns: 0x%x ", ns)))
%#x would give you the 0x.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 15 files reviewed, 11 unresolved discussions (waiting on @manishrjain, @NamanJain8, @pawanrawal, and @vvbalaji-dgraph)
go.mod, line 5 at r3 (raw file):
go 1.12 replace github.com/dgraph-io/dgo/v200 => /home/algod/go/src/github.com/dgraph-io/dgo
Remove this, please.
dgraph/cmd/bulk/loader.go, line 173 at r3 (raw file):
func (ld *loader) leaseNamespaces() { var maxNs uint64 ld.namespaces.Range(func(key, value interface{}) bool {
Add a comment.
dgraph/cmd/bulk/loader.go, line 290 at r3 (raw file):
// Send the graphql triples // TODO(Naman): Handle this.
Put more details here.
worker/export_test.go, line 76 at r3 (raw file):
func populateGraphExport(t *testing.T) { rdfEdges := []string{ `<1> <friend> <5> <author0> .`,
Add namespace over here and then validate it in a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 26 files reviewed, 11 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, @NamanJain8, @pawanrawal, and @vvbalaji-dgraph)
go.mod, line 5 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Remove this, please.
Done.
chunker/json_parser.go, line 464 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don't make it a number in JSON. Use hex strings. Add a string part. Do strip spaces and use strconv.ParseUint.
Done.
chunker/rdf_parser.go, line 231 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
why not rename label to namespace?
We have added a new field Namespace
of type uint64
in the Nquad. If we had used string
type, we will have to do strconv.ParseUint(ns...)
at a bunch of places where we need namespace for the same Nquad.
Moreover, the Label
is not used by dgraph and just adds up an extra field in pb.Posting
.
chunker/rdf_parser.go, line 232 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
do strip spaces.
Done.
dgraph/cmd/bulk/loader.go, line 173 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add a comment.
Done.
dgraph/cmd/bulk/loader.go, line 290 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Put more details here.
Done.
dgraph/cmd/bulk/merge_shards.go, line 63 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This is because we want all reserved stuff to be in group 1. You can add this as a comment.
Done.
dgraph/cmd/bulk/schema.go, line 92 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
if ok; return
lock
if ok return
do stuff.
Done.
dgraph/cmd/debug/run.go, line 574 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
%#x would give you the 0x.
Done.
This PR adds bulk loader support in multi-tenancy.
Adds a flag --force-namespace in bulk loader which makes it load the data into a specified namespace. If this flag is not set, then it preserves the namespace from the schema file.
This change is