-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OPTIMIZATION] Various optimizations for facets filter queries #4923
Conversation
, do it only once in preprocessFilter function. However, don't do it for all types. do for only few types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ashish-goswami and @manishrjain)
query/outputnode.go, line 145 at r1 (raw file):
return []byte(fmt.Sprintf("%q", v.Value)), nil case types.IntID: switch v.Value.(type) {
You can store the value in val here and then use that.
query/outputnode.go, line 147 at r1 (raw file):
switch v.Value.(type) { case int64: // In types.Convert(), we always convert to int64 in types.Convert(). return []byte(strconv.Itoa(int(v.Value.(int64)))), nil
stconv.FormatInt
can be used here
types/conversion.go, line 58 at r1 (raw file):
*res = data case StringID, DefaultID: *res = *(*string)(unsafe.Pointer(&data))
Add a comment that we never modify the posting.Val in place, hence this should be safe to do.
worker/task.go, line 1906 at r1 (raw file):
v, ok := ftree.function.typesToValMap[fVal.Tid] if !ok { // Not found in map and hence convert is here.
convert it here
worker/task.go, line 1998 at r1 (raw file):
val types.Val fnType FuncType typesToValMap map[types.TypeID]types.Val
Add a comment here that we convert the value of the function arg to common types and store it here to allow fast comparison later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pawanrawal)
types/conversion.go, line 58 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Add a comment that we never modify the posting.Val in place, hence this should be safe to do.
How much performance does it really improve? Can we live without it? If the performance improvement is not significant, we should avoid using unsafe.
worker/task.go, line 1929 at r1 (raw file):
} res := make([]bool, 0, 2)
Add a comment as to why size 2
worker/task.go, line 1998 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Add a comment here that we convert the value of the function arg to common types and store it here to allow fast comparison later.
We could use an array here instead? We could define a type at the end of the list of types to figure out the max possible value for types.TypeID and use that as the size of the array.
Could just call it typesToVal
.
worker/task.go, line 2029 at r1 (raw file):
case compareAttrFn: ftree.function.val = types.Val{Tid: types.StringID, Value: []byte(tree.Func.Args[0])} ftree.function.typesToValMap = make(map[types.TypeID]types.Val)
If we decide to use map, we could preallocate the size.
worker/task.go, line 2030 at r1 (raw file):
ftree.function.val = types.Val{Tid: types.StringID, Value: []byte(tree.Func.Args[0])} ftree.function.typesToValMap = make(map[types.TypeID]types.Val) typeIDs := []types.TypeID{types.StringID, types.IntID, types.FloatID,
define an array instead using the syntax [...]{1, 2, 3}
and could move it outside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pawanrawal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 4 files reviewed, 8 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
query/outputnode.go, line 145 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
You can store the value in val here and then use that.
Done.
query/outputnode.go, line 147 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
stconv.FormatInt
can be used here
strconv.Itoa()
calls it internally.
types/conversion.go, line 58 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
How much performance does it really improve? Can we live without it? If the performance improvement is not significant, we should avoid using unsafe.
I also wanted to avoid this change. But we call types.Convert very frequently in our queries. Also most of the time conversion are to strings. Without this change profiles were showing close to 1 sec for this line. It was gone after this change.
worker/task.go, line 1906 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
convert it here
Done.
worker/task.go, line 1929 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
Add a comment as to why size 2
Done.
worker/task.go, line 2029 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
If we decide to use map, we could preallocate the size.
Done.
worker/task.go, line 2030 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
define an array instead using the syntax
[...]{1, 2, 3}
and could move it outside.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd run this over a two group cluster, to ensure that we're not breaking any thing when doing tasks over the network. You could move the predicate to the other group, then hit the query to the first group, so it would have to transmit the query over -- that way, we can ensure that the results are sane.
Reviewed 1 of 4 files at r1, 3 of 3 files at r2.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @mangalaman93, and @pawanrawal)
query/outputnode.go, line 147 at r1 (raw file):
Previously, ashish-goswami (Ashish Goswami) wrote…
strconv.Itoa()
calls it internally.
I'd say use FormatInt because it takes int64. So, you don't lose precision by converting it to int (on 32 bit machines). Also, might not need the two cases.
[]byte(Format(int64(num), 10)
query/outputnode.go, line 133 at r2 (raw file):
var ( boolTrueBytes = []byte("true")
Remove the suffix Bytes.
query/outputnode.go, line 149 at r2 (raw file):
return []byte(strconv.Itoa(int(num))), nil default: return []byte(strconv.Itoa(num.(int))), nil
This can be dangerous. If it is uint32, then num.(int) might panic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ran following mutation on 6 alpha, 3 zero node cluster:
{
set {
_:1 <name> "alice" (nick=true) .
_:1 <age> "27" (real=true) .
_:2 <name> "bob" (nick=false) .
_:2 <age> "28" (real=false) .
}
}
Ensured that name
belongs to group1
and age
belongs to group2
.
Ran below query:
{
q(func: eq(name, "bob")) {
name @facets(eq(nick,false))
age @facets(eq(real,true))
}
}
Also tried changing params values. Getting correct result every time.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
query/outputnode.go, line 147 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I'd say use FormatInt because it takes int64. So, you don't lose precision by converting it to int (on 32 bit machines). Also, might not need the two cases.
[]byte(Format(int64(num), 10)
Done. Two cases are still required so that I get exact type inside a case. If I combine both like,
case int64, int:
I get interface type inside.
query/outputnode.go, line 133 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Remove the suffix Bytes.
Done.
query/outputnode.go, line 149 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This can be dangerous. If it is uint32, then num.(int) might panic?
Yes this panics I pass it uint32. Since int and int64 are most common types for IntID, FormatInt is only called for them. For rest we are still using fmt.Spintf.
This PR tries to optimize queries with facets filter. It has following changes:
applyFacetsTree
function by avoiding some multiple call toparseFuncTypeHelper
andconversion calls to
ftree.function.val
types.Convert
valToBytes
This change is