-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix count with facets filter #4751
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @ashish-goswami and @manishrjain)
query/query_facets_test.go, line 2219 at r1 (raw file):
{ "name": "Michonne", "count(friend)": 2
Also fetch cf: count(friend)
query/query_facets_test.go, line 2293 at r1 (raw file):
name alt_name count(alt_name) @facets(eq(origin, "french"))
Also retrieve count(alt_name)
without filter.
worker/task.go, line 506 at r1 (raw file):
} func retrieveValuesAndFacetsOrCount(args funcArgs, pl *posting.List, facetsTree *facetsTree,
Add some comments about in which scenarios the function only returns count and in which scenarios all of count, fcs and vals.
worker/task.go, line 596 at r1 (raw file):
} func (qs *queryState) facetsFilteredUidOrCount(q *pb.Query, pl *posting.List,
This should have a similar structure to the function above, you can return list, fcs and count from here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the way this code is structured is the best way to go about this. I think it's preferable two have different functions for the facet filter and the count (even if some code ends up being repeated) than to have a function that sometimes returns some value and sometimes some other value of a different type. From what I gather about the code, queries will either want the count or the filter so we are not duplicating the work if there's separate methods.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ashish-goswami and @manishrjain)
worker/task.go, line 394 at r1 (raw file):
return err } // If count is getting asked we don't need to populate value and facets matrix.
If count is being requested, there's no need to populate value and facets matrix
worker/task.go, line 402 at r1 (raw file):
} // Means list was empty or got nothing in filtering.
Reaching here means ...
worker/task.go, line 506 at r1 (raw file):
} func retrieveValuesAndFacetsOrCount(args funcArgs, pl *posting.List, facetsTree *facetsTree,
I am not sure if this is the right way to do structure the code. Ideally, there's a function to retrieve values and facets and another one for the count and the caller decides which one to use.
Otherwise it's hard to understand what this function is doing.
worker/task.go, line 596 at r1 (raw file):
} func (qs *queryState) facetsFilteredUidOrCount(q *pb.Query, pl *posting.List,
Same issue here. Now this function is doing two different things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 8 unresolved discussions (waiting on @manishrjain, @martinmr, and @pawanrawal)
query/query_facets_test.go, line 2219 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Also fetch
cf: count(friend)
Done.
query/query_facets_test.go, line 2293 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Also retrieve
count(alt_name)
without filter.
Done.
worker/task.go, line 394 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
If count is being requested, there's no need to populate value and facets matrix
Done.
worker/task.go, line 402 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Reaching here means ...
Done.
worker/task.go, line 506 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
I am not sure if this is the right way to do structure the code. Ideally, there's a function to retrieve values and facets and another one for the count and the caller decides which one to use.
Otherwise it's hard to understand what this function is doing.
Had a chat with @pawanrawal as well. Now I have two functions one calculating count and other values/facets.
worker/task.go, line 506 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Add some comments about in which scenarios the function only returns count and in which scenarios all of count, fcs and vals.
Now have different functions for count and facets. Each is self explanatory.
worker/task.go, line 596 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
This should have a similar structure to the function above, you can return list, fcs and count from here as well.
All functions have same structure now.
worker/task.go, line 596 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Same issue here. Now this function is doing two different things.
Again here, have two functions here, one for count and other for uids/facets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @martinmr)
worker/task.go, line 633 at r2 (raw file):
} func retrieveCountForUidPostings(args funcArgs, pl *posting.List, facetsTree *facetsTree,
This function has some unnecessary parameters, like out
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ashish-goswami and @manishrjain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @manishrjain, @martinmr, and @pawanrawal)
worker/task.go, line 633 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
This function has some unnecessary parameters, like
out
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 2 files reviewed, 2 unresolved discussions (waiting on @ashish-goswami, @martinmr, and @pawanrawal)
worker/task.go, line 675 at r3 (raw file):
} err := pl.Postings(opts, func(p *pb.Posting) error {
Find a way to consolidate all of this into one function. But, then you can pass a func to run on l.Iterate. So, there could be many funcs, creating this internal func, and passing it to a common func which has the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @martinmr, and @pawanrawal)
worker/task.go, line 510 at r4 (raw file):
} func facestFilterValuesPostingList(args funcArgs, pl *posting.List, facetsTree *facetsTree,
facets
worker/task.go, line 520 at r4 (raw file):
// 2. Attr type is list and len(q.Langs)>0. Just checking len(q.Langs)>0 is not enough, because // in case of "*", len(q.Langs)>0 && q.ExpandAll is true. if (listType && len(q.Langs) > 0) || !listType || !q.ExpandAll {
if len(q.Langs) > 0 && !q.ExpandAll {
}
worker/task.go, line 550 at r4 (raw file):
fn(p) }
pickAll := q.ExpandAll || (q.Langs == 0 && listType)
// continue iteration
if pickAll {
return nil
}
// We already got the posting that we need.
return posting.ErrStopIteration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @martinmr, and @pawanrawal)
worker/task.go, line 510 at r4 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
facets
Done.
worker/task.go, line 550 at r4 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
pickAll := q.ExpandAll || (q.Langs == 0 && listType)
// continue iteration
if pickAll {
return nil
}// We already got the posting that we need.
return posting.ErrStopIteration
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r4, 1 of 1 files at r6.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ashish-goswami and @pawanrawal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r4, 1 of 1 files at r6.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ashish-goswami and @pawanrawal)
Fixes: #4659 Currently count with facets filter is not returning correct result. This is because, while calculating count we use pl.Length() function, which returns length of posting list. This will return correct result if there are no facets filter. But if there are facets filter, we need to filter postings from the posting list which satisfy facets filter. This PR fixes above issue.
Hey @mileung, can you provide us the sample data on which it is returning incorrect result? |
How do I do that? Can I use dgo? |
Yes. You can use dgo to run mutation with data and run above query to see count. |
@ashish-goswami I thought you wanted me to send a exported db, but here's a simple query and response
|
Hey @mileung, please help me reproduce this issue. I ran following mutation:
and got the expected response.
|
I still have the issue. try making the mutation with JSON notation? |
Hi @mileung, I tried json mutation:
Still getting the correct response. |
Hey @mileung, can you share the raw rdf/json data inserted in DB? |
For me, setting |
Hi @mileung, I tried following mutation and query as suggested by you:
Query:
Got the following result:
I think, this is the expected result. |
My bad, the On |
hey guys, There is confusion here. This PR is not in version v1.2.1 @mileung . The only version this PR is in is version v20.03.0-beta.20200320. If you read the changelog you will see that there is no such commit there. And if you analyze the dates, this PR joined the master 25 days ago. And version v1.2.1 was built +24 days (February 6th, 46 days ago). At most this would be in version v1.2.2 which was 2 days ago. But it's not there. If you need this fix, I suggest you wait for a stable version or use the recently released beta. Also, you could try to ask a cherry-pick for v1.x.x. I'm not sure if we gonna have more releases for v1.x.x. Need to check it. The second point to note, the use of facet with count is wrong. The correct form is like this:
The facet and filters and any other directive must be outside the parenthesis of the count. And there's no need to use Cheers. |
Fixes: #4659
Currently
count
withfacets filter
is not returning correct result. This is because, while calculating count we usepl.Length()
function, which returns length of posting list. This will return correct resultif there are no
facets filter
. But if there arefacets filter
, we need to filter posting from the posting list which satisfyfacets filter
.This PR fixes above issue.
This change is