Always set BlockSize in encoder. #5255
Does this affect only trigram indices, or @filter (intersections) with other indices like eq() too?
Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @manishrjain)
It affects all queries where:
- The Uids method is used.
- The list only has an immutable layer.
- The list of uids needs to be intersected.
I'll update the description.
Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @manishrjain)
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @danielmai, @manishrjain, and @martinmr)
posting/list.go, line 985 at r1 (raw file):
res := make([]uint64, 0, len(l.mutationMap)+codec.ApproxLen(l.plist.Pack))
out := &pb.List{}
if len(l.mutationMap) == 0 && opt.Intersect != nil && len(l.plist.Splits) == 0 {
Removing this would affect performance quite a bit I think. Better to determine why IntersectCompressedWith fails.
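The guard quoted above appears to line up with the three conditions listed earlier in the thread: no mutable layer, a list to intersect with, and no splits. A minimal sketch of that predicate follows. The `listState` type and `usesCompressedIntersect` function are hypothetical names made up for illustration and are not part of the project's code.

```go
package main

import "fmt"

// listState is a hypothetical summary of the values the quoted guard reads.
type listState struct {
	Mutations    int  // entries in the mutable layer (len(l.mutationMap) in the quote)
	Splits       int  // parts of a multi-part list (len(l.plist.Splits) in the quote)
	HasIntersect bool // whether a list to intersect with was supplied (opt.Intersect != nil)
}

// usesCompressedIntersect mirrors the shape of the quoted condition:
// only an immutable, single-part list with an intersect list qualifies.
func usesCompressedIntersect(s listState) bool {
	return s.Mutations == 0 && s.HasIntersect && s.Splits == 0
}

func main() {
	fmt.Println(usesCompressedIntersect(listState{HasIntersect: true}))               // true
	fmt.Println(usesCompressedIntersect(listState{Mutations: 1, HasIntersect: true})) // false
	fmt.Println(usesCompressedIntersect(listState{Splits: 2, HasIntersect: true}))    // false
}
```

Only lists that pass this guard would reach the compressed intersection, which is consistent with the bug being limited to queries that meet all three conditions.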
8416e96 to 42cdc5d
Got a few comments. Please address before merging.
Reviewable status: 0 of 4 files reviewed, 4 unresolved discussions (waiting on @danielmai, @manishrjain, and @martinmr)
algo/uidlist.go, line 99 at r2 (raw file):
lq := len(q)
if ld == 0 || lq == 0 {
We could perhaps keep lq at least -- that is exact and not approximate.
posting/list.go, line 935 at r2 (raw file):
})
// Finish writing the last part of the list (or the whole list if not a multi-part list).
x.Check(err)
This can perhaps also be an error return.
posting/list.go, line 938 at r2 (raw file):
plist.Pack = enc.Done()
if plist.Pack != nil {
	x.AssertTrue(plist.Pack.BlockSize == uint32(blockSize))
If this func has an error return, please return an error. I think we should move away from doing Asserts now.
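As a rough illustration of the suggestion above, the block-size check could be written as a function that returns an error instead of asserting. This is a sketch under assumptions: the `pack` type, the `checkPack` helper, and the block size of 256 are hypothetical and chosen only for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// pack is a hypothetical stand-in for the encoded UID pack.
type pack struct {
	BlockSize uint32
}

// errBlockSize is a sentinel so callers can detect this failure with errors.Is.
var errBlockSize = errors.New("encoded pack has unexpected block size")

// checkPack reports a block size mismatch to the caller as an error
// rather than crashing the process with an assertion.
func checkPack(p *pack, blockSize int) error {
	if p == nil {
		return nil // nothing was encoded, so there is nothing to verify
	}
	if p.BlockSize != uint32(blockSize) {
		return fmt.Errorf("%w: got %d, want %d", errBlockSize, p.BlockSize, blockSize)
	}
	return nil
}

func main() {
	// 256 is an arbitrary block size used only for this example.
	fmt.Println(checkPack(&pack{BlockSize: 0}, 256))
	fmt.Println(checkPack(&pack{BlockSize: 256}, 256))
}
```

Returning the error lets the caller decide whether to retry, log, or abort, whereas an assertion takes that choice away.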
Reviewable status: 0 of 4 files reviewed, 4 unresolved discussions (waiting on @danielmai and @manishrjain)
algo/uidlist.go, line 99 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
We could perhaps keep lq at least -- that is exact and not approximate.
Done.
posting/list.go, line 985 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Removing this would affect performance quite a bit I think. Better to determine why IntersectCompressedWith fails.
Done. Not removing this anymore.
posting/list.go, line 935 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This can perhaps also be an error return.
Done.
posting/list.go, line 938 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
If this func has an error return, please return an error. I think we should move away from doing Asserts now.
Done.
Internal Ref:
Rollup was setting the BlockSize of the encoder for multi-part lists but not for normal lists. This caused ApproxLen to return 0 for normal lists, and the intersect algorithm then exited early.
Existing data will be encoded with the right block size the next time the list is rolled up.
This PR does the following:
[…] returns 0. There's no difference if the actual length is zero, since the smallest list is picked to perform the iteration. If the list has data but a block size of zero, removing this check allows the query to return the right result. Added a test to verify this scenario.
Fixes #5102
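The failure mode described above can be reproduced in a small, self-contained sketch. It assumes the approximate length is computed as the number of blocks times the block size. The `uidPack` type and the `approxLen`, `decode`, and `intersect` functions are simplified, hypothetical stand-ins and not the project's actual implementation.

```go
package main

import "fmt"

// uidPack is a simplified, hypothetical stand-in for a compressed UID pack.
type uidPack struct {
	BlockSize uint32
	Blocks    [][]uint64 // each block holds sorted UIDs
}

// approxLen estimates the UID count as blocks * block size (an assumption).
// If BlockSize was never set, it reports 0 even when the blocks hold data.
func approxLen(p *uidPack) int {
	if p == nil {
		return 0
	}
	return len(p.Blocks) * int(p.BlockSize)
}

// decode flattens all blocks into one sorted slice.
func decode(p *uidPack) []uint64 {
	var out []uint64
	if p == nil {
		return out
	}
	for _, b := range p.Blocks {
		out = append(out, b...)
	}
	return out
}

// intersect returns the UIDs present in both the pack and the sorted slice q.
// With trustApprox set, it exits early when the approximate length is zero,
// which mimics the behavior the description says was removed.
func intersect(p *uidPack, q []uint64, trustApprox bool) []uint64 {
	out := []uint64{}
	if len(q) == 0 { // exact length, so this early exit is safe
		return out
	}
	if trustApprox && approxLen(p) == 0 {
		return out
	}
	uids := decode(p)
	i, j := 0, 0
	for i < len(uids) && j < len(q) {
		switch {
		case uids[i] < q[j]:
			i++
		case uids[i] > q[j]:
			j++
		default:
			out = append(out, uids[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	// A pack that holds data but whose BlockSize was left at zero.
	p := &uidPack{Blocks: [][]uint64{{1, 5, 9}, {12, 20}}}
	q := []uint64{5, 12, 30}
	fmt.Println(intersect(p, q, true))  // prints [] because of the early exit
	fmt.Println(intersect(p, q, false)) // prints [5 12]
}
```

The check on `len(q)` stays because that length is exact, in line with the review comment about keeping `lq`. Only the check based on the approximate length is dropped.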