Dgraph v20.03.0 Regexp Not returning expected results #5102
Comments
Issue seems related to: #5131
I was able to reproduce this. I noticed that the first time I ran the query, I got the expected result. I ran it again a little later and then got an empty result. The logs show that rollups happened in between, so I suspect they had something to do with this issue. The second query is still working even after more rollups, so I am not clear why the issue only affects some queries and not others. I checked the p directory and none of the indices have split keys, so this doesn't seem to be related to that. It also doesn't seem to be related to incremental rollups, because #5131 is a duplicate of this issue and the reporter of that issue is using a build from December 2019. So far it looks like there's some issue with rollups + trigram indexes. Incremental rollups might have caused the issue to bubble up more often, but the issue appears to have been there for a while. I'll keep debugging.
More debugging:
I will look into how this list of trigrams is generated.
Found the location of the bug. In List.Uids there is what appears to be an optimization.
After the rollup, this branch is taken (because the mutable layer is empty after a rollup). There is a bug in IntersectCompressedWith. This is the only place in the codebase where this function is called. Commenting out this code and disabling the optimization fixes the queries. I am not sure whether it's worth trying to fix the bug in this method or simply removing the optimization. There are actually three versions of the intersection algorithm (linear, with linear jumps, and with binary search). I am still not sure which of the three has the bug, but I can at least confirm that the rollup is not creating invalid data.
IntersectCompressedWith has been around for a long time. Do we know what exactly caused this issue?
SUMMARY
Using a query like
regexp(name@en, /.*alien.*/i))
returns an empty set. This was discovered in the Dgraph tour: https://dgraph.io/tour/search/3/#
ENVIRONMENT
dgraph/dgraph:v20.03.0
STEPS
STEP 1 - Env
Compose File
STEP 2 - Schema
Apply the Schema in:
STEP 3 - Load Data
STEP 3 - Indexes
Apply Indexes from: https://dgraph.io/tour/search/1/
STEP 4 - Run query
From: https://dgraph.io/tour/search/3/#
EXPECTED BEHAVIOR
ACTUAL BEHAVIOR
WORKAROUND
I modified the query and this worked: