Queries not showing with hash indexes #3103

Closed
mooncake4132 opened this issue Mar 6, 2019 · 4 comments

mooncake4132 commented Mar 6, 2019

This seems to be a regression from v1.0.11. A few of our tests started failing after we switched to v1.0.12 (Docker image dgraph/dgraph:latest).

If I add a hash-indexed, string-typed predicate and query it later in the same transaction, the query does not find it. However, if the index type is set to exact, the query after the mutation does find the added predicate. A Python script to reproduce is below.

  • What version of Dgraph are you using?

Dgraph version : v1.0.12
Commit SHA-1 : 60d9ef0
Commit timestamp : 2019-03-05 17:59:30 -0800
Branch : HEAD
Go version : go1.11.5

  • Have you tried reproducing the issue with latest release?

Yes

  • What is the hardware spec (RAM, OS)?

Docker on Windows (one alpha and one zero on the same machine)

  • Steps to reproduce the issue (command/config used to run Dgraph).

```python
import json

import pydgraph


client_stub = pydgraph.DgraphClientStub('localhost:9080')
client = pydgraph.DgraphClient(client_stub)


def mutate_and_query(client, schema):
    # Start from an empty database that contains only the schema under test.
    client.alter(pydgraph.Operation(drop_all=True))
    client.alter(pydgraph.Operation(schema=schema))

    NAME = 'test'
    txn = client.txn()
    # Add the node, then look it up by name in the same, uncommitted transaction.
    txn.mutate(set_nquads=f"""
        <_:obj_0> <person.name> "{NAME}" .
    """)
    result = txn.query("""
        query q($person_name: string) {
            q(func: eq(person.name, $person_name)) {
                uid
                expand(_all_)
            }
        }
    """, variables={'$person_name': NAME})
    people = json.loads(result.json)
    print(people)
    # The transaction is never committed; discard it and ignore cleanup errors.
    try:
        txn.discard()
    except Exception:
        pass


# Identical steps; only the index type differs.
mutate_and_query(client, 'person.name: string @index(exact) .')
mutate_and_query(client, 'person.name: string @index(hash) .')
```

  • Expected behaviour and actual result.

Actual (v1.0.12)

{'q': [{'uid': '0x9c58', 'person.name': 'test'}]}
{'q': []}

Expected (v1.0.11)

{'q': [{'uid': '0xc351', 'person.name': 'test'}]}
{'q': [{'uid': '0xc352', 'person.name': 'test'}]}
@srfrog srfrog added the investigate Requires further investigation label Mar 6, 2019
@martinmr martinmr self-assigned this Mar 22, 2019
@martinmr
Contributor

I dug into this issue, and it appears this stopped working when the alpha LRU cache was removed. In general, we cannot guarantee that indices will work properly before the commit has finished, because the data has not yet made it all the way to Badger. The reason exact works is that such a query uses the transaction cache.

We do not anticipate supporting this in the future, so I'd recommend committing the mutations first if you need the indices to work properly.
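
A minimal sketch of that workaround, assuming a pydgraph client like the one in the reproduction script and the same `person.name` schema. The function name `add_person_then_find` and the constant `FIND_BY_NAME` are hypothetical, introduced only for illustration. The mutation is committed in one transaction and the indexed lookup runs in a second one:

```python
import json

# Hypothetical helper query; mirrors the lookup from the reproduction script.
FIND_BY_NAME = """
query q($person_name: string) {
    q(func: eq(person.name, $person_name)) {
        uid
        person.name
    }
}
"""


def add_person_then_find(client, name):
    """Commit the new node first, then look it up in a separate transaction."""
    write_txn = client.txn()
    try:
        write_txn.mutate(set_obj={'person.name': name})
        # Per the comment above, index lookups are only reliable after commit.
        write_txn.commit()
    finally:
        # Discarding after a successful commit is a no-op in pydgraph.
        write_txn.discard()

    read_txn = client.txn()
    try:
        result = read_txn.query(FIND_BY_NAME, variables={'$person_name': name})
        return json.loads(result.json)['q']
    finally:
        read_txn.discard()
```

Called as `add_person_then_find(client, 'test')`, this should return the matching nodes whether the predicate is indexed with exact or hash, since the lookup no longer depends on uncommitted index entries.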

@manishrjain
Contributor

This is the exact PR where I stopped supporting inequality on uncommitted secondary indices. a2e8376

Note that the lack of btree also means that a txn won't be able to read back its own uncommitted writes to secondary indices. I think that's a rare use case and hence a fair tradeoff, given the complexity and performance cost of having to overlay this structure on the DB.

@mooncake4132
Author

Does that mean I should expect exact indices to stop working in transactions in the future? I've only used about half a dozen databases but this design sounds counter-intuitive, though mutation performance improvements are nice.

@manishrjain
Contributor

Note that this only applies to uncommitted transaction updates on secondary indices. Exact and Hash index should work right now, but we make no guarantees whether their behavior would remain the same or change in the future. A read-modify-write cycle is the right way to approach a transaction, where you read the current state, generate your delta and then commit.
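
The read-modify-write cycle could look roughly like the sketch below, assuming a pydgraph client and the `person.name` schema from the reproduction script. The function name `ensure_person` and the blank-node label `person` are hypothetical. The transaction reads committed state, builds its delta from what it read, and commits, without querying its own uncommitted writes:

```python
import json


def ensure_person(client, name):
    """Return the uid for `name`, creating the node only if none is committed."""
    txn = client.txn()
    try:
        # 1. Read the current (committed) state.
        result = txn.query(
            """
            query q($person_name: string) {
                q(func: eq(person.name, $person_name)) {
                    uid
                }
            }
            """,
            variables={'$person_name': name},
        )
        people = json.loads(result.json)['q']
        if people:
            return people[0]['uid']

        # 2. Generate the delta from what was read.
        assigned = txn.mutate(set_obj={'uid': '_:person', 'person.name': name})

        # 3. Commit; the new uid is keyed by the blank-node label.
        txn.commit()
        return assigned.uids['person']
    finally:
        # No-op after a successful commit; cleans up on early return or error.
        txn.discard()
```

Calling `ensure_person(client, 'test')` twice should return the same uid both times, because the second call finds the node committed by the first.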
