[BUG] Multi vector does not return correct result #1714

Closed
heemin32 opened this issue May 22, 2024 · 0 comments
Labels
bug Something isn't working

heemin32 commented May 22, 2024

What is the bug?
A multi-vector (nested field) k-NN search with the faiss engine does not return the correct results in OpenSearch 2.13 and 2.14.

How can one reproduce the bug?

Create an index that uses the faiss engine

PUT /my-knn-index-1
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "nested_field": {
        "type": "nested",
        "properties": {
          "my_vector": {
            "type": "knn_vector",
            "dimension": 3,
            "method": {
              "name": "hnsw",
              "space_type": "l2",
              "engine": "faiss",
              "parameters": {
                "ef_construction": 100,
                "m": 16
              }
            }
          }
        }
      }
    }
  }
}

Ingest documents

POST /_bulk
{ "index": { "_index": "my-knn-index-1", "_id": "1" } }
{"nested_field":[{"my_vector":[1, 1, 1]}]}
{ "index": { "_index": "my-knn-index-1", "_id": "2" } }
{"nested_field":[{"my_vector":[2,2,2]}]}
{ "index": { "_index": "my-knn-index-1", "_id": "3" } }
{"nested_field":[{"my_vector":[3,3,3]}]}
{ "index": { "_index": "my-knn-index-1", "_id": "4" } }
{"nested_field":[{"my_vector":[4,4,4]}]}
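
For anyone scripting the reproduction, the bulk payload above can be generated rather than typed by hand. This is a minimal sketch, assuming the body is sent to the `_bulk` endpoint as newline-delimited JSON; `build_bulk_body` is a hypothetical helper name, not part of any OpenSearch client.

```python
import json


def build_bulk_body(index_name, vectors_by_id):
    """Build a newline-delimited bulk body: one action line plus one document line per vector."""
    lines = []
    for doc_id, vector in vectors_by_id.items():
        # Action line: index this document under the given _id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Document line: a single nested object holding one vector, as in this report.
        lines.append(json.dumps({"nested_field": [{"my_vector": vector}]}))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    vectors = {"1": [1, 1, 1], "2": [2, 2, 2], "3": [3, 3, 3], "4": [4, 4, 4]}
    print(build_bulk_body("my-knn-index-1", vectors), end="")
```

The helper only builds the request body; sending it (for example with an HTTP client and an NDJSON content type) is left out so the sketch stays independent of any particular cluster setup.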

Query with k = 1

GET /my-knn-index-1/_search
{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "knn": {
          "nested_field.my_vector": {
            "vector": [
              4,
              4,
              4
            ],
            "k": 1
          }
        }
      }
    }
  }
}

Response

The query should return document 4 with vector [4, 4, 4], but it returns document 1 with vector [1, 1, 1].

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.035714287,
    "hits": [
      {
        "_index": "my-knn-index-1",
        "_id": "1",
        "_score": 0.035714287,
        "_source": {
          "nested_field": [
            {
              "my_vector": [
                1,
                1,
                1
              ]
            }
          ]
        }
      }
    ]
  }
}
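
As a sanity check on this response, the expected ranking can be computed by brute force. The sketch below assumes the k-NN plugin's documented score for the l2 space, 1 / (1 + d²), where d² is the squared Euclidean distance; the function names are illustrative and not part of any OpenSearch API.

```python
# Brute-force reference for the four documents ingested in this report.
DOCS = {"1": [1, 1, 1], "2": [2, 2, 2], "3": [3, 3, 3], "4": [4, 4, 4]}


def l2_score(query, vector):
    """Score assumed for space_type l2: 1 / (1 + squared Euclidean distance)."""
    squared_distance = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + squared_distance)


def exact_top_k(query, k):
    """Return the k best (doc_id, score) pairs by exhaustive comparison."""
    scored = [(doc_id, l2_score(query, vec)) for doc_id, vec in DOCS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Print every document, best match first, for the query vector [4, 4, 4].
    for doc_id, score in exact_top_k([4, 4, 4], k=4):
        print(doc_id, score)
```

Under that assumption, document 1 scores 1/28 ≈ 0.0357 against the query, which agrees with the `_score` in the response. This suggests the score is computed correctly for the hit that was returned and that the wrong candidate is being selected.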

Query with k = 2

GET /my-knn-index-1/_search
{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "knn": {
          "nested_field.my_vector": {
            "vector": [
              4,
              4,
              4
            ],
            "k": 2
          }
        }
      }
    }
  }
}

Response

When k is 2, the response contains one correct vector, [4, 4, 4], but still includes the wrong vector [1, 1, 1].

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "my-knn-index-1",
        "_id": "4",
        "_score": 1.0,
        "_source": {
          "nested_field": [
            {
              "my_vector": [
                4,
                4,
                4
              ]
            }
          ]
        }
      },
      {
        "_index": "my-knn-index-1",
        "_id": "1",
        "_score": 0.035714287,
        "_source": {
          "nested_field": [
            {
              "my_vector": [
                1,
                1,
                1
              ]
            }
          ]
        }
      }
    ]
  }
}

When k is 3, the response contains two correct vectors, [4, 4, 4] and [3, 3, 3], but still includes the wrong vector [1, 1, 1].
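
The pattern across the three queries can be summarized by comparing the reported hits with an exact nearest-neighbor ranking. This is a minimal sketch: the observed IDs are transcribed from this report (compared as sets, since the order of the k = 3 hits is not shown), and the helper names are hypothetical.

```python
DOCS = {"1": [1, 1, 1], "2": [2, 2, 2], "3": [3, 3, 3], "4": [4, 4, 4]}
QUERY = [4, 4, 4]

# Document IDs returned for each k, as described in this report.
OBSERVED = {1: {"1"}, 2: {"4", "1"}, 3: {"4", "3", "1"}}


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expected_ids(k):
    """IDs of the k documents closest to QUERY, found by exhaustive search."""
    ranked = sorted(DOCS, key=lambda doc_id: squared_l2(QUERY, DOCS[doc_id]))
    return set(ranked[:k])


def spurious_ids(k):
    """IDs that were returned but do not belong in the exact top k."""
    return OBSERVED[k] - expected_ids(k)


def missing_ids(k):
    """IDs that belong in the exact top k but were not returned."""
    return expected_ids(k) - OBSERVED[k]


if __name__ == "__main__":
    for k in sorted(OBSERVED):
        print(k, sorted(spurious_ids(k)), sorted(missing_ids(k)))
```

For every k, document 1 is the single spurious hit, and it displaces what would otherwise be the k-th nearest document (4, 3, and 2 respectively).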

What is the expected behavior?
The search should return only the k nearest vectors. Instead, the search results contain one incorrect hit for every query.

What is your host/environment?
OpenSearch 2.13
