Automatically map floats as dense vector #98512

Merged
kderusso merged 48 commits into elastic:main from kderusso:kderusso/dense-vector-add-default-mappings
Sep 6, 2023

Conversation

@kderusso
Member

@kderusso kderusso commented Aug 15, 2023

Resolves #97532
Resolves #98512

Automatically maps float arrays whose length is between 128 and the maximum dense vector size to the dense_vector type, with a default similarity of cosine.
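
The rule in the description can be sketched roughly as follows. This is an illustrative Python model, not the Elasticsearch implementation: the function name `sketch_dynamic_mapping` is hypothetical, `max_dims` is passed in because the description does not state the maximum, the bounds are assumed to be inclusive, and setting `dims` to the array length is an assumption based on the linked issue about setting dimensions dynamically.

```python
from typing import Any

MIN_DIMS = 128  # lower bound taken from the PR description


def sketch_dynamic_mapping(values: list[float], max_dims: int) -> dict[str, Any]:
    """Hypothetical model of the dynamic mapping chosen for a float array.

    max_dims stands in for "the max dense vector size"; its real value is
    not given in the description, so the caller must supply it.
    """
    # Assumption: both ends of the range are inclusive.
    if MIN_DIMS <= len(values) <= max_dims:
        return {
            "type": "dense_vector",
            "dims": len(values),  # assumption: dims inferred from array length
            "similarity": "cosine",  # default similarity named in the description
        }
    # Outside the range, floats keep the usual dynamic mapping for
    # floating-point JSON numbers.
    return {"type": "float"}


if __name__ == "__main__":
    # 4096 is an arbitrary illustrative maximum, not a documented value.
    print(sketch_dynamic_mapping([0.5] * 256, max_dims=4096))
    print(sketch_dynamic_mapping([0.5] * 3, max_dims=4096))
```

Shorter arrays fall through to the ordinary `float` mapping, so existing documents with small numeric arrays would be unaffected under this model.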

@kderusso kderusso added the :EnterpriseSearch/Application, Team:Enterprise Search, and v8.11.0 labels Aug 15, 2023
@kderusso
Member Author

@elasticmachine test this please

@kderusso kderusso force-pushed the kderusso/dense-vector-add-default-mappings branch 4 times, most recently from b99e0aa to 30348f5 Compare August 18, 2023 19:42
@kderusso kderusso changed the title from "WIP: Automatically map floats as dense vector" to "Automatically map floats as dense vector" Aug 21, 2023
@elasticsearchmachine
Collaborator

Hi @kderusso, I've created a changelog YAML for you.

@kderusso kderusso marked this pull request as ready for review August 22, 2023 13:12
@kderusso kderusso added the :Search Foundations/Mapping and Team:Search labels Aug 22, 2023
@kderusso kderusso requested review from a team and jimczi August 22, 2023 13:12
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Collaborator

Pinging @elastic/ent-search-eng (Team:Enterprise Search)

Contributor

@demjened demjened left a comment

LGTM with 1 question

@saikatsarkar056
Contributor

From this issue: #97532

It's been suggested to automatically map float arrays as dense vectors when there are more than 100 values.

Where do we check if the dense vector has more than 100 values? 🤔

@benwtrent benwtrent self-requested a review August 22, 2023 16:29
Member

@benwtrent benwtrent left a comment

I think adding my suggested tests will show some edges (especially around dynamic_templates).

@kderusso kderusso requested a review from benwtrent August 23, 2023 17:39
@kderusso
Member Author

@elasticmachine test this please

Member

@benwtrent benwtrent left a comment

I really like all the additional tests. Covers us well.

@kderusso kderusso force-pushed the kderusso/dense-vector-add-default-mappings branch from 587f052 to 1cccc2d Compare August 31, 2023 20:30
Member

@carlosdelest carlosdelest left a comment

Great work! 👏

Member

@benwtrent benwtrent left a comment

Great work on finding the issue with the dynamic mapping and dims!

Two more comments. This PR is looking good.

@kderusso kderusso requested a review from benwtrent September 6, 2023 18:16
@benwtrent benwtrent added the test-full-bwc label Sep 6, 2023
Member

@benwtrent benwtrent left a comment

I think this is good. We got some great test coverage and it all seems ✅ .

Once all the tests are green and the full BWC tests have run, I think this is good.

@kderusso kderusso merged commit 258d0cb into elastic:main Sep 6, 2023
benwtrent added a commit that referenced this pull request Nov 9, 2023
…101967)

After #98512 we incorrectly attempt to map an array of any single value type to dense_vector.

Instead, we should validate that ALL mappers are numeric and that ALL of them are `float`.

closes: #101965
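
The check described in this commit message can be modelled roughly as below. This is a simplified Python sketch, not the actual Elasticsearch code: the per-element mappers are represented only by their field type names, the function name `all_elements_are_float` is hypothetical, and rejecting an empty array is an assumption.

```python
# Numeric field type names from the Elasticsearch mapping documentation.
NUMERIC_TYPES = {
    "byte", "short", "integer", "long", "unsigned_long",
    "half_float", "float", "double", "scaled_float",
}


def all_elements_are_float(element_types: list[str]) -> bool:
    """Hypothetical check: may this array be considered for dense_vector?

    element_types holds the dynamically inferred field type of each array
    element, standing in for the per-element mappers the commit mentions.
    """
    if not element_types:
        return False  # assumption: an empty array is never a vector
    # Mirror the wording of the commit message: every mapper must be
    # numeric, and every one of them must be float.
    all_numeric = all(t in NUMERIC_TYPES for t in element_types)
    all_float = all(t == "float" for t in element_types)
    return all_numeric and all_float


if __name__ == "__main__":
    print(all_elements_are_float(["float"] * 128))
    print(all_elements_are_float(["keyword"] * 128))
```

Under this model, an array of 128 strings (all `keyword`) is no longer a dense_vector candidate just because every element shares one type.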
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Nov 9, 2023
…1965 (elastic#101967)

After elastic#98512 we incorrectly attempt to map an array of any single value type to dense_vector.

Instead, we should validate that ALL mappers are numeric and that ALL of them are `float`.

closes: elastic#101965
benwtrent added a commit that referenced this pull request Nov 9, 2023
…1965 (#101967) (#101975)

* Fix incorrect dynamic mapping for non-numeric-value arrays #101965 (#101967)

After #98512 we incorrectly attempt to map an array of any single value type to dense_vector.

Instead, we should validate that ALL mappers are numeric and that ALL of them are `float`.

closes: #101965

* Update rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/60_dense_vector_dynamic_mapping.yml
@kderusso kderusso deleted the kderusso/dense-vector-add-default-mappings branch July 8, 2024 17:26
Labels

:EnterpriseSearch/Application (Enterprise Search)
>feature
:Search Foundations/Mapping (Index mappings, including merging and defining field types)
Team:Enterprise Search (Meta label for Enterprise Search team)
Team:Search (Meta label for search team)
test-full-bwc (Trigger full BWC version matrix tests)
v8.11.0

Development

Successfully merging this pull request may close these issues.

Add default dynamic mappings for dense_vector
Support setting the number of dimensions dynamically on dense_vector

7 participants