Implement KNN comparison with ESQL for so_vector #837

Merged: svilen-mihaylov-elastic merged 13 commits into master from svilen/esql_knn on Sep 2, 2025

@svilen-mihaylov-elastic (Contributor) commented Aug 26, 2025

Update the so_vector rally track to also exercise knn against the ESQL frontend.

@svilen-mihaylov-elastic marked this pull request as ready for review August 27, 2025 16:09

"num_candidates": self._params.get("num_candidates", 50),
if self._in_esql_mode:
# Construct options JSON.
options_param = '{"num_candidates":' + str(self._num_candidates)

Review comment (Member):

This will change to "min_candidates" in elastic/elasticsearch#132944. k will also be removed, and instead we will need to specify a LIMIT to the query.

It's not a blocker for this PR, but just a heads up that we will need to change this when the above gets merged.
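
As a hedged illustration of the point above: building the options string with `json.dumps` rather than string concatenation would turn a later rename of the key into a one-argument change. The helper name `build_knn_options` and its `candidates_key` parameter are hypothetical and not part of this PR; only the key names `num_candidates` (current) and `min_candidates` (proposed in elastic/elasticsearch#132944) come from the discussion.

```python
import json


def build_knn_options(num_candidates, candidates_key="num_candidates"):
    """Return the KNN options as a compact JSON string.

    Hypothetical helper, not taken from the PR. `candidates_key` is a
    parameter because the thread notes the option may be renamed.
    """
    # separators=(",", ":") drops the default spaces so the output matches
    # the compact form produced by the string concatenation in the diff.
    return json.dumps({candidates_key: int(num_candidates)}, separators=(",", ":"))


print(build_knn_options(50))
print(build_knn_options(50, candidates_key="min_candidates"))
```

This covers only the options payload. How `k` is replaced by a `LIMIT` in the query itself depends on the final shape of the upstream change and is left out here.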

@carlosdelest (Member) left a comment

Looks good!

I think we can use param-source to have specific param handling for ESQL related operations, and use operation-type to help run specific ESQL workloads.

@benwtrent (Member) left a comment

All these changes make sense to me. It's good to race the legacy API with esql.

My concern is that all these operations are part of the default operations. Are all the commands and esql actions fully released? If so, then good. If they are still behind a snapshot, I am not sure this can be merged.

Maybe add a "include_esql" tasks option that we can flip on/off for nightlies?

@svilen-mihaylov-elastic (Contributor Author) commented Aug 29, 2025

> All these changes make sense to me. It's good to race the legacy API with esql.
>
> My concern is that all these operations are part of the default operations. Are all the commands and esql actions fully released? If so, then good. If they are still behind a snapshot, I am not sure this can be merged.
>
> Maybe add a "include_esql" tasks option that we can flip on/off for nightlies?

@benwtrent Thanks for the review. To be more specific, are you proposing something like the following, with a special is_esql option on the operation?

{
  "name": "esql-script-score-query-acceptedAnswerId",
  "operation-type": "esql",
  **"is_esql": {{esql_enabled | default(true)}},**
  "param-source": "knn-param-source",
  "exact": true,
  "k": 10,
  "filter": "acceptedAnswerId IS NOT NULL"
},

Do I need to do something to specifically add this parameter?

@benwtrent (Member) commented

@svilen-mihaylov-elastic I am saying that if this is not all available in release builds, it should not be part of the default operations.

Maybe group all of esql together in the different sections and only optionally include them if a setting is set (see how we avoid force merge for serverless for example).

"iterations": 100,
"clients": 1
},
{% if is_esql_enabled %}

Review comment (Contributor Author):

@benwtrent I decided to make each esql operation conditional, to avoid complications with the serverless conditional checks towards the end of the file.

Review comment (Member):

👍 I think you can group together all esql related operations and use a single if block.

Another option would be to create a different challenge for esql operations, but I don't see this being used consistently, and I think we will want to execute both esql and non esql for nightlies.
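
The grouping suggested here amounts to gating one contiguous block of operations on a single flag. In the track this is done with a Jinja `{% if is_esql_enabled %}` block; the sketch below shows only the equivalent logic in plain Python. The function `build_operations` and the non-ESQL operation name are hypothetical, while `esql-script-score-query-acceptedAnswerId` is the operation name quoted earlier in the thread.

```python
def build_operations(is_esql_enabled):
    """Return operation names, appending the ESQL group only when enabled.

    Illustrative only: the real track expresses this in a Jinja template.
    """
    operations = ["knn-search-baseline"]  # hypothetical non-ESQL operation
    esql_group = [
        "esql-script-score-query-acceptedAnswerId",  # name quoted in the thread
    ]
    if is_esql_enabled:
        # One conditional covers the whole group instead of one per operation.
        operations.extend(esql_group)
    return operations


print(build_operations(is_esql_enabled=False))
print(build_operations(is_esql_enabled=True))
```

Keeping the ESQL operations in one list means a nightly run can flip a single setting to include or skip all of them.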

@carlosdelest (Member) left a comment

LGTM, thanks @svilen-mihaylov-elastic!

@svilen-mihaylov-elastic enabled auto-merge (squash) September 2, 2025 15:05
@svilen-mihaylov-elastic merged commit 328d58b into master Sep 2, 2025
13 checks passed
NickDris added a commit to NickDris/rally-tracks that referenced this pull request Sep 15, 2025
* Implement KNN comparison with ESQL for so_vector (elastic#837)

Update the so_vector rally track to also exercise knn against the ESQL frontend.

* Add patterned_text_index_options parameter (elastic#840)

Add patterned_text_index_options parameter to elastic/log tracks. Can accept the values docs and positions; defaults to docs. Sets the index_options value of the message field in all indices. Only applies if patterned_text_message_field is set to true and message fields are patterned_text rather than match_only_text.

* Fix so_vector for serverless operator (elastic#844)

Misplaced comma

* ES|QL: Add queries for LOOKUP JOIN with multiple join keys (elastic#838)

* Add a few runtime fields to insist-chicken challenge (elastic#841)

* Add elastic/logs patterned-text queries challenge (elastic#842)

Add challenge which queries message field with several term and phrase queries. This is meant to test the patterned_text mapping type, but can also be used to test match_only_text message fields.

* Reduce number of iterations for queries that use runtime fields. (elastic#846)

* Remove routing_path from tsdb index template (elastic#847)

Setting this is not necessary as the routing path is set automatically for data streams and this template defines `"data_stream": {}`. It would also prevent an optimization added in elastic/elasticsearch#132566.

* Set index template for ingest_mode: data_stream (elastic#849)

Rolls back changes in elastic#722 that broke the `ingest_mode: data_stream`.

* ES|QL - so_vector knn function update (elastic#850)

* Add missing p_index_mode param (elastic#853)

---------

Co-authored-by: Svilen Mihaylov <svilen.mihaylov@elastic.co>
Co-authored-by: Parker Timmins <parker.timmins@elastic.co>
Co-authored-by: Evgenia Badiyanova <evgenia.badiyanova@elastic.co>
Co-authored-by: Luigi Dell'Aquila <luigi.dellaquila@gmail.com>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>

@esbenchmachine added the "backport pending" label (Awaiting backport to stable release branch) Dec 19, 2025

@esbenchmachine (Collaborator) commented

@svilen-mihaylov-elastic
A backport is pending for this PR. Please add all required vX.Y version labels.

  • If it is intended for the current Elasticsearch release version, apply the corresponding version label.
  • If it also supports past released versions, add those labels too.
  • If it only targets a future version, wait until that version label exists and then add it.
    (Each rally-tracks version label is created during the feature freeze of a new Elasticsearch branch).

Backporting entails:

  1. Ensure the correct version labels exist in this PR.
  2. Ensure backport PRs have backport label and are passing tests.
  3. Merge backport PRs (you can approve yourself and enable auto-merge).
  4. Remove backport pending label from this PR once all backport PRs are merged.

Thank you!
