feat(es|ql): add dense_vector support in coalesce #142974

Merged
mromaios merged 19 commits into elastic:main from mromaios:dense_vector_support_coalesce
Mar 3, 2026
Conversation

@mromaios
Contributor

@mromaios mromaios commented Feb 24, 2026

Closes #139928

Details:
This PR adds dense_vector support to COALESCE.

Limitations:

  • All arguments must be of type dense_vector, so passing a literal array requires wrapping it in to_dense_vector(). I will address implicit casting in a follow-up PR.
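
A minimal sketch of what this limitation means for callers, assuming a client that assembles ES|QL text itself. The helper names `esql_arg` and `coalesce_expr` are hypothetical and not part of any Elasticsearch client; only `COALESCE` and `to_dense_vector` come from the PR.

```python
def esql_arg(arg):
    """Render one COALESCE argument as ES|QL text.

    Strings are treated as field names. Lists are literal arrays, which must be
    wrapped in to_dense_vector() because COALESCE does not cast them implicitly.
    """
    if isinstance(arg, str):
        return arg
    if isinstance(arg, (list, tuple)):
        literal = ", ".join(repr(float(x)) for x in arg)
        return f"to_dense_vector([{literal}])"
    raise TypeError(f"unsupported argument: {arg!r}")


def coalesce_expr(*args):
    """Build a COALESCE(...) expression from field names and literal arrays."""
    return "COALESCE(" + ", ".join(esql_arg(a) for a in args) + ")"


expr = coalesce_expr("fine_category_emb", "broader_category_emb", [0.0, 0.0, 0.54])
print(expr)
```

The resulting expression matches the one used in the example request in this description.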

Note to reviewer:

Example requests:

DELETE my-index

PUT my-index
{
  "mappings": {
    "properties": {
      "fine_category_emb": {
        "type": "dense_vector",
        "dims": 3
      },
      "broader_category_emb": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

PUT my-index/_doc/1
{
  "my_text" : "text1",
  "fine_category_emb": [0.5, 10, 6],
  "broader_category_emb": [0.7, 15, 2]
}
PUT my-index/_doc/2
{
  "my_text" : "text2",
  "fine_category_emb": null,
  "broader_category_emb": [-0.7, 15, 2]
}

PUT my-index/_doc/3
{
  "my_text" : "text3",
  "fine_category_emb" : null,
  "broader_category_emb" : null
}

POST /_query
{
"query": "FROM my-index | eval search_emb = COALESCE(fine_category_emb, broader_category_emb, to_dense_vector([0.0, 0.0, 0.54])) "
}

Result:

{
    "took": 22,
    "is_partial": false,
    "completion_time_in_millis": 1771951486365,
    "documents_found": 3,
    "values_loaded": 12,
    "start_time_in_millis": 1771951486343,
    "expiration_time_in_millis": 1772383486308,
    "columns": [
        {
            "name": "broader_category_emb",
            "type": "dense_vector"
        },
        {
            "name": "fine_category_emb",
            "type": "dense_vector"
        },
        {
            "name": "my_text",
            "type": "keyword"
        },
        {
            "name": "search_emb",
            "type": "dense_vector"
        }
    ],
    "values": [
        [
            [
                0.7,
                15,
                2
            ],
            [
                0.5,
                10,
                5.9999995
            ],
            "text1",
            [
                0.5,
                10,
                5.9999995
            ]
        ],
        [
            [
                -0.7,
                15,
                2
            ],
            null,
            "text2",
            [
                -0.7,
                15,
                2
            ]
        ],
        [
            null,
            null,
            "text3",
            [
                0,
                0,
                0.54
            ]
        ]
    ]
}
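
The search_emb column above follows the usual COALESCE rule: take the first non-null argument per row. A rough Python model of that rule, applied to the three example documents, is sketched below. It is an illustration only, not the ES|QL implementation, and it ignores how stored vector values are represented (for example, the 5.9999995 in the response).

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


# Constant fallback vector, mirroring to_dense_vector([0.0, 0.0, 0.54]) in the query.
FALLBACK = [0.0, 0.0, 0.54]

# The three documents indexed in the example requests.
docs = [
    {"my_text": "text1", "fine_category_emb": [0.5, 10, 6], "broader_category_emb": [0.7, 15, 2]},
    {"my_text": "text2", "fine_category_emb": None, "broader_category_emb": [-0.7, 15, 2]},
    {"my_text": "text3", "fine_category_emb": None, "broader_category_emb": None},
]

search_emb = {
    d["my_text"]: coalesce(d["fine_category_emb"], d["broader_category_emb"], FALLBACK)
    for d in docs
}

for text, vector in search_emb.items():
    print(text, vector)
```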

4 | null
;

coalesceBfloat16VectorWithFallback
Contributor Author

@mromaios mromaios Feb 24, 2026

❓ 💭 Do we need coalesce tests in the individual .csv-spec files, or is that overkill? (The same question applies to the other types: default, bit, byte.)

@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

@mromaios mromaios added >enhancement :Search Relevance/ES|QL Search functionality in ES|QL labels Feb 26, 2026
@elasticsearchmachine
Collaborator

Hi @mromaios, I've created a changelog YAML for you.

@mromaios mromaios changed the title from "feat(es|ql): add dense_vector support to coalesce" to "feat(es|ql): add dense_vector support in coalesce" Feb 26, 2026
@mromaios mromaios marked this pull request as ready for review February 26, 2026 16:31
@mromaios mromaios requested a review from a team as a code owner February 26, 2026 16:31
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Feb 26, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@@ -0,0 +1,6 @@
id:l, float_vector:dense_vector, float_vector_2:dense_vector, byte_vector:dense_vector, byte_vector_2:dense_vector, bit_vector:dense_vector, bit_vector_2:dense_vector, bfloat16_vector:dense_vector, bfloat16_vector_2:dense_vector
Contributor Author

Check: https://github.com/elastic/elasticsearch/pull/142974/changes#r2848341088
A question about whether we want coalesce tests in the dense_vector*.csv-spec tests.

@mromaios mromaios self-assigned this Feb 26, 2026
Member

@carlosdelest carlosdelest left a comment

LGTM, great job @mromaios ! 👏

One thing we're not doing is validating that we always return dense_vectors of the same dimensions 🤔. That's not something we can validate at the analyzer level (we don't have dimension information there), and it is tricky to do at runtime (we would have to change how the ExpressionEvaluator works).

Let's keep it this way - we could add an Evaluator that validates dimensions in a follow-up PR if needed.

Contributor

@ioanatia ioanatia left a comment

looks good.

@carlosdelest I am not sure COALESCE needs to check whether the vectors have the same dimensions.

The documentation of COALESCE simply states:

Returns the first of its arguments that is not null. If all arguments are null, it returns null.

I worry we would overcomplicate the definition of COALESCE if we add a validation that is specific to dense_vector.
So what we have right now with this PR should suffice IMO.

@carlosdelest
Member

> I worry we would overcomplicate the definition of COALESCE if we add a validation that is specific to dense_vector.
> So what we have right now with this PR should suffice IMO.

@ioanatia agreed - I was just thinking about how COALESCE lets users end up with vectors of different dimensions in the same column, which the mapping guarantees cannot happen for a single field.

I already approved and agreed to keep it this way - makes no sense to complicate this and we can always add some validation layer afterwards.
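
To illustrate the concern discussed in this thread, here is a small Python sketch of how coalescing two fields with different dimensions can produce a column with mixed vector lengths, plus one possible shape for an after-the-fact check. The `check_uniform_dims` helper and the sample data are hypothetical. The PR itself adds no such validation, and this is not how an ES|QL Evaluator would be written.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


def check_uniform_dims(column):
    """Hypothetical validation: raise if non-null vectors in a column differ in length."""
    dims = {len(v) for v in column if v is not None}
    if len(dims) > 1:
        raise ValueError(f"mixed dense_vector dimensions: {sorted(dims)}")


# Assumed data: field_a is mapped with 3 dims, field_b with 2 dims.
rows = [
    {"field_a": [1.0, 2.0, 3.0], "field_b": [9.0, 9.0]},
    {"field_a": None, "field_b": [9.0, 9.0]},
]

column = [coalesce(r["field_a"], r["field_b"]) for r in rows]
result_dims = [len(v) for v in column]
print(result_dims)

try:
    check_uniform_dims(column)
    mixed = False
except ValueError:
    mixed = True
print("mixed dimensions detected:", mixed)
```

Each input field is uniform on its own, yet the coalesced column is not, which is the case a later validation layer would have to catch.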

@mromaios mromaios merged commit bca3093 into elastic:main Mar 3, 2026
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 3, 2026
…cations

* upstream/main: (56 commits)
  Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471
  [DOCS] Fix ES|QL function and commands lists versioning metadata (elastic#143402)
  Fix MMROperatorTests (elastic#143453)
  Fix CSV-escaped quotes in generated docs examples (elastic#143449)
  Fix SQL client parsing of array header values (elastic#143408)
  ESQL: Add extended distribution tests and fault injection for external sources (elastic#143420)
  ESQL: Fix datasource test failures on Windows and FIPS (elastic#143417)
  Add circuit breaker for query construction to prevent OOM from automaton-based queries (elastic#142150)
  Cleanup SpecIT logging configuration (elastic#143365)
  ESQL: Prune unused regex extract nodes in optimizer (elastic#140982)
  Ensure supported locale outside of Entitlements check (elastic#143405)
  feat(es|ql): add dense_vector support in coalesce (elastic#142974)
  [Test] Unmute SnapshotStressTestsIT (elastic#143359)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.LookupJoinWithCoalesceFilterOnRight} elastic#143443
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndex} elastic#143442
  ESQL: Fix CCS exchange sink cleanup (elastic#143325)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143434
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyFromRow} elastic#143432
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:k8s-timeseries.Datenanos_derivative_compared_to_rate} elastic#143431
  Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification single index float type} elastic#143430
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
shmuelhanoch pushed a commit to shmuelhanoch/elasticsearch that referenced this pull request Mar 4, 2026
Labels

>enhancement :Search Relevance/ES|QL Search functionality in ES|QL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.4.0

Development

Successfully merging this pull request may close these issues.

ES|QL - Support COALESCE function for dense_vector

4 participants