ESQL: Correctly manage NULL data type for SUM by astefan · Pull Request #144942 · elastic/elasticsearch

astefan · 2026-03-25T14:44:26Z

Right now, SUM returns a DOUBLE data type when its input is NULL, which is wrong: it should be NULL or LONG. This PR treats NULL as the correct output, consistent with the other functions that have no special NULL behavior (COUNT, for example, never returns null).

Fixes #144914
AI-assisted PR.


astefan · 2026-03-25T15:14:34Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/null.csv-spec

+null   | null
+;
+
+multipleAggsOverNullExpressions


Tests starting with this one were inspired by my previous unmerged PR #112392.

astefan · 2026-03-25T15:15:56Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/Sum.java

    public DataType dataType() {
        DataType dt = field().dataType();
+        if (dt == DataType.NULL) {
+            return DataType.NULL;


IMHO, a NULL data type should produce a NULL result. If it could produce a LONG, it could just as well produce a DOUBLE; I see no difference. Also, a NULL result is consistent with most of the other functions we have now.

I thought about this in #142657.

Generally, I think a NULL input should produce an output that is compatible with ("narrower" than) any of the outputs that would be produced if the NULL input were replaced by an input with a proper data type. The NULL type is (or should be) a bottom type in ESQL, so it should always be acceptable as output for a NULL input.

For SUM specifically, since the output is either long or double, a long output may also be acceptable. I think NULL is better (and more consistent with SQL!), but long is also acceptable because it is "narrower" (can be used in more places) than double; for instance, CASE(predicate, 1::long, x) requires x to be long, whereas for CASE(predicate, 1::double, x) both a double and a long are okay because CASE implicitly does a little bit of auto-casting.

The practical implication is that, for SUM(foo) used in a nullify query, some expressions depending on SUM(foo) start breaking when foo goes from mapped to unmapped, as long as SUM(null) is double. They would not break if SUM(null) were long or NULL.
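To make the before/after behavior discussed in this thread concrete, here is a minimal Java sketch. It is not the ESQL code: `Type` is a hypothetical, reduced stand-in for `DataType`, and `sumTypeBefore` is only an assumption about how the pre-fix rule let NULL fall through to DOUBLE. Only the NULL check in `sumTypeAfter` mirrors the diff above.

```java
public class SumTypeSketch {
    // Hypothetical, reduced stand-in for ESQL's DataType; only the types discussed here.
    enum Type { NULL, INTEGER, LONG, DOUBLE }

    // Assumed pre-fix rule: whole numbers sum to LONG, everything else to DOUBLE.
    // Under this rule NULL is "not a whole number" and ends up typed as DOUBLE.
    static Type sumTypeBefore(Type field) {
        boolean wholeNumber = field == Type.INTEGER || field == Type.LONG;
        return wholeNumber ? Type.LONG : Type.DOUBLE;
    }

    // Rule after the fix shown in the diff: NULL is checked first and returned as-is.
    static Type sumTypeAfter(Type field) {
        if (field == Type.NULL) {
            return Type.NULL;
        }
        return sumTypeBefore(field);
    }

    public static void main(String[] args) {
        // Print the output type of SUM for every input type, before and after the fix.
        for (Type t : Type.values()) {
            System.out.println("SUM(" + t + "): before=" + sumTypeBefore(t) + ", after=" + sumTypeAfter(t));
        }
    }
}
```

In this model, only the NULL row differs between the two rules, which is the narrow scope of the change described in the PR.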


elasticsearchmachine · 2026-03-26T05:55:57Z

Hi @astefan, I've created a changelog YAML for you.

elasticsearchmachine · 2026-03-26T05:55:57Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

alex-spies

LGTM, thanks @astefan!

alex-spies · 2026-03-26T08:59:55Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/Sum.java

    public DataType dataType() {
        DataType dt = field().dataType();
+        if (dt == DataType.NULL) {
+            return DataType.NULL;


I thought about this in #142657.

Generally, I think a NULL input should produce an output that is compatible with ("narrower" than) any of the outputs that would be produced if the NULL input were replaced by an input with a proper data type. The NULL type is (or should be) a bottom type in ESQL, so it should always be acceptable as output for a NULL input.

For SUM specifically, since the output is either long or double, a long output may also be acceptable. I think NULL is better (and more consistent with SQL!), but long is also acceptable because it is "narrower" (can be used in more places) than double; for instance, CASE(predicate, 1::long, x) requires x to be long, whereas for CASE(predicate, 1::double, x) both a double and a long are okay because CASE implicitly does a little bit of auto-casting.
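The "narrower" argument can be checked with a small model. The Java sketch below encodes only the rules stated in this comment (a long first branch requires long, a double first branch accepts double or long, and NULL as a bottom type is accepted anywhere). `Type` and `caseAccepts` are hypothetical names for illustration; this is not ESQL's actual CASE type resolution.

```java
public class CaseCompatSketch {
    // Hypothetical, reduced type set; just enough to model the comment's example.
    enum Type { NULL, LONG, DOUBLE }

    // Would CASE(predicate, <value of type first>, x) type-check under the rules stated above?
    static boolean caseAccepts(Type first, Type x) {
        if (first == Type.NULL || x == Type.NULL) {
            return true; // NULL modeled as a bottom type: compatible with any other type
        }
        if (first == Type.DOUBLE) {
            return x == Type.DOUBLE || x == Type.LONG; // long is implicitly cast to double
        }
        return x == first; // a long first branch requires a long second branch
    }

    public static void main(String[] args) {
        // Try each candidate output type for SUM(null) against both CASE forms.
        for (Type sumOfNull : Type.values()) {
            boolean ok = caseAccepts(Type.LONG, sumOfNull) && caseAccepts(Type.DOUBLE, sumOfNull);
            System.out.println("SUM(null) typed as " + sumOfNull + " fits both CASE forms: " + ok);
        }
    }
}
```

Under these assumptions, DOUBLE is the only candidate that is rejected by one of the two CASE forms, which matches the claim that NULL and long are both safe choices while double is not.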

alex-spies · 2026-03-26T09:00:21Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/Sum.java

Fix looks right to me.

alex-spies · 2026-03-26T09:00:33Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-nullify.csv-spec

 ;

-cost:double | time_bucket:datetime
+cost:null   | time_bucket:datetime


alex-spies · 2026-03-26T09:01:24Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/null.csv-spec

++, thank you!

alex-spies · 2026-03-26T09:02:57Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/null.csv-spec

+s:long | r:long
+null   | null


Subtle, but looks right!


bpintea · 2026-03-26T15:48:45Z

Nice.

* Correctly manage NULL data type for SUM by switching from returning a Double data type to the actual NULL data type.

Correctly manage NULL data type for SUM by switching from returning a Double data type to the actual NULL data type.

813b556

elasticsearchmachine added the v9.4.0 label Mar 25, 2026

astefan commented Mar 25, 2026


astefan added 4 commits March 25, 2026 17:59

add capability

ff11c3d

Merge branch 'main' of https://github.com/elastic/elasticsearch into 144914_fix

5f9e939

More capabilities

e31c098

Merge branch 'main' of https://github.com/elastic/elasticsearch into 144914_fix

e204268

astefan added >bug :Analytics/ES|QL AKA ESQL labels Mar 26, 2026

astefan requested review from alex-spies and bpintea March 26, 2026 05:55

astefan marked this pull request as ready for review March 26, 2026 05:55

Merge branch 'main' into 144914_fix

c9c8331

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 26, 2026

Update docs/changelog/144942.yaml

dccf93a

alex-spies approved these changes Mar 26, 2026


Merge branch 'main' of https://github.com/elastic/elasticsearch into 144914_fix

71470e7

astefan added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Mar 26, 2026

astefan merged commit e5aed38 into elastic:main Mar 26, 2026
37 checks passed

astefan deleted the 144914_fix branch March 26, 2026 11:45

seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 26, 2026

ESQL: Correctly manage NULL data type for SUM (elastic#144942)

1449765

* Correctly manage NULL data type for SUM by switching from returning a Double data type to the actual NULL data type.

seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 27, 2026

ESQL: Correctly manage NULL data type for SUM (elastic#144942)

486add8

* Correctly manage NULL data type for SUM by switching from returning a Double data type to the actual NULL data type.

mamazzol pushed a commit to mamazzol/elasticsearch that referenced this pull request Mar 30, 2026

ESQL: Correctly manage NULL data type for SUM (elastic#144942)

1ac8f38

* Correctly manage NULL data type for SUM by switching from returning a Double data type to the actual NULL data type.
