[ES|QL] improve available column lists#233221

Merged
drewdaemon merged 104 commits into elastic:main from drewdaemon:align-column-existence
Sep 10, 2025

Conversation

@drewdaemon drewdaemon commented Aug 27, 2025

Summary

We want one system that efficiently computes a list of available columns at any given command position in the query. This is crucial for both validation (column existence checks) and autocomplete (suggested columns).

The chosen system is getColumnsByTypeHelper which relies on each command's columnsAfter method to report any changes to the column list made by that command.
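
The per-command approach described above can be pictured as a fold over the command list. The following is a minimal sketch, not the actual `getColumnsByTypeHelper` implementation: `ColumnData`, `Command`, `columnsAtEachCommand`, and the example commands are hypothetical names chosen for illustration, and the real `ESQLColumnData` interface and `columnsAfter` signature may differ.

```typescript
// Hypothetical, simplified column shape; the real ESQLColumnData may differ.
interface ColumnData {
  name: string;
  type: string;
  userDefined: boolean;
}

// Simplified command: columnsAfter receives the columns available before
// the command and returns the columns available after it.
interface Command {
  name: string;
  columnsAfter: (previous: ColumnData[]) => ColumnData[];
}

// For each command index, returns the columns available *before* that command.
function columnsAtEachCommand(source: ColumnData[], commands: Command[]): ColumnData[][] {
  const result: ColumnData[][] = [];
  let current = source;
  for (const command of commands) {
    result.push(current);
    current = command.columnsAfter(current);
  }
  return result;
}

// Illustrative EVAL-like command: adds (or replaces) one user-defined column.
const evalCommand = (name: string, type: string): Command => ({
  name: 'eval',
  columnsAfter: (previous) => [
    ...previous.filter((c) => c.name !== name),
    { name, type, userDefined: true },
  ],
});

// Illustrative DROP-like command: removes one column.
const dropCommand = (dropped: string): Command => ({
  name: 'drop',
  columnsAfter: (previous) => previous.filter((c) => c.name !== dropped),
});

const sourceFields: ColumnData[] = [
  { name: '@timestamp', type: 'date', userDefined: false },
  { name: 'bytes', type: 'long', userDefined: false },
];

// FROM a | EVAL doubled = bytes * 2 | DROP bytes | LIMIT 10
const perCommand = columnsAtEachCommand(sourceFields, [
  evalCommand('doubled', 'long'),
  dropCommand('bytes'),
  { name: 'limit', columnsAfter: (previous) => previous },
]);
```

In this sketch, `perCommand[1]` (the list seen by `DROP`) contains `doubled`, while `perCommand[0]` (the list seen by `EVAL`) does not, which is the property that both validation and autocomplete rely on.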

Key changes

  • collectUserDefinedColumns has been subsumed by command-specific columnsAfter methods
  • a unique column list is computed for each command during validation (ref). Prior to this change, we were collecting all columns available anywhere in the query and using that list.
  • additional fields (such as from a lookup index or an enrichment policy) are added to the column list within the columnsAfter methods of the respective commands
  • user-defined columns and fields from ES are now merged into a single list on the context object called columns. This list contains objects defined by the ESQLColumnData interface. The two types of column are still differentiated per-object (specifically by the presence of userDefined: true/false)
  • the getQueryForFields function is now internal to getColumnsByTypeHelper
  • each JOIN and ENRICH command in a query generates its own network call. These calls happen serially. (This is potentially a bit slower than what was happening before. There are some optimizations we could do, but we can monitor for feedback before investing.)
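
To make the serial-request behavior from the last bullet concrete, here is a small sketch. It assumes a hypothetical `FieldFetcher` function standing in for whatever call retrieves lookup-index or enrich-policy fields; `RemoteCommand`, `collectColumnsSerially`, and the in-memory `fakeFetch` are likewise invented for illustration and are not part of the PR.

```typescript
// Hypothetical fetcher: stands in for the request that returns the fields
// of a lookup index or an enrich policy.
type FieldFetcher = (target: string) => Promise<string[]>;

// Hypothetical description of a command that needs remote fields.
interface RemoteCommand {
  kind: 'join' | 'enrich';
  target: string;
}

async function collectColumnsSerially(
  initial: string[],
  commands: RemoteCommand[],
  fetchFields: FieldFetcher
): Promise<string[]> {
  let columns = initial;
  for (const command of commands) {
    // Awaiting inside the loop means each request finishes before the next starts.
    const extra = await fetchFields(command.target);
    const known = columns;
    columns = known.concat(extra.filter((name) => known.indexOf(name) === -1));
  }
  return columns;
}

// In-memory stand-in for the network that records when each call starts and ends.
const callLog: string[] = [];
const fakeFetch: FieldFetcher = async (target) => {
  callLog.push('start:' + target);
  await Promise.resolve();
  callLog.push('end:' + target);
  const table: Record<string, string[]> = { lookup_idx: ['region'], my_policy: ['owner'] };
  return table[target] || [];
};

// FROM a | LOOKUP JOIN lookup_idx ON host | ENRICH my_policy
const resultPromise = collectColumnsSerially(
  ['host'],
  [
    { kind: 'join', target: 'lookup_idx' },
    { kind: 'enrich', target: 'my_policy' },
  ],
  fakeFetch
);
```

One possible optimization of the kind hinted at above would be to start all requests at once (for example with `Promise.all`) and then apply the results in command order; whether that is worthwhile depends on how many such commands real queries contain.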

User-facing improvements

lookup index fields suggested

Screenshot 2025-09-05 at 11 13 26 AM

enrichment policy fields suggested

Screenshot 2025-09-05 at 11 18 29 AM

enrichment fields correctly validated

Screenshot 2025-09-05 at 11 21 14 AM

here agent.keyword is a field available in the source index, but it is not an enrichment field so it is not available in WITH
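
The rule in the note above can be sketched as a check of the `WITH` fields against the policy's enrich fields rather than against the source index. This is an illustrative sketch only: `EnrichPolicy`, `invalidWithFields`, `columnsAfterEnrich`, and the example policy are hypothetical, and renaming via `WITH new_name = field` is deliberately left out.

```typescript
// Hypothetical, simplified enrich policy description.
interface EnrichPolicy {
  name: string;
  matchField: string;
  enrichFields: string[];
}

// Fields named in WITH that the policy does not provide.
function invalidWithFields(policy: EnrichPolicy, withFields: string[]): string[] {
  return withFields.filter((field) => policy.enrichFields.indexOf(field) === -1);
}

// Columns available after ENRICH: the previous columns plus the requested
// enrich fields, or all enrich fields when no WITH clause is given.
function columnsAfterEnrich(
  previous: string[],
  policy: EnrichPolicy,
  withFields: string[]
): string[] {
  const requested = withFields.length > 0 ? withFields : policy.enrichFields;
  const added = requested.filter((field) => policy.enrichFields.indexOf(field) !== -1);
  return previous.concat(added.filter((field) => previous.indexOf(field) === -1));
}

const policy: EnrichPolicy = {
  name: 'hosts_policy',
  matchField: 'host.name',
  enrichFields: ['os', 'owner'],
};
const sourceColumns = ['host.name', 'agent.keyword'];

// ... | ENRICH hosts_policy WITH os, agent.keyword
const invalid = invalidWithFields(policy, ['os', 'agent.keyword']);

// ... | ENRICH hosts_policy
const afterEnrich = columnsAfterEnrich(sourceColumns, policy, []);
```

Here `agent.keyword` ends up in `invalid` even though it exists in `sourceColumns`, mirroring the screenshot.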

catching columns used before they are defined

Screenshot 2025-09-05 at 11 12 14 AM
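
A use-before-defined check falls out of walking the commands in order and validating references against only the columns defined so far. The sketch below assumes a hypothetical `Step` shape and an invented error message format; it is not the validator's actual code. Note that only referenced names are checked, not the names being assigned.

```typescript
// Hypothetical, simplified view of one command.
interface Step {
  text: string; // source text, used only in messages
  uses: string[]; // column names the command reads
  defines: string[]; // column names the command creates
}

function findUnknownColumns(sourceFields: string[], steps: Step[]): string[] {
  const available = new Set<string>(sourceFields);
  const messages: string[] = [];
  for (const step of steps) {
    // Check references against the columns available *before* this command.
    for (const name of step.uses) {
      if (!available.has(name)) {
        messages.push('Unknown column "' + name + '" in "' + step.text + '"');
      }
    }
    // Newly defined columns become available to later commands only.
    for (const name of step.defines) {
      available.add(name);
    }
  }
  return messages;
}

// FROM a | EVAL b = total + 1 | EVAL total = bytes * 2
const problems = findUnknownColumns(['bytes'], [
  { text: 'EVAL b = total + 1', uses: ['total'], defines: ['b'] },
  { text: 'EVAL total = bytes * 2', uses: ['bytes'], defines: ['total'] },
]);
```

With a single query-wide column list, `total` would have been considered known in the first `EVAL`; with the per-command list it is reported.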

Checklist

  • [x] Unit or functional tests were updated or added to match the most common scenarios

Release note

The ES|QL autocomplete feature has been extended so that fields from the lookup index are suggested after a complete LOOKUP JOIN command. The same enhancement has been applied to columns from an enrichment policy after a complete ENRICH command.

The client-side validator's column detection has also been refined. It now reports use-before-defined errors and reliably knows the type of each column.

@stratoula stratoula left a comment

Inlinestats bug got fixed. Added 2 more comments but I am approving as there is no need to review again.

Wonderful cleanup 👏

}

context?.fields.set('_fork', {
context?.columns.set('_fork', {
Contributor

Can we also remove this, @drewdaemon?

Contributor Author

Gone in 383cdbd

type: 'keyword',
},
]);
context?.columns.set(targetName, {
Contributor

And this

@bartoval bartoval Sep 10, 2025

We no longer want to use custom column names like RERANK col0 = ?. If we remove that, we will probably get a validation error. I also need this input to remove the col0 suggestion from autocomplete.

Contributor

I don't think we will with Drew's changes in this PR, but maybe I am wrong. Drew, keep me honest here.

@drewdaemon drewdaemon Sep 10, 2025

Stratoula is right. As of this change, we no longer check the left side of an assignment against the column list. We never should have done it... it was a fluke that those columns were being included in the list before they were even created in the query.

Contributor Author

Gone in f64c58e

{ name: '@timestamp', type: 'date', userDefined: false },
];

const queryString = `FROM a | STATS AVG(field1) BY buckets=BUCKET(@timestamp,50,?_tstart,?_tend)`;
Contributor

we are testing stats instead of inlinestats. Why are all the previous tests gone?

Contributor

oh good catch

Contributor Author

I guess inlinestats is just my bogeyman

Contributor Author

updated in 74432cb

> Why are all the previous tests gone?

I thought this was adequate. INLINESTATS now uses the STATS columnsAfter under the hood, which is tested extensively in the STATS tests. Not sure, though. I could see an argument the other way, too, if we were doing black-box testing.

@drewdaemon drewdaemon enabled auto-merge (squash) September 10, 2025 17:02
elasticmachine commented Sep 10, 2025

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #17 / INLINESTATS gets the columns after the query

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| onechat | 485 | 489 | +4 |
| unifiedSearch | 453 | 459 | +6 |
| total | | | +10 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/esql-validation-autocomplete | 34 | 29 | -5 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| console | 195.3KB | 195.3KB | +15.0B |
| esql | 270.3KB | 270.3KB | +15.0B |
| onechat | 838.2KB | 236.8KB | -601.5KB |
| total | | | -601.4KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/esql-validation-autocomplete | 35 | 32 | -3 |

References to deprecated APIs

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/esql-validation-autocomplete | 19 | 13 | -6 |

History

@drewdaemon drewdaemon merged commit e5edbf5 into elastic:main Sep 10, 2025
12 checks passed
drewdaemon added a commit that referenced this pull request Sep 12, 2025
## Summary

Fixes the column name detection (bug introduced in
#233221)

**Before**
<img width="711" height="203" alt="Screenshot 2025-09-11 at 9 00 34 AM"
src="https://github.com/user-attachments/assets/285a97a5-4341-4f25-a969-46cd00cc669b"
/>

**After**
<img width="703" height="176" alt="Screenshot 2025-09-11 at 9 42 13 AM"
src="https://github.com/user-attachments/assets/69f923f6-7d15-4cf9-aeaa-027544240c2e"
/>
KodeRad pushed a commit to KodeRad/kibana that referenced this pull request Sep 15, 2025
## Summary

We want _one_ system that efficiently computes a list of available
columns at any given command position in the query. This is crucial for
both validation (column existence checks) and autocomplete (suggested
columns).

The chosen system is
[`getColumnsByTypeHelper`](https://github.com/elastic/kibana/pull/233221/files#diff-1aaabc255965859d8acff49327980a6d72ce47858f252dab6257bf74719b971bR88)
which relies on each command's `columnsAfter` method to report any
changes to the column list made by that command.

**Key changes**
- `collectUserDefinedColumns` has been subsumed by command-specific
`columnsAfter` methods
- a unique column list is computed for each command during validation
([ref](https://github.com/elastic/kibana/pull/233221/files#diff-ff2bbabf0d966619b3e9d5449b4ae7ad70095f7f2ad0029c7e376d8460a25f08R142)).
Prior to this change, we were collecting all columns available
_anywhere_ in the query and using that list.
- additional fields (such as from a lookup index or an enrichment
policy) are added to the column list within the `columnsAfter` methods
of the respective commands
- user-defined columns and fields from ES are now merged into a single
list on the context object called `columns`. This list contains objects
defined by the `ESQLColumnData` interface. The two types of column are
still differentiated per-object (specifically by the presence of
`userDefined: true/false`)
- the `getQueryForFields` function is now internal to
`getColumnsByTypeHelper`
- each `JOIN` and `ENRICH` command in a query generates its own network
call. These calls happen serially. (This is potentially a bit slower than
what was happening before. There are some optimizations we could do, but
we can monitor for feedback before investing.)

### User-facing improvements

#### lookup index fields suggested

<img width="849" height="272" alt="Screenshot 2025-09-05 at 11 13 26 AM"
src="https://github.com/user-attachments/assets/ed0972ed-86d1-4fd8-a904-eade0c672509"
/>

#### enrichment policy fields suggested

<img width="816" height="105" alt="Screenshot 2025-09-05 at 11 18 29 AM"
src="https://github.com/user-attachments/assets/6fb9c4c7-c12d-4129-96c1-a1f5aca4b171"
/>

#### enrichment fields correctly validated

<img width="516" height="81" alt="Screenshot 2025-09-05 at 11 21 14 AM"
src="https://github.com/user-attachments/assets/41e22b90-1d57-4b0d-a2ad-0bac493ccfb7"
/>

_here `agent.keyword` is a field available in the source index, but it
is not an enrichment field so it is not available in `WITH`_

#### catching columns used before they are defined

<img width="270" height="88" alt="Screenshot 2025-09-05 at 11 12 14 AM"
src="https://github.com/user-attachments/assets/4d8ecc02-5415-4f18-ab7f-154141e3630a"
/>


### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

### Release note

The ES|QL autocomplete feature has been extended so that fields from the
lookup index are suggested after a complete `LOOKUP JOIN` command. The
same enhancement has been applied to columns from an enrichment policy
after a complete `ENRICH` command.

The client-side validator's column detection has also been refined. It
now reports use-before-defined errors and reliably knows the type of
each column.
KodeRad pushed a commit to KodeRad/kibana that referenced this pull request Sep 15, 2025
drewdaemon added a commit that referenced this pull request Sep 17, 2025
## Summary

#233221 made the column suggestion
incrementer look _behind_ the current command for user-defined columns.
This created a regression when multiple columns are defined within a
single command (minus `EVAL` which is handled specially).

In the end, we decided that it isn't important to account for dropped
columns when calculating the new column suggestion, so we just search
the whole query for column names as we did before
#233221.

Without fix
<img width="864" height="171" alt="Screenshot 2025-09-12 at 3 42 35 PM"
src="https://github.com/user-attachments/assets/967187c7-0a8a-4680-aee3-3799d4af126e"
/>

With fix
<img width="862" height="162" alt="Screenshot 2025-09-12 at 3 43 25 PM"
src="https://github.com/user-attachments/assets/64ef7247-d91e-454e-8d55-b137cf97e610"
/>

Co-authored-by: Stratou <efstratia.kalafateli@elastic.co>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Sep 24, 2025
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Sep 24, 2025
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Sep 24, 2025
niros1 pushed a commit that referenced this pull request Sep 30, 2025
niros1 pushed a commit that referenced this pull request Sep 30, 2025
niros1 pushed a commit that referenced this pull request Sep 30, 2025
@drewdaemon drewdaemon deleted the align-column-existence branch October 7, 2025 20:56
rylnd pushed a commit to rylnd/kibana that referenced this pull request Oct 17, 2025
Labels

  • backport:skip: This PR does not require backporting
  • enhancement: New value added to drive a business result
  • Feature:ES|QL: ES|QL related features in Kibana
  • release_note:enhancement
  • Team:ESQL: ES|QL related features in Kibana
  • t//
  • v9.2.0

6 participants