[ESQL] Per-file filter pushdown awareness #145755

Merged
costin merged 2 commits into elastic:main from costin:esql/per-file-filter-pushdown
Apr 8, 2026

@costin
Member

@costin costin commented Apr 6, 2026

Make filter pushdown aware of per-file column availability and
types in UNION_BY_NAME scenarios. Files whose filter columns are
entirely absent are skipped at split discovery time. For files
that do contain the columns, pushed ESQL expressions are adapted
to the file's column set and re-translated to format-native
filters. Type-widened columns (e.g. INTEGER file vs LONG unified)
have their filter literals downcast with overflow detection.
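
A minimal sketch of the skip and adapt steps described above, not the PR's actual code. Every name here (`PerFileFilterSketch`, `Comparison`, `canSkipFile`, `adaptToFile`) is hypothetical. It assumes the filter is a conjunction of null-rejecting comparisons, so a file that has none of the filter columns cannot produce a matching row, and it assumes that conjuncts on missing columns are simply left out of the per-file pushed filter.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; these types do not come from the PR. */
final class PerFileFilterSketch {

    /** One conjunct of the filter, e.g. {@code status > 5}. */
    record Comparison(String column, String op, long literal) {}

    /**
     * Assumption: every conjunct is null-rejecting, so a file containing none
     * of the filter columns can be dropped at split discovery time.
     */
    static boolean canSkipFile(Set<String> fileColumns, List<Comparison> conjuncts) {
        return conjuncts.isEmpty() == false
            && conjuncts.stream().noneMatch(c -> fileColumns.contains(c.column()));
    }

    /** Keeps only the conjuncts whose column exists in this particular file. */
    static List<Comparison> adaptToFile(Set<String> fileColumns, List<Comparison> conjuncts) {
        return conjuncts.stream().filter(c -> fileColumns.contains(c.column())).toList();
    }

    public static void main(String[] args) {
        List<Comparison> filter = List.of(
            new Comparison("status", ">", 5),
            new Comparison("bytes", "<", 1024)
        );
        // File with neither filter column: skipped entirely.
        System.out.println(canSkipFile(Set.of("host"), filter));
        // File with only "bytes": one conjunct survives for re-translation.
        System.out.println(adaptToFile(Set.of("host", "bytes"), filter));
    }
}
```

The surviving conjuncts would then be handed to the format-specific translator; that step is omitted because its API is not shown in this conversation.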

Developed with AI-assisted tooling

@costin costin requested a review from bpintea April 6, 2026 17:54
@costin costin enabled auto-merge (squash) April 6, 2026 17:55
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Apr 6, 2026
@elasticsearchmachine
Collaborator

Hi @costin, I've created a changelog YAML for you.

@github-actions
Contributor

github-actions bot commented Apr 6, 2026

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

@costin costin closed this Apr 6, 2026
auto-merge was automatically disabled April 6, 2026 17:59

Pull request was closed

@costin costin reopened this Apr 6, 2026
@costin costin force-pushed the esql/per-file-filter-pushdown branch from bdbbba9 to 151f2a2 Compare April 7, 2026 18:38
@costin costin enabled auto-merge (squash) April 7, 2026 18:39
@elasticsearchmachine
Collaborator

Hi @costin, I've created a changelog YAML for you.

Comment on lines +520 to +528
/**
* Infers the file's native type from the unified attribute type and the cast target.
* The cast target is the unified (wider) type; the file has the narrower type.
*/
/**
* Infers the file's native type from the cast target. Only returns a narrower type when
* the adaptation is safe for integral comparisons (LONG→INTEGER). DOUBLE→INTEGER narrowing
* is not supported because literal truncation can cause incorrect predicate semantics.
*/
Contributor

Nit: redundancy

Member Author

Fixed — merged the two javadoc blocks into one.

Comment on lines +533 to +534
// DOUBLE→INTEGER narrowing is intentionally not supported: Number.longValue() truncates
// fractional values, which can change comparison semantics (e.g., col < 2.7 vs col < 2).
Contributor

Nit: if the method stays, this can be a javadoc comment and the method simplified.

Member Author

Done — moved inline comment into the javadoc and removed the redundant one.
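
For readers following the thread, the truncation concern in the quoted comment can be reproduced with a small standalone sketch. The class and method names are hypothetical; only `Number.longValue()` and the `col < 2.7` versus `col < 2` example come from the comment itself.

```java
/** Hypothetical illustration of why DOUBLE to INTEGER literal narrowing is unsafe. */
final class TruncationSketch {

    /** The predicate as written against the unified (DOUBLE) type. */
    static boolean original(int col, double literal) {
        return col < literal;
    }

    /** The predicate after naively narrowing the literal with Number.longValue(). */
    static boolean truncated(int col, double literal) {
        long narrowed = Double.valueOf(literal).longValue(); // 2.7 becomes 2
        return col < narrowed;
    }

    public static void main(String[] args) {
        for (int col = 1; col <= 3; col++) {
            System.out.println(
                "col=" + col + " original=" + original(col, 2.7) + " truncated=" + truncated(col, 2.7)
            );
        }
        // col=1 original=true truncated=true
        // col=2 original=true truncated=false   <- the row is wrongly filtered out
        // col=3 original=false truncated=false
    }
}
```

The row with `col = 2` satisfies `col < 2.7` but not `col < 2`, so a pushed filter built from the truncated literal would drop a row the query should return.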

* the adaptation is safe for integral comparisons (LONG→INTEGER). DOUBLE→INTEGER narrowing
* is not supported because literal truncation can cause incorrect predicate semantics.
*/
private static DataType inferFileType(DataType unifiedType, DataType castTarget) {
Contributor

unifiedType isn't used.

Member Author

Removed — the parameter was left over from when DOUBLE→INTEGER was considered.
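
As an illustration of the LONG→INTEGER case that remains supported, here is a rough sketch of a literal downcast with overflow detection. The class and method names are hypothetical, and the behavior on overflow (returning an empty result so the caller can decline to push that predicate) is an assumption, not something this conversation specifies.

```java
import java.util.OptionalInt;

/** Hypothetical illustration; not the PR's implementation. */
final class LiteralDowncastSketch {

    /**
     * Narrows a LONG filter literal to INTEGER for a file whose column is INTEGER.
     * Returns empty when the literal does not fit, which signals overflow.
     */
    static OptionalInt downcastToInt(long literal) {
        if (literal < Integer.MIN_VALUE || literal > Integer.MAX_VALUE) {
            return OptionalInt.empty();
        }
        return OptionalInt.of((int) literal);
    }

    public static void main(String[] args) {
        System.out.println(downcastToInt(42L));                    // fits in an int
        System.out.println(downcastToInt(Integer.MAX_VALUE + 1L)); // overflow detected
    }
}
```

Unlike the DOUBLE case, an in-range LONG literal converts to INTEGER without changing its value, so the comparison keeps its meaning.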

if (adapted.isEmpty()) {
return formatReader.withPushedFilter(null);
}
FilterPushdownSupport.PushdownResult result = pushdownSupport.pushFilters(adapted);
Contributor

Would probably be useful if we could cache the resolution at a level higher than per file (at some point).

Member Author

Agreed — we could cache the adapted PushdownResult keyed on the set of missing/widened columns so files with identical schemas share one translation.
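
One possible shape for such a cache, sketched under assumptions: the key type `SchemaDelta`, the class `AdaptedFilterCacheSketch`, and the generic result type `R` (standing in for the adapted pushdown result) are all hypothetical, and the translation step is passed in as a plain function because its real signature is not shown here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical illustration of caching one translation per distinct schema difference. */
final class AdaptedFilterCacheSketch<R> {

    /** Cache key: which filter columns a file lacks and which it stores in a narrower type. */
    record SchemaDelta(Set<String> missing, Set<String> widened) {
        SchemaDelta {
            // Defensive copies keep the key immutable and its equality content-based.
            missing = Set.copyOf(missing);
            widened = Set.copyOf(widened);
        }
    }

    private final Map<SchemaDelta, R> cache = new ConcurrentHashMap<>();
    private final Function<SchemaDelta, R> translator;

    AdaptedFilterCacheSketch(Function<SchemaDelta, R> translator) {
        this.translator = translator;
    }

    /** Files with the same delta share a single translation. */
    R resolve(SchemaDelta delta) {
        return cache.computeIfAbsent(delta, translator);
    }

    public static void main(String[] args) {
        AdaptedFilterCacheSketch<String> cache = new AdaptedFilterCacheSketch<>(
            delta -> "filter adapted for missing=" + delta.missing() + " widened=" + delta.widened()
        );
        SchemaDelta fileA = new SchemaDelta(Set.of("status"), Set.of());
        SchemaDelta fileB = new SchemaDelta(Set.of("status"), Set.of());
        // Both files have the same delta, so the second call reuses the cached instance.
        System.out.println(cache.resolve(fileA) == cache.resolve(fileB));
    }
}
```

Keying on the delta rather than on the file path means the number of translations grows with the number of distinct schemas, not the number of files.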

@costin costin merged commit 82359d3 into elastic:main Apr 8, 2026
34 of 35 checks passed
@costin costin deleted the esql/per-file-filter-pushdown branch April 8, 2026 15:05

Labels

:Analytics/ES|QL (AKA ESQL), >enhancement, ES|QL|DS (ES|QL datasources), Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v9.4.0
