ESQL: Add distribution strategy for external sources #143194

Merged: costin merged 1 commit into elastic:main from costin:esql/ds-distributed/stage-4 on Feb 27, 2026

@costin costin commented Feb 26, 2026

Add a pluggable distribution strategy for external source queries,
enabling future distributed execution across data nodes.

  • ExternalDistributionStrategy, ExternalDistributionContext,
    ExternalDistributionPlan, NodeEligibilityStrategy: core interfaces
    for deciding how external source splits should be distributed
  • CoordinatorOnlyStrategy, RoundRobinStrategy, AdaptiveStrategy:
    concrete implementations with shared round-robin assignment logic
  • Mapper: insert ExchangeExec above ExternalSourceExec when pipeline
    breakers are present, mirroring FragmentExec handling
  • ComputeService: resolve strategy from external_distribution pragma,
    apply strategy; always collapse exchanges to coordinator-only until
    data node dispatch is implemented (PR5)
  • QueryPragmas: add external_distribution setting (adaptive/coordinator_only/round_robin)
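
To illustrate the strategy shape described above, here is a minimal sketch of a coordinator-only and a round-robin assignment. It is not the code from this PR: the `SplitDistributionStrategy` interface, the `*Sketch` class names, the `assign` signature, and the use of plain strings for splits and node ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified interface; the real ExternalDistributionStrategy is not shown in this thread.
interface SplitDistributionStrategy {
    /** Returns a map from node id to the splits that node should read. */
    Map<String, List<String>> assign(List<String> splits, List<String> eligibleNodes, String coordinator);
}

// Keeps every split on the coordinator, regardless of which data nodes are eligible.
final class CoordinatorOnlySketch implements SplitDistributionStrategy {
    @Override
    public Map<String, List<String>> assign(List<String> splits, List<String> eligibleNodes, String coordinator) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        plan.put(coordinator, new ArrayList<>(splits));
        return plan;
    }
}

// Deals splits out to the eligible nodes in order, wrapping around when the nodes run out.
final class RoundRobinSketch implements SplitDistributionStrategy {
    @Override
    public Map<String, List<String>> assign(List<String> splits, List<String> eligibleNodes, String coordinator) {
        if (eligibleNodes.isEmpty()) {
            // Assumption: with no eligible data node, fall back to the coordinator.
            return new CoordinatorOnlySketch().assign(splits, eligibleNodes, coordinator);
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (int i = 0; i < splits.size(); i++) {
            String node = eligibleNodes.get(i % eligibleNodes.size());
            plan.computeIfAbsent(node, k -> new ArrayList<>()).add(splits.get(i));
        }
        return plan;
    }
}
```

With three splits and two eligible nodes, the round-robin sketch gives the first node the first and third split and the second node the second split.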

Relates #142996

Developed using AI-assisted tooling

@costin costin added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.4.0 labels Feb 26, 2026
@costin costin requested a review from bpintea February 26, 2026 23:03
@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine (Collaborator) commented

Hi @costin, I've created a changelog YAML for you.

Introduce a pluggable strategy pattern for deciding whether external
source queries should be distributed across data nodes or kept on the
coordinator. Until data node dispatch is implemented (PR5), all plans
collapse back to coordinator-only execution.

- ExternalDistributionStrategy, ExternalDistributionContext,
  ExternalDistributionPlan, NodeEligibilityStrategy: core interfaces
- CoordinatorOnlyStrategy, RoundRobinStrategy, AdaptiveStrategy:
  concrete implementations with shared round-robin assignment logic
- Mapper: insert ExchangeExec above ExternalSourceExec for pipeline
  breakers
- ComputeService: resolve strategy from external_distribution pragma,
  apply strategy, collapse exchanges until PR5
- QueryPragmas: add external_distribution setting
- Formatting fixes from spotlessApply across datasource files
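
The pragma resolution and the temporary collapse to coordinator-only execution could look roughly like the sketch below. The three accepted values come from the PR description; everything else is an assumption for illustration, including the `DistributionModeSketch` and `PlanResolverSketch` names, the choice of `adaptive` as the default for a missing value, and the boolean flag standing in for "data node dispatch is implemented".

```java
import java.util.Locale;

// Hypothetical enum mirroring the documented pragma values: adaptive, coordinator_only, round_robin.
enum DistributionModeSketch {
    ADAPTIVE, COORDINATOR_ONLY, ROUND_ROBIN;

    /** Parses the pragma value; unknown values raise IllegalArgumentException. */
    static DistributionModeSketch fromPragma(String value) {
        if (value == null || value.isBlank()) {
            return ADAPTIVE; // assumption: the default is not stated in this thread
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}

final class PlanResolverSketch {
    // Stand-in for the state described in the PR: data node dispatch is not implemented yet.
    static final boolean DATA_NODE_DISPATCH_IMPLEMENTED = false;

    /** Returns true only if the plan would actually run on data nodes. */
    static boolean runsDistributed(String pragmaValue, boolean strategyWantsDistribution) {
        DistributionModeSketch mode = DistributionModeSketch.fromPragma(pragmaValue);
        boolean requested = mode != DistributionModeSketch.COORDINATOR_ONLY && strategyWantsDistribution;
        // Until dispatch exists, every plan collapses back to the coordinator.
        return requested && DATA_NODE_DISPATCH_IMPLEMENTED;
    }
}
```

In this sketch, a strategy can still be resolved and applied, but its result has no effect on where the query runs while the flag is false.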

Developed using AI-assisted tooling
@costin costin force-pushed the esql/ds-distributed/stage-4 branch from b1d9b64 to 3cb02dc Compare February 27, 2026 06:54

@bpintea bpintea left a comment

LGTM.
🤖 Reviewed AI-assisted.

@costin costin merged commit 5b422b2 into elastic:main Feb 27, 2026
35 checks passed
@costin costin deleted the esql/ds-distributed/stage-4 branch February 27, 2026 08:09
PeteGillinElastic pushed a commit to PeteGillinElastic/elasticsearch that referenced this pull request Feb 27, 2026
szybia added a commit to szybia/elasticsearch that referenced this pull request Feb 27, 2026
…cations

* upstream/main: (35 commits)
  Create ARM bulk sqrI8 implementation (elastic#142461)
  Rework get-snapshots predicates (elastic#143161)
  Refactor downsampling fetchers and producers (elastic#140357)
  ESQL: Unmute test and add extra logging to generative test validation (elastic#143168)
  Fix metadata fields being nullified/loaded by unmapped_fields setting (elastic#143155)
  Determine remote cluster version (elastic#142494)
  Populate failure message for aborted clones (elastic#143206)
  Allow kibana_system role to read and manage logs streams (elastic#143053)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsLength} elastic#143224
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsByteLength} elastic#143223
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:docs.DocsBitLength} elastic#143222
  Fix FloatVectorScorerSupplier bulkScore bug (elastic#143211)
  ESQL: Add data node execution for external sources (elastic#143209)
  [ESQL] Cleanup commands docs (elastic#143058)
  [ML]Fix latest transforms disregarding updates when sort and sync fields are non-monotonic (elastic#142856)
  Mute org.elasticsearch.index.mapper.IpFieldMapperTests testSyntheticSourceInObject elastic#143212
  Tests: Fix StoreDirectoryMetricsIT (elastic#143084)
  ESQL: Add distribution strategy for external sources (elastic#143194)
  CSV IT spec (elastic#142585)
  Fix VectorScorerOSQBenchmark.score to read corrections properly (elastic#143137)
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0
