Adds ST_SIMPLIFY geo spatial function #136309

Merged
ncordon merged 64 commits into elastic:main from ncordon:st_simplify on Dec 18, 2025

@ncordon ncordon commented Oct 9, 2025

Adds the ST_SIMPLIFY geo spatial function, which simplifies a geo-shape using the Douglas-Peucker algorithm. Inspired by the corresponding Postgres function.

It can be tested as:

ROW geo_shape = TO_GEOSHAPE("POLYGON((0 0, 1 0.1, 2 0, 2 2, 1 1.9, 0 2, 0 0))") 
| EVAL result = st_simplify(geo_shape, 2)
| KEEP result
;

Its first argument is the geo shape to simplify. Its second argument is a tolerance, which must be non-negative and can be either an integer or a double.
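
For readers unfamiliar with the algorithm, here is a minimal sketch of Douglas-Peucker on an open line of 2D points. It is not the code from this PR: the class name `DouglasPeuckerSketch`, the `double[]{x, y}` point representation, and the choice of point-to-segment distance are assumptions made for illustration, and polygon rings or geographic coordinates would need extra handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Elasticsearch.
public class DouglasPeuckerSketch {

    /** Distance from point p to the segment a-b; each point is {x, y}. */
    static double distanceToSegment(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0];
        double dy = b[1] - a[1];
        double lengthSquared = dx * dx + dy * dy;
        if (lengthSquared == 0.0) {
            // Degenerate segment: a and b coincide, so measure to a.
            return Math.hypot(p[0] - a[0], p[1] - a[1]);
        }
        // Projection parameter of p onto a-b, clamped to stay on the segment.
        double t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSquared;
        t = Math.max(0.0, Math.min(1.0, t));
        return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
    }

    /** Returns a simplified copy of the line; the tolerance must be non-negative. */
    static List<double[]> simplify(List<double[]> points, double tolerance) {
        if (tolerance < 0) {
            throw new IllegalArgumentException("tolerance must be non-negative");
        }
        if (points.size() < 3) {
            return new ArrayList<>(points);
        }
        double[] first = points.get(0);
        double[] last = points.get(points.size() - 1);

        // Find the interior point farthest from the segment first-last.
        int farthest = -1;
        double maxDistance = 0.0;
        for (int i = 1; i < points.size() - 1; i++) {
            double d = distanceToSegment(points.get(i), first, last);
            if (d > maxDistance) {
                maxDistance = d;
                farthest = i;
            }
        }

        // Every interior point is within tolerance: keep only the endpoints.
        if (farthest < 0 || maxDistance <= tolerance) {
            List<double[]> result = new ArrayList<>();
            result.add(first);
            result.add(last);
            return result;
        }

        // Otherwise keep the farthest point and recurse on both halves.
        List<double[]> left = simplify(points.subList(0, farthest + 1), tolerance);
        List<double[]> right = simplify(points.subList(farthest, points.size()), tolerance);
        List<double[]> result = new ArrayList<>(left);
        result.addAll(right.subList(1, right.size())); // skip the shared point
        return result;
    }

    public static void main(String[] args) {
        List<double[]> line = List.of(
            new double[] { 0, 0 }, new double[] { 1, 0.1 }, new double[] { 2, 0 });
        System.out.println(simplify(line, 0.5).size());  // prints 2
        System.out.println(simplify(line, 0.05).size()); // prints 3
    }
}
```

A larger tolerance removes more vertices. In the sketch, the middle vertex sits 0.1 away from the segment joining its neighbours, so it is dropped at tolerance 0.5 and kept at 0.05.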

Closes #44747

@ncordon ncordon added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.3.0 labels Oct 9, 2025
@elasticsearchmachine
Collaborator

Hi @ncordon, I've created a changelog YAML for you.

@ncordon ncordon changed the title Adds ST_SIMPLIFY spatial funcion Adds ST_SIMPLIFY geo spatial function Oct 9, 2025
@elasticsearchmachine
Collaborator

Hi @ncordon, I've updated the changelog YAML for you.

@ncordon ncordon requested a review from craigtaverner October 17, 2025 09:49
github-actions bot commented Oct 20, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

@craigtaverner craigtaverner left a comment

Looks great!

double inputTolerance = getInputTolerance(toleranceExpression);
Object input = geometry.fold(foldCtx);
if (input instanceof List<?> list) {
// TODO: Consider if this should compact to a GeometryCollection instead, which is what we do for fields

I think we can fix this in a followup PR

@ncordon ncordon force-pushed the st_simplify branch 3 times, most recently from bb95796 to cc55c03 Compare December 15, 2025 16:33

@craigtaverner craigtaverner left a comment

LGTM. I have some comments, nothing critical. I'm a bit nervous about the change to SpatialDocValuesExtraction.java.

  exec.forEachDown(EvalExec.class, evalExec -> {
      for (Alias field : evalExec.fields()) {
-         field.forEachDown(SpatialGridFunction.class, spatialAggFunc -> {
+         field.forEachDown(SpatialDocValuesFunction.class, spatialAggFunc -> {

Oh, interesting. I was planning to expand the search to more than geogrid functions, but did not expect to do that so early. The initial goal was to search only for geogrid functions and then, if we do decide to use doc-values, apply that to all functions. My PR greatly expands the list of functions to apply to, but does not expand the list of functions to search. The reason is that geogrid functions have a canonical use case: calculating the grid and using it in the BY clause of the STATS command. That makes this a key feature and a hot path. For other functions, it is mostly odd edge cases that would benefit from this, but they all need to support it if it is triggered by geogrid functions.

All of the functions that would be affected by this decision are in tech-preview, and it could be argued that doing this for all is no more risky than doing it for grid functions only.

}
}

public void testStSimplifyUsesDocValues() {

With the logic I was expecting in the SpatialDocValuesExtraction class, I would have only expected to see ST_SIMPLIFY using doc-values if used together with a geogrid function. This test, of course, checks something else: that ST_SIMPLIFY itself can trigger the use of doc-values.

}

public void testStSimplifyUsesDocValues() {
for (boolean keepLocation : new boolean[] { false, true }) {

If we wrote a test with STATS instead of SORT, then we would not need keep-location. Better still, have both tests, one with STATS and one with SORT and keep-location.

}
var toleranceExpression = tolerance.fold(toEvaluator.foldCtx());
double inputTolerance = getInputTolerance(toleranceExpression);
if (spatialDocValues && geometry.dataType() == DataType.GEO_POINT) {

This code does not consider the case where spatialDocValues == true but the data type is not a point. That would indicate a planner bug, so I understand why the check is not here. Having it would, however, lead to a nicer error message. The message we get otherwise is the block ClassCastException, which can be a bit cryptic. On the other hand, most of the times I've seen that error, it was because we didn't have these evaluators at all.
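
A minimal sketch of the kind of guard described above, assuming hypothetical stand-ins throughout: the `DataType` enum, the `chooseEvaluator` method, and the returned labels are illustrative only and do not mirror the actual Elasticsearch classes or evaluator factories.

```java
// Hypothetical illustration class, not part of Elasticsearch.
public class DocValuesGuardSketch {

    // Stand-in for the real data type enum; only two values are needed here.
    enum DataType { GEO_POINT, GEO_SHAPE }

    /** Picks an evaluator label, failing early on an inconsistent plan. */
    static String chooseEvaluator(boolean spatialDocValues, DataType type) {
        if (spatialDocValues) {
            if (type != DataType.GEO_POINT) {
                // The planner should never request doc-values for non-point data,
                // so report it explicitly instead of failing later with a cast error.
                throw new IllegalStateException(
                    "doc-values requested for unsupported type: " + type);
            }
            return "doc-values";
        }
        return "source";
    }

    public static void main(String[] args) {
        System.out.println(chooseEvaluator(true, DataType.GEO_POINT));  // prints doc-values
        System.out.println(chooseEvaluator(false, DataType.GEO_SHAPE)); // prints source
    }
}
```

The extra branch costs one comparison per evaluator construction and turns a cryptic downstream failure into a message that names the offending type.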

@ncordon ncordon merged commit 2b82e31 into elastic:main Dec 18, 2025
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Dec 19, 2025
* upstream/main: (253 commits)
  Adds ST_SIMPLIFY geo spatial function (elastic#136309)
  Take control of max clause count verification in Lucene searcher (elastic#139752)
  [ML] Unmute Inference Test (elastic#139765)
  Parameterize the vector operation benchmark tests (elastic#139735)
  Fix node reduction pushdown tests for release tests (elastic#139548)
  Fix flakiness in TSDataGenerationHelper (elastic#139759)
  CPS: Copy existing resolved index expressions when constructing a new `SearchRequest` from an existing one (elastic#139596)
  Add release notes for v9.1.9 release (elastic#139674)
  Add lucene query for wildcards on high cardinality keyword fields. (elastic#139746)
  Suppress Tika entitlement warnings from AWT (elastic#139711)
  Check field storage when synthetic source is enabled, in tests (elastic#139715)
  Refactor VectorSimilarityType to know about its corresponding Function (elastic#139678)
  Merge fixes from patch branch back into main (elastic#139721)
  Define native bulk operations for vector square distance (elastic#139198)
  Use LongUpDownCounter for Linked Project Error Metrics (elastic#139657)
  ESQL: Add javadoc that explains version-aware planning (elastic#139706)
  Add helper to pick node for reindex relocation (elastic#139081)
  Fix auth serialization randomized version test (elastic#139182)
  ES|QL - Add parsing, preanalysis and analysis timing information to profile (elastic#139540)
  Mute org.elasticsearch.persistent.ClusterPersistentTasksCustomMetadataTests testMinVersionSerialization elastic#139741
  ...
Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Development

Successfully merging this pull request may close these issues.

Elastic-Spatial: Simplify geometries

6 participants