Add Hive storage format ESRI GeoJsonSerDe using JTS #28789
Closed
gertjanal wants to merge 15 commits into trinodb:user/dain/geo-jts
Conversation
Adds assertSpatialEquals helper to TestGeoFunctions that uses stEquals for geometry comparison. Converts testSTGeometryType and testSTBuffer to use the new helper. testSTBuffer was updated to use property-based assertions (ST_Envelope and ST_Area with tolerance) instead of exact WKT coordinate matching. This makes the tests stable across CPU architectures (ARM vs x86) where trigonometric functions can produce slightly different floating-point results.
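The tolerance-based approach described above can be sketched without any geometry library. This is only an illustration, not the PR's `assertSpatialEquals` or `testSTBuffer` code: the class and method names are hypothetical, and the 32-segment circle merely stands in for a buffer outline.

```java
// Hypothetical sketch: check a geometric property (area) within a tolerance
// instead of comparing exact coordinates, which can differ slightly per CPU.
final class ToleranceAssertionSketch
{
    private ToleranceAssertionSketch() {}

    // Approximates a circle around (cx, cy) as a closed ring of segments + 1 vertices.
    static double[][] circleRing(double cx, double cy, double radius, int segments)
    {
        double[][] ring = new double[segments + 1][];
        for (int i = 0; i < segments; i++) {
            double angle = 2 * Math.PI * i / segments;
            ring[i] = new double[] {cx + radius * Math.cos(angle), cy + radius * Math.sin(angle)};
        }
        ring[segments] = ring[0].clone(); // repeat the first vertex to close the ring
        return ring;
    }

    // Shoelace formula: absolute area enclosed by a closed ring.
    static double area(double[][] ring)
    {
        double sum = 0;
        for (int i = 0; i < ring.length - 1; i++) {
            sum += ring[i][0] * ring[i + 1][1] - ring[i + 1][0] * ring[i][1];
        }
        return Math.abs(sum) / 2;
    }

    // True when actual is within tolerance of expected.
    static boolean closeTo(double actual, double expected, double tolerance)
    {
        return Math.abs(actual - expected) <= tolerance;
    }
}
```

A test written this way passes as long as the area stays within the chosen tolerance, so last-digit differences in `Math.cos` or `Math.sin` results do not break it.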
Migrate simple geometry functions to use JTS library. Test updates for behavior differences:
- ST_Boundary returns LINESTRING instead of MULTILINESTRING for simple polygons
- ST_Buffer with infinity returns POLYGON EMPTY instead of MULTIPOLYGON EMPTY
- Minor floating-point precision differences in some calculations
Migrate ST_NumPoints and related accessor functions to JTS. Test updates for behavior differences:
- ST_NumPoints now counts closing vertices in polygons per OGC standard
- Ring vertex ordering may differ cosmetically (same geometry)
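To make the closing-vertex point concrete, here is a small sketch that counts the vertices of a closed ring under both conventions. The class, method, and flag are hypothetical and exist only to contrast the two counts. This is not Trino's ST_NumPoints implementation.

```java
// Hypothetical sketch: count ring vertices with or without the repeated closing vertex.
final class RingVertexCountSketch
{
    private RingVertexCountSketch() {}

    static int countVertices(double[][] ring, boolean includeClosingVertex)
    {
        if (ring.length < 2) {
            return ring.length;
        }
        double[] first = ring[0];
        double[] last = ring[ring.length - 1];
        // A ring is closed when its last vertex repeats the first one.
        boolean closed = first[0] == last[0] && first[1] == last[1];
        if (closed && !includeClosingVertex) {
            return ring.length - 1;
        }
        return ring.length;
    }
}
```

For the unit square ring (0 0, 1 0, 1 1, 0 1, 0 0), this yields 5 when the closing vertex is counted and 4 when it is not.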
Add JTS-compatible overloads for geometry utility methods to support incremental migration from ESRI to JTS. The ESRI versions remain for existing callers until they are converted.
Rewrite stUnion to use JTS UnaryUnionOp instead of ESRI cursors. Behavior differences:
- Point-on-line union does not insert vertices
- Empty inputs return empty geometry collection instead of null
3db3acf to df553d8
gertjanal commented Mar 20, 2026
    case ESRI -> EsriJsonParser.parseGeometry(parser);
    case GEO_JSON -> {
        String json = mapper.writeValueAsString(mapper.readTree(parser));
Not very proud of this, but the GeoJsonReader only accepts Reader and String input.
- Migrate spatial join operator to JTS for intersection and containment tests
- Switch GeoFunctions envelope operations to use JTS Envelope (deserializeEnvelope, ST_XMin/XMax/YMin/YMax, ST_IsEmpty)
Use Extended Well-Known Binary (EWKB) format for geometry serialization. EWKB is the standard used by PostGIS and retains the SRID (Spatial Reference System Identifier) for coordinate system information.
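As a rough illustration of how EWKB carries the SRID, the sketch below hand-encodes a 2D point in the PostGIS layout: a byte-order marker, a geometry type with the SRID flag `0x20000000` set, the SRID, then the coordinates. The class and method names are hypothetical and this is not Trino's serialization code, which may well rely on a library writer instead.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: encode a 2D point as little-endian EWKB with an embedded SRID.
final class EwkbPointSketch
{
    private static final int WKB_POINT = 1;
    private static final int EWKB_SRID_FLAG = 0x2000_0000;

    private EwkbPointSketch() {}

    static byte[] encodePoint(double x, double y, int srid)
    {
        ByteBuffer buffer = ByteBuffer.allocate(1 + 4 + 4 + 8 + 8).order(ByteOrder.LITTLE_ENDIAN);
        buffer.put((byte) 1);                      // byte order marker: 1 = little endian
        buffer.putInt(WKB_POINT | EWKB_SRID_FLAG); // geometry type with the "has SRID" bit
        buffer.putInt(srid);                       // SRID directly follows the type
        buffer.putDouble(x);
        buffer.putDouble(y);
        return buffer.array();
    }

    // Reads the SRID back from the header; fails if the SRID flag is not set.
    static int readSrid(byte[] ewkb)
    {
        ByteBuffer buffer = ByteBuffer.wrap(ewkb);
        buffer.order(buffer.get() == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        int type = buffer.getInt();
        if ((type & EWKB_SRID_FLAG) == 0) {
            throw new IllegalArgumentException("geometry has no embedded SRID");
        }
        return buffer.getInt();
    }

    static String toHex(byte[] bytes)
    {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args)
    {
        // SRID 4326 (WGS 84) with POINT (1 2)
        System.out.println(toHex(encodePoint(1, 2, 4326)));
    }
}
```

Plain WKB has no slot for the SRID, so a round trip through it would lose the coordinate system. The flag bit in the type word is what lets a reader tell the two layouts apart.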
Note: TestEsriTable's expected values file was converted from Trino's old internal binary format to WKT. This change cannot be separated into an earlier commit because the old format's deserializer was deleted in the EWKB commit, and circular Maven dependencies prevent adding geospatial as a test dependency to trino-hive.
With ESRI removed, JTS objects no longer need fully qualified names.
Change the internal representation of geometry values to use JTS Geometry objects directly, avoiding unnecessary serialization cycles between function calls.
Description
Support tables created with Row format com.esri.hadoop.hive.serde.GeoJsonSerDe and com.esri.json.hadoop.EnclosedGeoJsonInputFormat.
Originally started with PR #28592, but this PR is based on the JTS branch by @dain (#27881).
My tests work, but the destination branch has failing tests.
See
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( X ) Release notes are required, with the following suggested text: