Skip to content

Use JTS instead of ESRI for geometries, Parts 1 and 2#13604

Merged
arhimondr merged 4 commits intoprestodb:masterfrom
jagill:use-jts
Dec 9, 2019
Merged

Use JTS instead of ESRI for geometries, Parts 1 and 2#13604
arhimondr merged 4 commits intoprestodb:masterfrom
jagill:use-jts

Conversation

@jagill
Copy link
Contributor

@jagill jagill commented Oct 25, 2019

I am in the process of refactoring the geospatial functions to use JTS instead of Esri. Why?

  • JTS is a more standard library, that is compatible with PostGIS and GEOS and most geometry libraries.
  • It conforms to the ISO standards better.
  • Serialization is about 80% of the CPU for many geometrical functions, and
  • JTS serializes about 20% faster than ESRI, and has the potential of serializing 10-20x faster.

Since geometries are always converted to/from slices, it's fine for some functions to use Esri and some to use JTS (this is the case already). This PR converts most of the "easy" functions (those that don't involve much in the way of logical or numerical changes). However, JTS uses the standard order for polygon rings (counter-clockwise) instead of Esri's clockwise. This means that many of the tests need to have the order of the vertices reversed. I've isolated all of those changes into the commit that uses JTS for conversion from/to WKT.

== RELEASE NOTES ==

General Changes
* Use JTS instead of Esri for many geometrical operations.
  + Polygon WKTs must have closed loops.  Previously Esri would close the loops for you.
  + Certain other invalid geometries will fail to be created from WKTs, such as
    `LINESTRING(0 0, 0 0, 0 0)`.
  + Returned WKTs may have a different point order.
  + Fixes incorrect calculation of extreme points in certain cases.

@jagill
Copy link
Contributor Author

jagill commented Oct 25, 2019

cc @arhimondr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make Polygon WKTs valid in test cases

Although it is by the standard - it is a breaking change. Do we care that if somebody had existing WKT stored in the database that were compatible with ESRI it will stop working?

If we are fine with that, should we have a better release note?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are this objects (CoordinateSequenceFactory, GeometryFactory) thread safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upd: those are threadsafe, but mutable. Let's make them private

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing this test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, certain types of invalid geometries could be created, but would have to be checked for explicitly in functions such as these. Now, those geometries fail on construction (I've updated the release notes to mention this).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to belong to the previous commit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing these two?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? Why the standard toString from the enum is not good enough? If this is something that we use for actual serialization / deserialization, should we have a separate method and not reuse the toString interface?

Copy link
Contributor Author

@jagill jagill Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't used for serde: it is used for a user-facing error message. It's a bit nicer and conforms to the previous error message. However, I have no strong feelings and I'm happy for users to read MULTI_LINE_STRING.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it: it's probably better to change the test error message than add a new method that must be maintained.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GeometrySerde.serialize and GeometrySerde.deserialize are confusing. Should we rename GeometrySerde to EsriGeometrySerde?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw IllegalArgument

@jagill jagill force-pushed the use-jts branch 3 times, most recently from 2e9e7d6 to 05bebdf Compare October 30, 2019 18:06
Copy link
Member

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me.

The only concern i have - is that we are braking backward compatibility by disallowing "invalid" polygons that don't have the last point. But as we discussed - that should be fine. As such polygons are "invalid".

@mbasmanova Would you like to have a look before we merge this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not using them outside anymore, could you please make them private?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, i see, you are making them private in the next commit. Could you please move it to the commit before?

@jagill jagill force-pushed the use-jts branch 2 times, most recently from dc3f559 to f6149ab Compare November 8, 2019 02:06
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 8, 2019

CLA Check
The committers are authorized under a signed CLA.

  • ✅ James Gill (7aa067b9283e4f310426ce00c913c3921b295a3a, 1417e52084d2f414e47f44cd747b86863c9bbef4, 3ddf604ba86da57076a4ad67771b322d7988a641, eb2793f3e1aa871c91d03262bffd69ab34323f92)

@jagill jagill force-pushed the use-jts branch 2 times, most recently from d0b4060 to 820ea91 Compare November 8, 2019 20:59
@jagill
Copy link
Contributor Author

jagill commented Nov 8, 2019

Running the verifier suite, I found only two discrepancies:

  1. The order of polygon WKTs changed in the expected way,
  2. I found some unexpected changes to ST_XMax (and Y and Min) that -- when investigated -- showed the new values matched manual testing. So I'll consider this an unintentional bug fix.

As a first step to converting to JTS, convert the WKT serde functions to
JTS.  While the code change here is small, JTS uses the standard order
for polygon rings (counter-clockwise), while Esri uses the reverse order
(clockwise).  This means many of the test cases need to have their
coordinates reversed.

Also, a couple tests which tested invalid geometries will actually be
caught in WKT parsing; so they were removed from the other function
tests.
Converting most unary properties and operators to JTS.  Binary relations
and operators were not touched, as well as unary operators that would
require significant logic changes.
@arhimondr arhimondr merged commit 46ae108 into prestodb:master Dec 9, 2019
@jagill jagill deleted the use-jts branch December 9, 2019 19:10
@caithagoras caithagoras mentioned this pull request Jan 22, 2020
6 tasks
@bhhari
Copy link
Contributor

bhhari commented Jan 29, 2020

fixes #14031

@yupeng9 yupeng9 mentioned this pull request Jul 9, 2020
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants