Skip to content

Implement ST_Area for SphericalGeography#12315

Merged
mbasmanova merged 1 commit intoprestodb:masterfrom
ochalouhi:Add_Spherical_ST_Area
Mar 4, 2019
Merged

Implement ST_Area for SphericalGeography#12315
mbasmanova merged 1 commit intoprestodb:masterfrom
ochalouhi:Add_Spherical_ST_Area

Conversation

@ochalouhi
Copy link
Contributor

@ochalouhi ochalouhi commented Feb 8, 2019

@mbasmanova mbasmanova changed the title Added Support for Spherical ST_Area Implement ST_Area for SphericalGeography Feb 8, 2019
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi Olivier, thank you for adding this function. I have some initial comments below.

@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch from b3b9374 to 78e5103 Compare February 12, 2019 17:50
@mbasmanova mbasmanova self-assigned this Feb 13, 2019
@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch 10 times, most recently from 558f98a to 9926a1b Compare February 14, 2019 18:22
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi Olivier, this is pretty cool. I took a first pass over the function implementation and test and have some comments. I haven't looked at the benchmark and documentation yet.

@mbasmanova
Copy link
Contributor

CC: @jwass @jagill

@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch 2 times, most recently from 423d65f to 25644de Compare February 15, 2019 05:10
@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch from 25644de to 447af0f Compare February 15, 2019 06:09
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, in the Presto codebase it is preferable to not add future looking functionality until that future arrives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that this check makes sense. Also, why hasProcessedPoint is set to true when this check fails? Valid geometries don't have identical vertexes, hence, this check should not matter as long as input is valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 3 options :

  1. Throw an exception because we assume that this is an invalid polygon
  2. Return the wrong result because it's an invalid polygon
  3. Filter it out, and return the right answer
    I've picked Add basic UncompressedColumnInput #3 but I'm open to any other strategy, I just want to make sure that it's a conscious decision

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgis does not seem to care about duplicate points (it does not report an error)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi #1 is the preferred option because it is consistent with other functions and the SQL specification.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, PostGIS doesn't seem to comply with the SQL specification which calls for errors to be raised on invalid input. Unlike PostGIS, Presto aims for full compliance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check should be removed, but if it stays, the code would be easier to read if it was rearranged like so:

if (<same-point>) {
   return;
}

<process the point>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will get rid of it.
Wouldn't it be better to check at polygon creation time ? This way, each method doing computations on a polygon could assume it is valid

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi Validity checks are very expensive, hence, we don't include them by default.

@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch 3 times, most recently from a13efa3 to 83f4179 Compare February 19, 2019 17:12
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi Thanks for fixing the OH and PA polygons. Would you add a note about this to the commit message? This would allow future maintainers to understand this change without looking too closely. The commit message also needs to include the results of the benchmark. Having these will allow for determining whether there is a performance diff later without reverting to this state and re-running the benchmark.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Math.abs call seems unnecessary here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, PostGIS doesn't seem to comply with the SQL specification which calls for errors to be raised on invalid input. Unlike PostGIS, Presto aims for full compliance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check should be removed, but if it stays, the code would be easier to read if it was rearranged like so:

if (<same-point>) {
   return;
}

<process the point>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic mutates internal state and makes it illegal to invoke getSphericalExcess multiple times. Perhaps, add a boolean to enforce that getSphericalExcess is called only once and add is never called after getSphericalExcess? It might be clearer to rename getSphericalExcess to computeSphericalExcess.

@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch 3 times, most recently from 0d56f50 to 274a003 Compare March 1, 2019 18:16
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ochalouhi Looks good to me % one comment.

Fixed OH and PA as they had duplicate vertices

STArea Benchmark :

Result "com.facebook.presto.plugin.geospatial.BenchmarkSTArea.stSphericalArea500k":
  328652966.567 ±(99.9%) 63645108.583 ns/op [Average]
  (min, avg, max) = (237252788.000, 328652966.567, 516285755.000), stdev = 73293801.693
  CI (99.9%): [265007857.984, 392298075.149] (assumes normal distribution)

Benchmark                            Mode  Cnt          Score          Error  Units
BenchmarkSTArea.stArea               avgt   20     137845.475 ±    28021.370  ns/op
BenchmarkSTArea.stArea500k           avgt   20   16663361.799 ±  2491691.163  ns/op
BenchmarkSTArea.stSphericalArea      avgt   20    3250484.666 ±   840187.416  ns/op
BenchmarkSTArea.stSphericalArea500k  avgt   20  328652966.567 ± 63645108.583  ns/op
@ochalouhi ochalouhi force-pushed the Add_Spherical_ST_Area branch from 274a003 to 7939b61 Compare March 1, 2019 20:03
@mbasmanova mbasmanova merged commit 18c69a7 into prestodb:master Mar 4, 2019
@mbasmanova
Copy link
Contributor

@ochalouhi Olivier, thank you for the contribution.

kokosing pushed a commit to kokosing/trino that referenced this pull request Apr 23, 2019
Fixed OH and PA as they had duplicate vertices

STArea Benchmark :

Result "com.facebook.presto.plugin.geospatial.BenchmarkSTArea.stSphericalArea500k":
  328652966.567 ±(99.9%) 63645108.583 ns/op [Average]
  (min, avg, max) = (237252788.000, 328652966.567, 516285755.000), stdev = 73293801.693
  CI (99.9%): [265007857.984, 392298075.149] (assumes normal distribution)

Benchmark                            Mode  Cnt          Score          Error  Units
BenchmarkSTArea.stArea               avgt   20     137845.475 ±    28021.370  ns/op
BenchmarkSTArea.stArea500k           avgt   20   16663361.799 ±  2491691.163  ns/op
BenchmarkSTArea.stSphericalArea      avgt   20    3250484.666 ±   840187.416  ns/op
BenchmarkSTArea.stSphericalArea500k  avgt   20  328652966.567 ± 63645108.583  ns/op

Extracted from: prestodb/presto#12315
kokosing pushed a commit to trinodb/trino that referenced this pull request Apr 23, 2019
Fixed OH and PA as they had duplicate vertices

STArea Benchmark :

Result "com.facebook.presto.plugin.geospatial.BenchmarkSTArea.stSphericalArea500k":
  328652966.567 ±(99.9%) 63645108.583 ns/op [Average]
  (min, avg, max) = (237252788.000, 328652966.567, 516285755.000), stdev = 73293801.693
  CI (99.9%): [265007857.984, 392298075.149] (assumes normal distribution)

Benchmark                            Mode  Cnt          Score          Error  Units
BenchmarkSTArea.stArea               avgt   20     137845.475 ±    28021.370  ns/op
BenchmarkSTArea.stArea500k           avgt   20   16663361.799 ±  2491691.163  ns/op
BenchmarkSTArea.stSphericalArea      avgt   20    3250484.666 ±   840187.416  ns/op
BenchmarkSTArea.stSphericalArea500k  avgt   20  328652966.567 ± 63645108.583  ns/op

Extracted from: prestodb/presto#12315
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants