Implement ST_Area for SphericalGeography #12315
mbasmanova
left a comment
@ochalouhi Olivier, thank you for adding this function. I have some initial comments below.
mbasmanova
left a comment
@ochalouhi Olivier, this is pretty cool. I took a first pass over the function implementation and test and have some comments. I haven't looked at the benchmark and documentation yet.
presto-geospatial/src/main/java/com/facebook/presto/plugin/geospatial/GeoFunctions.java
In general, in the Presto codebase it is preferable not to add future-looking functionality until that future arrives.
I'm not convinced that this check makes sense. Also, why is hasProcessedPoint set to true when this check fails? Valid geometries don't have identical vertices; hence, this check should not matter as long as the input is valid.
We have three options:
1. Throw an exception because we assume that this is an invalid polygon
2. Return the wrong result because it's an invalid polygon
3. Filter it out, and return the right answer
I've picked #3, but I'm open to any other strategy. I just want to make sure that it's a conscious decision.
PostGIS does not seem to care about duplicate points (it does not report an error).
@ochalouhi #1 is the preferred option because it is consistent with other functions and the SQL specification.
Indeed, PostGIS doesn't seem to comply with the SQL specification, which calls for errors to be raised on invalid input. Unlike PostGIS, Presto aims for full compliance.
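For illustration, option #1 (reject the input instead of silently filtering it) could look roughly like the sketch below. Everything in it is hypothetical: the class and method names, the `{latitude, longitude}` array layout, and the use of `IllegalArgumentException` as a stand-in for whatever error type the Presto codebase would actually raise.

```java
// Hypothetical sketch: fail fast on a ring that repeats a vertex back to back.
final class DuplicateVertexCheckSketch {
    private DuplicateVertexCheckSketch() {}

    // ring: vertices as {latitude, longitude} pairs (assumed layout).
    static void checkNoConsecutiveDuplicates(double[][] ring) {
        for (int i = 1; i < ring.length; i++) {
            boolean sameLatitude = ring[i][0] == ring[i - 1][0];
            boolean sameLongitude = ring[i][1] == ring[i - 1][1];
            if (sameLatitude && sameLongitude) {
                // Stand-in exception type; a real implementation would use
                // the project's own user-facing error.
                throw new IllegalArgumentException(
                        "Polygon is not valid: identical consecutive vertices at index " + i);
            }
        }
    }

    public static void main(String[] args) {
        // A closed ring repeats its first vertex at the end; that is not a
        // consecutive duplicate, so this call passes.
        checkNoConsecutiveDuplicates(new double[][] {{0, 0}, {0, 1}, {1, 1}, {0, 0}});
        System.out.println("ring accepted");
    }
}
```

This only catches back-to-back duplicates, which is much cheaper than a full validity check.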
I think this check should be removed, but if it stays, the code would be easier to read if it was rearranged like so:
if (<same-point>) {
    return;
}
<process the point>
I will get rid of it.
Wouldn't it be better to check at polygon creation time? This way, each method doing computations on a polygon could assume it is valid.
@ochalouhi Validity checks are very expensive; hence, we don't include them by default.
mbasmanova
left a comment
@ochalouhi Thanks for fixing the OH and PA polygons. Would you add a note about this to the commit message? This would allow future maintainers to understand this change without looking too closely. The commit message also needs to include the results of the benchmark. Having these will allow for determining whether there is a performance diff later without reverting to this state and re-running the benchmark.
presto-geospatial/src/main/java/com/facebook/presto/plugin/geospatial/GeoFunctions.java
The Math.abs call seems unnecessary here.
This logic mutates internal state and makes it illegal to invoke getSphericalExcess multiple times. Perhaps add a boolean to enforce that getSphericalExcess is called only once and that add is never called after getSphericalExcess? It might be clearer to rename getSphericalExcess to computeSphericalExcess.
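A minimal sketch of that guard, assuming a small accumulator class: the class name, its fields, and the "pending" finalisation step are hypothetical placeholders for whatever state the real code mutates, and `IllegalStateException` stands in for the project's own precondition helper.

```java
// Hypothetical sketch: a one-shot accumulator guarded by a boolean flag.
final class SphericalExcessSketch {
    private double sum;      // excess accumulated so far
    private double pending;  // last contribution, folded in lazily
    private boolean done;    // set once the result has been computed

    void add(double edgeExcess) {
        if (done) {
            throw new IllegalStateException("add called after computeSphericalExcess");
        }
        sum += pending;
        pending = edgeExcess;
    }

    double computeSphericalExcess() {
        if (done) {
            throw new IllegalStateException("computeSphericalExcess called more than once");
        }
        done = true;
        // Finalising step mutates internal state, hence the one-shot contract.
        sum += pending;
        pending = 0;
        return sum;
    }

    public static void main(String[] args) {
        SphericalExcessSketch tracker = new SphericalExcessSketch();
        tracker.add(1.0);
        tracker.add(2.0);
        System.out.println(tracker.computeSphericalExcess());
    }
}
```

The "compute" prefix signals to callers that the method does work and changes state, which a plain getter name would hide.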
mbasmanova
left a comment
@ochalouhi Looks good to me modulo one comment.
Fixed OH and PA as they had duplicate vertices

STArea Benchmark:

Result "com.facebook.presto.plugin.geospatial.BenchmarkSTArea.stSphericalArea500k":
  328652966.567 ±(99.9%) 63645108.583 ns/op [Average]
  (min, avg, max) = (237252788.000, 328652966.567, 516285755.000), stdev = 73293801.693
  CI (99.9%): [265007857.984, 392298075.149] (assumes normal distribution)

Benchmark                            Mode  Cnt          Score          Error  Units
BenchmarkSTArea.stArea               avgt   20     137845.475 ±    28021.370  ns/op
BenchmarkSTArea.stArea500k           avgt   20   16663361.799 ±  2491691.163  ns/op
BenchmarkSTArea.stSphericalArea      avgt   20    3250484.666 ±   840187.416  ns/op
BenchmarkSTArea.stSphericalArea500k  avgt   20  328652966.567 ± 63645108.583  ns/op
@ochalouhi Olivier, thank you for the contribution.
Extracted from: prestodb/presto#12315
See https://www.movable-type.co.uk/scripts/latlong.html
and http://osgeo-org.1560.x6.nabble.com/Area-of-a-spherical-polygon-td3841625.html
and https://www.element84.com/blog/determining-if-a-spherical-polygon-contains-a-pole
for the underlying maths.
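As a rough illustration of the kind of computation those references describe, the sketch below sums a per-edge spherical excess and scales it by the squared radius. It is not taken from the PR: the class and method names, the `{latitude, longitude}` degree layout, and the radius constant are assumptions, and it deliberately ignores polygons that enclose a pole or cross the antimeridian.

```java
// Hypothetical sketch: area of a simple spherical polygon from per-edge excess.
final class SphericalAreaSketch {
    // Approximate mean Earth radius in metres (assumed value for the demo).
    static final double EARTH_RADIUS_M = 6_371_000.0;

    private SphericalAreaSketch() {}

    // ring: vertices as {latitude, longitude} in degrees; the ring is closed
    // implicitly, so repeating the first vertex at the end is optional.
    static double area(double[][] ring, double radius) {
        double excess = 0;
        int n = ring.length;
        for (int i = 0; i < n; i++) {
            double[] a = ring[i];
            double[] b = ring[(i + 1) % n];
            double tanHalfLat1 = Math.tan(Math.toRadians(a[0]) / 2);
            double tanHalfLat2 = Math.tan(Math.toRadians(b[0]) / 2);
            double tanHalfDeltaLon = Math.tan(Math.toRadians(b[1] - a[1]) / 2);
            // Signed excess of the quadrilateral between this edge and the equator.
            excess += 2 * Math.atan2(
                    tanHalfDeltaLon * (tanHalfLat1 + tanHalfLat2),
                    1 + tanHalfLat1 * tanHalfLat2);
        }
        // The sign depends on vertex orientation, so take the magnitude.
        return Math.abs(excess) * radius * radius;
    }

    public static void main(String[] args) {
        // One octant of the sphere: two equator points and the north pole.
        double[][] octant = {{0, 0}, {0, 90}, {90, 0}};
        System.out.println(area(octant, EARTH_RADIUS_M) + " square metres");
    }
}
```

On a unit sphere the octant above should come out at one eighth of the full surface, which makes it a convenient sanity check.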