Implement native ESRI reader #25241
findinpath
left a comment
After skimming the code and trying TestEsri, I understand the purpose of this contribution.
The referenced library, geometry-api-java, no longer seems to be actively maintained.
It would be useful to have a test that reads all supported types.
However, before any further changes are made, I think it is worth asking the maintainers @wendigo and @dain whether this contribution is functionally a good fit for inclusion in the Trino codebase.
dain
left a comment
I spent a good while reviewing this. Overall I think the approach is sound, but the code is missing defenses against bad data files. In this code we should strive to be bug-for-bug compatible with Hive, and this includes the handling of "bad" files, because users often rely on these undocumented behaviors.
Additionally, Jackson has some unexpected behaviors when recursing into nested structures, and this code falls into that trap. Specifically, the code isn't properly skipping nested data, which can result in unexpectedly processing the contents of objects that should have been skipped (I had to learn this the hard way a couple of years back). In general, I used (copied) the framework laid out in the JSON reader, which handles these issues.
Instead of adding a lot of mundane comments, I just applied them to the code, which you can find in this commit: dain@8b95a73
Finally, the tests seem to be missing cases for some of the supported attribute types. I see this when running the tests with coverage.
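For readers unfamiliar with the nested-data pitfall described in this review, here is a minimal sketch of the idea in plain Java. It is not the code from the PR or from dain's commit. The `Kind`, `Token`, `skipValue`, and `topLevelFieldNames` names are hypothetical stand-ins for a streaming parser's token stream, used so the example runs without dependencies. In Jackson itself, `JsonParser.skipChildren()` is the call that skips over a whole nested object or array.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SkipNestedSketch {
    // Hypothetical token kinds, loosely modeled on a streaming JSON parser.
    enum Kind { START_OBJECT, END_OBJECT, START_ARRAY, END_ARRAY, FIELD_NAME, VALUE }

    // Hypothetical token: a kind plus optional text (the field name or scalar value).
    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static Token tok(Kind kind, String text) {
        return new Token(kind, text);
    }

    // Consumes exactly one value. If the value is an object or array,
    // keeps consuming until the matching end token so that nothing
    // inside the nested structure is seen by the caller.
    static void skipValue(Iterator<Token> tokens) {
        int depth = 0;
        do {
            Kind kind = tokens.next().kind;
            if (kind == Kind.START_OBJECT || kind == Kind.START_ARRAY) {
                depth++;
            }
            else if (kind == Kind.END_OBJECT || kind == Kind.END_ARRAY) {
                depth--;
            }
        }
        while (depth > 0);
    }

    // Collects the field names of the outermost object only.
    // Without skipValue, nested field names would leak into the result.
    static List<String> topLevelFieldNames(List<Token> input) {
        Iterator<Token> tokens = input.iterator();
        if (!tokens.hasNext() || tokens.next().kind != Kind.START_OBJECT) {
            throw new IllegalArgumentException("expected START_OBJECT");
        }
        List<String> names = new ArrayList<>();
        while (tokens.hasNext()) {
            Token token = tokens.next();
            if (token.kind == Kind.END_OBJECT) {
                break;
            }
            if (token.kind != Kind.FIELD_NAME) {
                throw new IllegalArgumentException("expected FIELD_NAME");
            }
            names.add(token.text);
            skipValue(tokens);
        }
        return names;
    }
}
```

A loop that reads only the next token after each field name, instead of skipping the whole value, would treat field names inside a nested object as if they belonged to the outer object. Counting depth until the matching end token avoids that.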
This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.
@dain Thank you so much for the review and update! I just pulled your commit and made a new revision with an additional unit test called
My comments have been integrated. James will take over for the final review.
Co-authored-by: Dain Sundstrom <[email protected]>
pettyjamesm
left a comment
Approved, thanks @ljw9111!
Description
This PR implements a native ESRI reader for Esri JSON, which can be used for geospatial queries. (NOTE: this port supports only the UTC time zone.)
Users can now run geospatial queries on tables that use the ESRI SerDe.
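To illustrate what the UTC-only note could mean in practice, here is a minimal sketch. It assumes a date attribute arrives as epoch milliseconds, which is an assumption about the data and not something stated in this PR. The `UtcDateSketch` class and `epochMillisToUtc` method are hypothetical names and are not part of the reader. The conversion always uses UTC and never the JVM default time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class UtcDateSketch {
    // Hypothetical helper: interprets an epoch-millisecond value strictly in UTC,
    // so the result does not depend on the JVM's default time zone.
    static LocalDateTime epochMillisToUtc(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }
}
```

Pinning the zone explicitly keeps results stable across machines configured with different default time zones.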
DDL example
Example data is from https://docs.aws.amazon.com/athena/latest/ug/geospatial-example-queries.html
DML example
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: