List/Map is not compatible with AWS Athena/Hive/PrestoDB #30
Comments
This is the schema generated by parquet.js for a list of elements
and the expected schema for PrestoDB/Hive is
Hey @dg3feiko,
I have installed your version like mentioned in the comment with
I generated new files and copied them to the S3 bucket, but I still have problems with the Athena query.
There is also #43. You could try installing a fork that has all my outstanding PRs here merged into master (including #43).
With your latest fork I can select simple fields in the first tier, but when I select a struct, Athena crashes with the message: HIVE_CURSOR_ERROR: Can not read value at 0 in block 0
I used 0.8.0 to convert a flat JSON file to Parquet and verified that I'm able to write it and read it back. I uploaded it to S3 and used Glue to create the Athena table. I'm unable to query the data for some reason, though; I get GENERIC_INTERNAL_ERROR: 0
I gave this a try recently in AWS with Athena + Presto using the latest from Root level primitives worked but nested lists failed:
+1
So I encountered the same issue and spent some time getting it to work. Here is a solution that seems to work, at least for my case of lists with structs: ZJONSSON#34
Added ENUM to types
I generated a Parquet file with parquet.js from data containing a list and a map, but the nested fields are not readable by AWS Athena, which is based on PrestoDB. I checked other implementations and it seems this is the reason: apache/parquet-java#411
Thank you for the great job all the same.
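For reference, here is a minimal sketch of the three-level LIST and MAP layouts described in the Parquet format specification, which is the shape Hive/Presto-style readers generally expect. It does not use parquet.js at all: the helpers `primitive`, `listField`, `mapField` and `message`, and the field names `tags` and `attributes`, are hypothetical and only print the schema text so it can be compared against what a writer actually produces.

```javascript
// Hypothetical helpers (NOT part of parquet.js): they only render schema text
// in the Parquet message format, following the LIST/MAP rules of the spec.

function indent(text, spaces) {
  const pad = ' '.repeat(spaces);
  return text.split('\n').map((line) => pad + line).join('\n');
}

// One primitive field, e.g. "optional binary element (UTF8);"
function primitive(repetition, type, name, annotation) {
  const suffix = annotation ? ` (${annotation})` : '';
  return `${repetition} ${type} ${name}${suffix};`;
}

// LIST: outer group annotated LIST -> repeated group "list" -> field "element".
function listField(name, element, repetition = 'optional') {
  const inner = primitive(
    element.repetition || 'optional', element.type, 'element', element.annotation);
  return [
    `${repetition} group ${name} (LIST) {`,
    '  repeated group list {',
    indent(inner, 4),
    '  }',
    '}',
  ].join('\n');
}

// MAP: outer group annotated MAP -> repeated group "key_value" -> "key" + "value".
// The key is always required; the value may be optional or required.
function mapField(name, key, value, repetition = 'optional') {
  return [
    `${repetition} group ${name} (MAP) {`,
    '  repeated group key_value {',
    indent(primitive('required', key.type, 'key', key.annotation), 4),
    indent(primitive(value.repetition || 'optional', value.type, 'value', value.annotation), 4),
    '  }',
    '}',
  ].join('\n');
}

function message(name, fields) {
  const body = fields.map((field) => indent(field, 2)).join('\n');
  return `message ${name} {\n${body}\n}`;
}

// Example field names are made up for illustration.
const schemaText = message('example', [
  listField('tags', { type: 'binary', annotation: 'UTF8' }),
  mapField('attributes',
    { type: 'binary', annotation: 'UTF8' },
    { type: 'int32' }),
]);

console.log(schemaText);
```

If the schema dumped from a generated file has the repeated field directly at the outer level, without the intermediate `list` or `key_value` group, that difference is a likely candidate for what the reader is rejecting.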