[SPARK-26797][SQL][WIP][test-maven] Start using the new logical types API of Parquet 1.11.1 instead of the deprecated one #31685
Conversation
ParquetPartitionDiscoverySuite failed when executed after ParquetInteroperabilitySuite using Maven. The reason is that ParquetInteroperabilitySuite changes the time zone in one test case but doesn't restore the original one. This can easily be fixed by restoring the original time zone in a finally block.
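A minimal sketch of the fix described above, assuming the test mutates the JVM default time zone via java.util.TimeZone (the actual helper used in the suite may differ):

```scala
import java.util.TimeZone

// Save the JVM default time zone before the test changes it, and restore it
// in a finally block so suites that run afterwards (e.g.
// ParquetPartitionDiscoverySuite) still see the original setting.
val originalTz = TimeZone.getDefault
try {
  // Hypothetical test body that relies on a fixed zone.
  TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
  // ... run the interoperability checks here ...
} finally {
  TimeZone.setDefault(originalTz)
}
```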
Parquet 1.11.0 is officially released; there is no need to use the snapshot version.
Conflicts:
dev/deps/spark-deps-hadoop-2.7
dev/deps/spark-deps-hadoop-3.2
dev/run-tests.py
pom.xml
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRecordMaterializer.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/DataSourceReadBenchmark.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarQuerySuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelationSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetEncodingSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
695bb97 to 687e44b
sessionLocalTz and convertTz were doing the same thing; keep the version from master.
e42ade0 to 06aaa64
Jenkins test this please
Kubernetes integration test unable to build dist. exiting with code: 1
Hi @srowen, thanks for stopping by. :) I think this will need more work - there's a bunch of things that happened in between, not least the switch to Java 8+ datetime APIs and the addition of timestamp rebasing, among other things. Happy to take any pointers you'd have.
Oh, OK, I'm not sure myself. I see some tests to update for Parquet 1.12 already too.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
Trying to revive #23721 from @nandorKollar:
I intentionally left the conflicts in the merge commit so that it's clear how I've chosen (on a best-effort basis...) to resolve them - this is obviously WIP.
Also, please note that this is my first PR for Spark, so I'm probably in over my head, and I'm happy to close this PR if desired (or take any advice).