
Add pushdown for parquet timestamp predicate#4104

Merged
findepi merged 1 commit into trinodb:master from alexjo2144:parquet-timestamp-pushdown on Jun 24, 2020

Conversation

@alexjo2144
Member

No description provided.

@cla-bot cla-bot bot added the cla-signed label Jun 19, 2020

@findepi findepi left a comment


Nice!
editorial comments


That's not too long, but it would be slightly better to extract a QueryInfo fullQueryInfo variable.

@alexjo2144
Member Author

Updated, thanks

@alexjo2144 alexjo2144 force-pushed the parquet-timestamp-pushdown branch from 1e45799 to fba7bcc on June 23, 2020 16:13
@alexjo2144 alexjo2144 force-pushed the parquet-timestamp-pushdown branch from fba7bcc to 80ee8a5 on June 23, 2020 17:59
@findepi findepi merged commit dc1c442 into trinodb:master Jun 24, 2020
@findepi findepi added this to the 337 milestone Jun 24, 2020
@findepi
Member

findepi commented Jun 24, 2020

Merged, thanks!

@findepi findepi mentioned this pull request Jun 24, 2020
@ryanrupp
Member

Does this cover int64 columns logically typed as TIMESTAMP_MICROS or TIMESTAMP_MILLIS (and, I believe, timestamp nanos in newer versions of Parquet)? I wasn't familiar with ParquetTimestampUtils, which is used here, but it seems to reference int96 timestamps, which are deprecated in Parquet. Also, this is using BinaryStatistics then (probably what you get for int96-encoded timestamps? not familiar).

For reference, there was an older PR, #1999, that added pushdown for timestamps on int64 via LongStatistics.

@alexjo2144 alexjo2144 deleted the parquet-timestamp-pushdown branch July 14, 2020 18:10
@alexjo2144
Member Author

Hey Ryan, thanks for pointing that out. I think you're right; this only works for the legacy int96 timestamps. I didn't realize they were deprecated.

Do you know if Presto's Parquet reader/writer supports the new format, or should we file an issue to add it? There are probably some more changes needed to support the parametric timestamp types that Dain has been working on, too.

@ryanrupp
Member

@alexjo2144 I'm not sure what the writer actually writes out currently (my write path isn't through Presto at the moment). For the read path, I'm pretty sure it can read int64 TIMESTAMP_MILLIS, but I'm not sure about int64 logically typed as TIMESTAMP_MICROS. I thought of this because there was a post in Slack asking whether TIMESTAMP_MICROS is supported on the read path here.

Here's the documentation about timestamp handling in Parquet. One of the newer features is support for nanosecond precision stored in an int64 (with nanosecond precision in int96 being deprecated; see here).
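The unit distinction matters because an int64 timestamp stores a plain epoch offset in the annotated unit, so its min/max statistics compare correctly as ordinary longs. A minimal conversion sketch for a TIMESTAMP_MICROS value (the class and method names here are illustrative, not from the Presto codebase):

```java
import java.time.Instant;

public class Int64TimestampMicros {
    // An int64 column annotated TIMESTAMP(MICROS, ...) stores microseconds
    // since the Unix epoch; numeric order equals chronological order, so
    // LongStatistics min/max are directly usable for pushdown.
    static Instant fromEpochMicros(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }
}
```

Using floorDiv/floorMod keeps the conversion correct for pre-epoch (negative) values as well.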

@alexjo2144
Member Author

Gotcha. Looks like the default Presto Parquet writer uses int96, but there's a new writer, added the same day as these changes, that supports int64 at millisecond precision (#3400). That should help for testing.

I can work on putting up an updated version of #1999 to tie that in.

pettyjamesm referenced this pull request in pettyjamesm/prestosql Sep 4, 2020
Added in prestosql#4104, predicate pushdown for parquet INT96 timestamp
values can result in incorrect results even when stats appear valid by
checking min <= max. Parquet writers that produced statistics at all
were comparing min and max values as BINARY which is oblivious to the
semantics of how INT96 timestamps are encoded making them unusable.

Comparison of INT96 values for statistics was removed in PARQUET-1065
for all cases except when min == max, which would not be affected by
the comparison issue or other byte order issues that existed with
parquet BINARY types at the time. Any parquet file that contains INT96
statistics where min != max would have to have been written by an older
parquet writer that compared the values incorrectly, making those
statistics unusable.

This change disables parquet predicate pushdown on INT96 timestamps
except for when all rows have the same value (ie: min == max).
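The commit message above hinges on the INT96 layout: 12 bytes, with 8 bytes of little-endian nanoseconds-of-day followed by a 4-byte little-endian Julian day. A byte-wise BINARY comparison therefore inspects the nanos before the day and does not follow chronological order. A minimal decoding sketch (the class name is illustrative; Trino's actual helper lives in ParquetTimestampUtils):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

public class Int96TimestampDecoder {
    // Julian day number of the Unix epoch (1970-01-01)
    private static final int JULIAN_EPOCH_DAY = 2_440_588;
    private static final long SECONDS_PER_DAY = 86_400L;

    // Decode a 12-byte INT96 timestamp: 8 bytes little-endian
    // nanoseconds-of-day, then 4 bytes little-endian Julian day.
    // Because the day (the most significant part chronologically)
    // sits in the trailing bytes, lexicographic comparison of the
    // raw 12 bytes is meaningless for ordering timestamps.
    public static Instant decode(byte[] int96) {
        ByteBuffer buffer = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buffer.getLong();
        int julianDay = buffer.getInt();
        long epochSeconds = (julianDay - JULIAN_EPOCH_DAY) * SECONDS_PER_DAY
                + nanosOfDay / 1_000_000_000L;
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay % 1_000_000_000L);
    }
}
```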
martint referenced this pull request Sep 4, 2020
@ericxiao251

This improvement really helped with one of our queries, but I want to understand a bit more about what's actually happening with Presto taking advantage of the Parquet metadata.

What does the getDomain function in the TupleDomainParquetPredicate.java file do? How does leveraging timestamp metadata help when data is stored on disk in no particular order?

@alexjo2144
Member Author

So, Parquet files are separated into RowGroups, and each RowGroup gets an entry in the file footer with statistics information. This section of the code reads that footer and determines which sections of the file can be skipped by comparing the footer statistics with any filters from the query. getDomain returns that statistics information as a Domain that the engine can use for this comparison. Skipping whole groups reduces IO and also means less filtering once the data is loaded into the engine.
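A rough sketch of the pruning idea described above, with simplified stand-ins for the footer statistics and the query filter (the real code builds Trino Domain objects and intersects them; all names here are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;

public class RowGroupPruner {
    // Simplified stand-in for a RowGroup's min/max column statistics
    // from the Parquet file footer.
    record Stats(long min, long max) {}

    record RowGroup(String name, Stats stats) {}

    // Keep only row groups whose [min, max] statistics range overlaps
    // the query's filter range [filterMin, filterMax]; non-overlapping
    // groups cannot contain matching rows and are never read from disk.
    static List<RowGroup> prune(List<RowGroup> groups, long filterMin, long filterMax) {
        return groups.stream()
                .filter(g -> g.stats().max() >= filterMin && g.stats().min() <= filterMax)
                .collect(Collectors.toList());
    }
}
```

Note that this only helps when values are clustered so that row-group ranges are narrow; if every group's range spans the whole filter, nothing can be skipped, which is why sort order on the filtered column matters.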
