
Implement partitioned tpch/tpcds cbo plan test #11738

Merged
sopel39 merged 5 commits into trinodb:master from gaurav8297:gaurav8297/partition_test on Apr 14, 2022

Conversation

gaurav8297 (Member) commented Mar 31, 2022

Description

Issue: #11466

Is this change a fix, improvement, new feature, refactoring, or other?

Improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

CBO and Benchto

How would you describe this change to a non-technical end user or system administrator?

This includes two major changes:

  1. Instead of using the tpch/tpcds connectors, we now use a Hive connector with an in-memory metastore to run the CBO plan validation tests. This better reflects reality, i.e., the actual plans generated on the benchmark cluster; for instance, the algorithm the Hive metastore uses to calculate partition statistics differs from the one in the tpch/tpcds connectors. (See the sketch after this list.)
  2. Implemented the CBO plan test for partitioned tpch/tpcds tables.
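For context, here is a minimal sketch of what pointing plan tests at a Hive catalog backed by a local metastore can look like. This is not the PR's actual code: the session, catalog/schema names, and the use of TestingHiveConnectorFactory with FileHiveMetastore.createTestingFileHiveMetastore are assumptions based on Trino's test utilities and may differ between versions.

```java
import static io.trino.testing.TestingSession.testSessionBuilder;

import io.trino.Session;
import io.trino.plugin.hive.TestingHiveConnectorFactory;
import io.trino.plugin.hive.metastore.HiveMetastore;
import io.trino.plugin.hive.metastore.file.FileHiveMetastore;
import io.trino.testing.LocalQueryRunner;

import java.io.File;
import java.util.Map;

public final class PartitionedCboPlanTestSetup
{
    private PartitionedCboPlanTestSetup() {}

    // Hypothetical setup: a LocalQueryRunner with a Hive catalog backed by a
    // local file metastore, so plan tests see Hive-style (partition-level)
    // statistics rather than tpch/tpcds connector statistics.
    public static LocalQueryRunner createQueryRunner(File metastoreDirectory)
    {
        Session session = testSessionBuilder()
                .setCatalog("hive")
                .setSchema("tpch")
                .build();
        LocalQueryRunner queryRunner = LocalQueryRunner.create(session);

        HiveMetastore metastore = FileHiveMetastore.createTestingFileHiveMetastore(metastoreDirectory);
        queryRunner.createCatalog("hive", new TestingHiveConnectorFactory(metastore), Map.of());
        return queryRunner;
    }
}
```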

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

gaurav8297 marked this pull request as ready for review April 6, 2022 17:29
gaurav8297 (Member Author) commented:

@raunaqmorarka PTAL

martint (Member) commented Apr 6, 2022

Have we looked at making the TPC-H and TPC-DS connectors produce partitioned tables instead? I'm concerned about the amount of coupling between the benchmarks module and the Hive connector that this change is introducing. It will make it much harder to evolve the Hive connector in the future.

gaurav8297 (Member Author) commented Apr 6, 2022

> Have we looked at making the TPC-H and TPC-DS connectors produce partitioned tables instead?

We started with that, but the way the Hive connector fetches statistics for partitioned tables is different: it uses partition sampling and then estimates statistics across the different samples. It also stores statistics in a different format, PartitionStatistics, and converts them back to TableStatistics. So we thought it better to use the Hive connector itself, which reflects how we run benchmarks on the benchmark cluster. (A sketch of the sampling idea is below.)
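To illustrate the sampling idea described above (a simplified model with hypothetical names, not the connector's actual code; the real logic lives in the Hive connector's statistics machinery):

```java
import java.util.List;

// Hypothetical illustration of partition-sampled row-count estimation:
// sample some partitions, average their row counts, scale to all partitions.
final class SampledStatsEstimator
{
    static double estimateTotalRowCount(List<Long> sampledPartitionRowCounts, int totalPartitionCount)
    {
        if (sampledPartitionRowCounts.isEmpty()) {
            return Double.NaN; // no samples, no estimate
        }
        double averageRowsPerPartition = sampledPartitionRowCounts.stream()
                .mapToLong(Long::longValue)
                .average()
                .orElse(0);
        // Scale the per-partition average to the table's full partition count
        return averageRowsPerPartition * totalPartitionCount;
    }
}
```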

> I'm concerned about the amount of coupling between the benchmarks module and the Hive connector that this change is introducing. It will make it much harder to evolve the Hive connector in the future.

IIUC, the majority of the coupling is introduced to generate the gzipped statistics files in GlueStatisticsGenerator, not by the actual tests. So one option is to somehow remove GlueStatisticsGenerator; alternatively, we keep it as is and update GlueStatisticsGenerator whenever the Hive connector changes. Do you think it'll be a big problem?

cc @raunaqmorarka @sopel39 @martint

sopel39 (Member) commented Apr 6, 2022

@gaurav8297 Have you checked RecordingHiveMetastore? Its purpose is to dump metastore metadata. Maybe it can be used to save/load statistics for the plan tests? Gzipping there (HiveMetastoreRecording#writeRecording) shouldn't be an issue.
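A minimal sketch of the gzipped-recording idea in plain Java (hypothetical names, not HiveMetastoreRecording's actual code):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

final class GzippedRecordingWriter
{
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Serialize the recording as JSON and gzip it on the way out. Large
    // partition-level statistics compress very well, and the smaller file
    // is faster to write and to read back.
    static void writeRecording(Object recording, Path target)
            throws IOException
    {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            MAPPER.writeValue(out, recording);
        }
    }
}
```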

gaurav8297 (Member Author) commented:

@raunaqmorarka PTAL again

gaurav8297 requested a review from raunaqmorarka April 8, 2022 06:28
gaurav8297 requested a review from raunaqmorarka April 8, 2022 14:27
gaurav8297 (Member Author) commented:

@raunaqmorarka PTAL

raunaqmorarka (Member) left a comment:

lgtm % minor comments

In case of huge partitioned tables, the recording
file could grow very large due to partition-level
statistics. So it's better to compress the recording
file, which also makes reads and writes faster.
sopel39 (Member) left a comment:

nit: we might consider renaming the packages here in the future, as this is now more related to Hive stats than to the generic TPC-H/TPC-DS connectors

Review comment (Member):

> So, there are no min/max statistics for char-based columns.

Why?

gaurav8297 (Member Author) commented Apr 12, 2022:

I don't think we support min/max statistics for varchar and char columns in the Hive connector.

https://github.com/trinodb/trino/blob/master/plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftMetastoreUtil.java#L961

Q: This change is from 2018. Is it still the case that min/max statistics for char columns are not used by the optimizer?

Review comment (Member):

See StatsUtil#toStatsRepresentation; it's still the case. We rely more on NDV for those columns.
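For illustration, the heuristic behind relying on NDV: an equality predicate's selectivity can be estimated from the distinct-value count alone, without min/max (a textbook CBO estimate with hypothetical names, not Trino's exact implementation):

```java
// Standard CBO heuristic (illustrative only): without min/max statistics,
// an equality predicate on a char column can still be estimated from the
// number of distinct values (NDV) and the nulls fraction.
final class SelectivityExample
{
    static double equalitySelectivity(double distinctValuesCount, double nullsFraction)
    {
        if (distinctValuesCount <= 0) {
            return Double.NaN; // no estimate possible
        }
        // Assume a uniform distribution over the distinct non-null values
        return (1.0 - nullsFraction) / distinctValuesCount;
    }
}
```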

gaurav8297 (Member Author) commented:

@sopel39 PTAL

sopel39 (Member) left a comment:

lgtm % comments

Currently the recording metastore caches partition
statistics and values for a set of partitions rather
than per partition.
Instead of using ConnectorTableHandle, use
TableMetadata to find the table name in a generic
way in JoinOrderPrinter.
Instead of using the tpch/tpcds connectors, use an
in-memory Hive metastore with corresponding tables
to reflect the actual plans generated on the
benchmark cluster.

For instance, the algorithm used in the Hive metastore
to calculate partition statistics differs from the
tpch/tpcds connectors.
sopel39 (Member) commented Apr 14, 2022

Failed due to #11929

sopel39 merged commit ef09c41 into trinodb:master Apr 14, 2022
github-actions bot added this to the 378 milestone Apr 14, 2022
