Skip to content

Use stage input partitioning partitionCount instead of output partitioning#16052

Merged
raunaqmorarka merged 2 commits intotrinodb:masterfrom
gaurav8297:fix_partition_count
Feb 15, 2023
Merged

Use stage input partitioning partitionCount instead of output partitioning#16052
raunaqmorarka merged 2 commits intotrinodb:masterfrom
gaurav8297:fix_partition_count

Conversation

@gaurav8297
Copy link
Member

@gaurav8297 gaurav8297 commented Feb 9, 2023

Description

Instead of using output partitioning scheme's partitionCount, use the input partitionCount for selecting the number of nodes for that stage.

Additional context and related issues

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

Copy link
Contributor

@radek-kondziolka radek-kondziolka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments:
0. Very good change with renaming partitioningScheme -> outputPartitioningScheme.

  1. I would not add the new field PlanFragment.partitionCount. We can just use outputPartitioningScheme.partitionCount. What is meaning of PlanFragment.partitionCount? I think you mean the number of input partition to the current stage. If yes, it could be taken from children, using StageManager.getChidlren. It is more explicit way to reason about and less confusing. Then we have clear relationship between partitioning: I am HASH partitioned stage and my children's output is HASH partitioned so I can ask them about number of partitions, what do you think?
  2. I would fire the rule DeterminePartitionCount if we detect that there is no not-SystemPartitioningHandle. Otherwise, you can have a plan with empty partitionCount and then adaptive hash-partition-count does not work.

@gaurav8297 gaurav8297 marked this pull request as ready for review February 13, 2023 06:10
Copy link
Contributor

@radek-kondziolka radek-kondziolka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I would add tests when we have a plan with stages:

S0: [any not SystemPartitioningHandle]
S1: [HASH]

and

S0: [HASH]
S1: [any not SystemPartitioningHandle]

@raunaqmorarka raunaqmorarka merged commit e705e1e into trinodb:master Feb 15, 2023
@github-actions github-actions bot added this to the 407 milestone Feb 15, 2023
@radek-kondziolka
Copy link
Contributor

radek-kondziolka commented Feb 15, 2023

What's about map

Map<PartitioningHandle, NodePartitionMap> partitioningCacheMap = new HashMap<>();
here? Don't we want to key that map with PartitioningHandle and PartitionCount?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants