Skip to content

Conversation

@skrzypo987
Copy link
Member

@skrzypo987 skrzypo987 commented Jun 21, 2022

Description

Removal of grouped execution as it can be easily replaced by Tardigrade.
The PR contains many small commits to help with the review process. Majority should be squashed/rearranged before the merge.
Most of the changes is just a mindless unused code removal. Some parts, however, required more robust refactorings. Since I am not an expert in many of the areas, some things has been done without full understanding.

Is this change a fix, improvement, new feature, refactoring, or other?

all of the above

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

all

How would you describe this change to a non-technical end user or system administrator?

Removal of grouped execution

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Section
* Removed grouped execution mechanism. `grouped-execution-enabled`, `dynamic-schedule-for-grouped-execution`, `concurrent-lifespans-per-task` config options and `grouped_execution`, `dynamic_schedule_for_grouped_execution` and `concurrent_lifespans_per_task` system properties are no longer available

@cla-bot cla-bot bot added the cla-signed label Jun 21, 2022
@skrzypo987 skrzypo987 marked this pull request as ready for review June 21, 2022 07:24
@skrzypo987 skrzypo987 requested review from losipiuk and sopel39 June 21, 2022 07:24
@skrzypo987
Copy link
Member Author

The testSingleVersionBackwardCompatibility is failing. I don't really know how to handle that.
@kokosing You made this test, any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this deprecated method now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My changes had nothing to do with that. The method is still used

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid issues with backward compatibility you would need to make things you remove deprecated first and add non deprecated methods.

@kokosing
Copy link
Member

The testSingleVersionBackwardCompatibility is failing. I don't really know how to handle that.

The best way is to run the test in the IDE. The diff is presented much better there than in CI.

@kokosing
Copy link
Member

2022-06-21T07:31:37.7682499Z but could not find the following element(s):
2022-06-21T07:31:37.7682783Z  <["Class: public static enum io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy",
2022-06-21T07:31:37.7683229Z     "Method: public static io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy[] io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy.values()",
2022-06-21T07:31:37.7683703Z     "Method: public static io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy.valueOf(java.lang.String)",
2022-06-21T07:31:37.7684233Z     "Field: public static io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy.UNGROUPED_SCHEDULING",
2022-06-21T07:31:37.7684881Z     "Field: public static io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy.GROUPED_SCHEDULING",
2022-06-21T07:31:37.7686488Z     "Method: public default io.trino.spi.connector.ConnectorSplitSource io.trino.spi.connector.ConnectorSplitManager.getSplits(io.trino.spi.connector.ConnectorTransactionHandle,io.trino.spi.connector.ConnectorSession,io.trino.spi.connector.ConnectorTableHandle,io.trino.spi.connector.ConnectorSplitManager$SplitSchedulingStrategy,io.trino.spi.connector.DynamicFilter,io.trino.spi.connector.Constraint)"]>

I think that is the exact list of things that you removed.

@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch 3 times, most recently from a82dd9b to e47429f Compare June 21, 2022 12:38
@skrzypo987
Copy link
Member Author

@kokosing I rolled back the deletion of the class and put deprecated annotations around (last commit). Is that the way to do it?

@kokosing
Copy link
Member

I rolled back the deletion of the class and put deprecated annotations around (last commit). Is that the way to do it?

Almost.

  1. Since deprecated method cannot be annotated with deprecated annotation any more I guess you should avoid changing it signature too. I think that next step should be to just remove it.
  2. Your new deprecated method is actually not used so if a plugin is using that method still instead of new one it won't work. So your change is still backward incompatible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would that make sense to leave test assertion for notColocated? Or is that covered by other tests. I would assume the latter, but can you check please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I did not found tests that strictly mimic this one. The question remains about the name of the test since it has nothing to do with grouped execution anymore

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that this property was misnamed and should have been called exposeTableRangePartitioning or sth. Not sure if that is applicable only for grouped exectution. Any Kudu experts on board? @MartinWeindel @grantatspothero ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very familiar with ConnectorTableProperties, are tablePartitioning and streamPartitioningColumns used for anything besides grouped execution?

Right now these two fields are only populated behind the kudu.grouped-execution.enabled flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does not look like specific to just GROUPED_EXECUTION

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the testSimple method and removed the grouped case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do that? There are calls with preferDynamic set to true and false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid you are right. I dropped the last commit.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK.
@dain would you like to skim?

@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch from c9932c3 to 86e7745 Compare June 23, 2022 07:41
@skrzypo987
Copy link
Member Author

Addressed comments in existing commits. The Kudu question remains

@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch from 86e7745 to 47399a5 Compare June 24, 2022 11:48
@skrzypo987
Copy link
Member Author

Rebased and added another commit that removes another stuff

@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch from 47399a5 to c52416d Compare June 27, 2022 12:50
@skrzypo987
Copy link
Member Author

Another rebase and this time added a commit requested by @sopel39. Also addressed small comments received offline.
If tests are ok we should start squashing commits to a more logical ones and it will be good to merge.

@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch from c52416d to 56439b4 Compare June 27, 2022 13:11
skrzypo987 added 10 commits June 28, 2022 10:11
This commit removes the feature of grouped execution. The code associated with
this feature is, however, mostly still present in the codebase. The commit only
removes config options and system session properties. The only additional
changes are for the purpose of successful compilation.
The dead code that is left in the project will be cleared in the following commits
The class is not useful after removal of grouped execution.
Major parts of PlanFragmenter class has been removed in the process, as well as
some tests strictly related to grouped execution
The class is not useful after removal of grouped execution.
The class is not useful after removal of grouped execution.
After the removal of grouped execution it is impossible for this object to be
anything else than the "taskWide" singleton. This object is at that point
completely useless and will exist just until its last occurrence is removed,
which will take place in the following commits.
The lifespan object is a useless singleton at this point so it is being
removed from:
-DriverContext
-DriverFactory
-DriverStats
-Join related classes
-Split and SplitAssignment
-LocalExchange. LocalExchangeFactory class is not needed anymore.
-SqlTaskExecution
It was always a `NOT_PARTITION` since the removal of grouped execution,
so it can be hardcoded at this point.
The class is not actually used since the removal of grouped execution.
The last usages removed in this commit are:
-TaskStatus class
-TaskContext class
-operator factories
-schedulers
It was always a `NOT_PARTITION` since the removal of grouped execution,
so it can be hardcoded at this point.
The ConnectorNodePartitioningProvider::listPartitionHandles method can also
be deprecated at this point
@skrzypo987 skrzypo987 force-pushed the skrzypo/073-remove-grouped-execution branch from 56439b4 to a5c5c21 Compare June 28, 2022 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants