Skip to content

Conversation

@HeartSaVioR
Copy link
Contributor

What changes were proposed in this pull request?

This removes the undocumented and incomplete feature of "stateful aggregation" in continuous mode, which would reduce 1100+ lines of code.

Why are the changes needed?

The work for the feature had been stopped for over an year, and no one asked/requested for the availability of such feature in community. Current state for the feature is that it only works with coalesce(1) which force the query to read and process, and write in "a" task, which doesn't make sense in production.

The remaining code increases the work on DSv2 changes as well - that's why I don't simply propose reverting relevant commits - the code path has been changed due to DSv2 evolution.

Does this PR introduce any user-facing change?

Technically no, because it's never documented and can't be used in production in current shape.

How was this patch tested?

Existing tests.

… in continuous mode

This removes the undocumented and incomplete feature of "stateful aggregation"
in continuous mode. The work for the feature had been stopped for over an year
and no one asked/requested for the availability of such feature in community.
@HeartSaVioR HeartSaVioR changed the title [SPARk-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode Jul 12, 2020
@HeartSaVioR
Copy link
Contributor Author

@SparkQA
Copy link

SparkQA commented Jul 13, 2020

Test build #125734 has finished for PR 29077 at commit 6a5965c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@jose-torres jose-torres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just a question about one change that might have been accidental.

Thanks for taking care of this.

if (operationCheckEnabled) {
UnsupportedOperationChecker.checkForContinuous(analyzedPlan, outputMode)
}
UnsupportedOperationChecker.checkForContinuous(analyzedPlan, outputMode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was removing the operationCheckEnabled flag intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not intentional. Nice finding. Thanks! I'll revert.

@SparkQA
Copy link

SparkQA commented Jul 14, 2020

Test build #125792 has finished for PR 29077 at commit 5459d58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

@jose-torres Would you mind going through another round of review? Thanks in advance!

@jose-torres
Copy link
Contributor

LGTM. I don't have the repo fully set up on my new computer, so I'll try to find time to set it up and merge tomorrow. (Or you can do it if you want to try out your new committer powers; congrats btw!)

@HeartSaVioR
Copy link
Contributor Author

Thanks for the reviewing and kind words :) I'll deal with merging.

@HeartSaVioR HeartSaVioR deleted the SPARK-31985 branch July 15, 2020 04:41
@HyukjinKwon
Copy link
Member

HyukjinKwon commented Jul 15, 2020

@HeartSaVioR, no big deal but let's make sure to leave a comment to mention which branch this PR went through.

@HeartSaVioR
Copy link
Contributor Author

Ah thanks for noticing. I missed it. It's merged only in master branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants