@mukulmurthy
Contributor

What changes were proposed in this pull request?

The leftover state from running a continuous processing streaming job should not affect later microbatch execution jobs. If a continuous processing job runs and the same thread gets reused for a microbatch execution job in the same environment, the microbatch job could get wrong answers because it can attempt to load the wrong version of the state.

How was this patch tested?

New and existing unit tests

@mukulmurthy
Contributor Author

@tdas and @jose-torres for review

@SparkQA

SparkQA commented Sep 11, 2018

Test build #95908 has finished for PR 22386 at commit c2f813b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


test("is_continuous_processing property should be true for continuous processing") {
val input = ContinuousMemoryStream[Int]
var x: String = ""
Member

unused?

@xuanyuanking
Member

xuanyuanking commented Sep 11, 2018

If a continuous processing job runs and the same thread gets reused for a microbatch execution job in the same environment

I'm a little confused about this scenario; could you explain more? Does this only happen in unit tests, or could we hit it in a production environment?

Contributor

@jose-torres jose-torres left a comment

LGTM other than 1 nit + the comment from @xuanyuanking

}

object ContinuousExecution {
val IS_CONTINUOUS_PROCESSING = "__is_continuous_processing"
Contributor

nit: I think this belongs in StreamExecution, since both ContinuousExecution and MicroBatchExecution set it.

val currentVersion = EpochTracker.getCurrentEpoch match {
case None => storeVersion
case Some(value) => value
val isContinuous = Option(ctxt.getLocalProperty(ContinuousExecution.IS_CONTINUOUS_PROCESSING))
Member

Would a simple toBoolean be OK here? You set a default value on both the MicroBatch and Continuous sides.

Contributor Author

I think I'd rather keep it as is to be more resilient for the future.

@mukulmurthy
Contributor Author

If a continuous processing job runs and the same thread gets reused for a microbatch execution job in the same environment

I'm a little confused about this scenario; could you explain more? Does this only happen in unit tests, or could we hit it in a production environment?

@xuanyuanking It theoretically could have been encountered in production, but continuous processing is considered an experimental feature. The only way to encounter it in production is to run a continuous processing stream and then a microbatch stream in the same Spark cluster and have an execution thread get reused. The bug is in StateStoreRDD: EpochTracker sets a ThreadLocal variable called currentEpoch, and StateStoreRDD checks for the existence of this variable to decide whether the current streaming job is continuous or microbatch. A value left behind on a reused thread therefore makes a microbatch job look continuous, and it loads the state at the stale epoch instead of the requested store version.
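
For readers following along, the failure mode described above can be reproduced without Spark. The following is only a minimal sketch under stated assumptions: EpochTrackerSketch, ThreadReuseDemo, versionFromThreadLocal and versionFromFlag are hypothetical names made up for illustration, and the single-thread executor stands in for an executor thread that gets reused across jobs. It is not the code in StateStoreRDD.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical stand-in for EpochTracker: a per-thread "current epoch".
object EpochTrackerSketch {
  private val currentEpoch = new ThreadLocal[Option[Long]] {
    override def initialValue(): Option[Long] = None
  }
  def set(epoch: Long): Unit = currentEpoch.set(Some(epoch))
  def get: Option[Long] = currentEpoch.get()
}

object ThreadReuseDemo {
  // Pre-fix style: infer "continuous" from the thread-local alone.
  def versionFromThreadLocal(storeVersion: Long): Long =
    EpochTrackerSketch.get.getOrElse(storeVersion)

  // Post-fix style: consult an explicit per-job flag before the thread-local.
  def versionFromFlag(isContinuous: Boolean, storeVersion: Long): Long =
    if (isContinuous) EpochTrackerSketch.get.getOrElse(storeVersion) else storeVersion

  // Returns (version chosen by the old logic, version chosen by the new logic).
  def run(): (Long, Long) = {
    val pool = Executors.newSingleThreadExecutor() // one thread, so it is always reused
    try {
      // A "continuous" task leaves epoch 7 behind on the pooled thread.
      pool.submit(new Runnable {
        def run(): Unit = EpochTrackerSketch.set(7L)
      }).get()

      // A "microbatch" task on the same thread asks for store version 3.
      pool.submit(new Callable[(Long, Long)] {
        def call(): (Long, Long) = (
          versionFromThreadLocal(storeVersion = 3L),
          versionFromFlag(isContinuous = false, storeVersion = 3L)
        )
      }).get()
    } finally {
      pool.shutdown()
    }
  }
}
```

In this sketch, ThreadReuseDemo.run() yields the stale epoch for the thread-local-only logic and the requested store version for the flag-based logic, which is the distinction the patch introduces by setting an explicit local property on both execution modes.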

case Some(value) => value
val isContinuous = Option(ctxt.getLocalProperty(StreamExecution.IS_CONTINUOUS_PROCESSING))
.map(_.toBoolean)
val currentVersion = if (isContinuous.contains(true)) {
Contributor

super nit: this looks weird. I would rather change the previous line to val isContinuous = ... .map(_.toBoolean).getOrElse(false)
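
As a side note on the two spellings discussed in this thread, here is a small sketch showing that they agree for a missing, true, or false property. It assumes java.util.Properties as a stand-in for the task's local properties (the diff reads the flag through ctxt.getLocalProperty instead), and FlagParsing, viaContains and viaGetOrElse are hypothetical names used only for illustration.

```scala
import java.util.Properties

object FlagParsing {
  // Key string taken from the constant shown in the diff.
  val Key = "__is_continuous_processing"

  // Spelling in the diff: keep the Option and test it with contains(true).
  def viaContains(props: Properties): Boolean =
    Option(props.getProperty(Key)).map(_.toBoolean).contains(true)

  // Spelling suggested in the nit: collapse to a plain Boolean up front.
  def viaGetOrElse(props: Properties): Boolean =
    Option(props.getProperty(Key)).map(_.toBoolean).getOrElse(false)
}
```

Both forms wrap the lookup in Option, so a property that was never set (getProperty returns null) is treated as "not continuous" rather than being passed to toBoolean directly. That is the resilience argument made earlier in the review; the choice between the two is stylistic.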

@mukulmurthy mukulmurthy changed the title [SPARK-25399] Continuous processing state should not affect microbatch execution jobs [SPARK-25399][SS] Continuous processing state should not affect microbatch execution jobs Sep 11, 2018
@tdas
Contributor

tdas commented Sep 11, 2018

LGTM. Just one super nit.

@SparkQA

SparkQA commented Sep 11, 2018

Test build #95959 has finished for PR 22386 at commit 3ebbed3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 11, 2018

Test build #95961 has finished for PR 22386 at commit 4d4beef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Sep 11, 2018
…batch execution jobs

Closes #22386 from mukulmurthy/25399-streamthread.

Authored-by: Mukul Murthy <[email protected]>
Signed-off-by: Tathagata Das <[email protected]>
(cherry picked from commit 9f5c5b4)
Signed-off-by: Tathagata Das <[email protected]>
@asfgit asfgit closed this in 9f5c5b4 Sep 11, 2018
@xuanyuanking
Member

xuanyuanking commented Sep 12, 2018

Many thanks for your comment and fix, @mukulmurthy! We'll also port this to our fork.

fjh100456 pushed a commit to fjh100456/spark that referenced this pull request Sep 13, 2018
…batch execution jobs

Closes apache#22386 from mukulmurthy/25399-streamthread.

Authored-by: Mukul Murthy <[email protected]>
Signed-off-by: Tathagata Das <[email protected]>
@mukulmurthy mukulmurthy deleted the 25399-streamthread branch September 17, 2018 21:45