[SPARK-31278][SS] Fix StreamingQuery output rows metric #28040

brkyvz · 2020-03-26T19:41:06Z

What changes were proposed in this pull request?

In Structured Streaming, we provide progress updates every 10 seconds when a stream doesn't have any new data upstream. When providing this progress though, we zero out the input information but not the output information. This PR fixes that bug.
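A minimal sketch of the behavior described above, for illustration only. It uses simplified stand-in case classes, not Spark's actual progress types or the real ProgressReporter code; every name below (`SourceStats`, `SinkStats`, `Progress`, `ProgressModel`) is hypothetical.

```scala
// Simplified stand-ins for the progress objects; NOT Spark's real classes.
final case class SourceStats(numInputRows: Long)
final case class SinkStats(numOutputRows: Long)
final case class Progress(batchId: Long, sources: Seq[SourceStats], sink: SinkStats)

object ProgressModel {
  // Behavior described as the bug: the periodic "no new data" progress
  // resets the input counts but carries over the last output count.
  def idleProgressBuggy(last: Progress): Progress =
    last.copy(sources = last.sources.map(_.copy(numInputRows = 0L)))

  // Behavior described as the fix: the output count is reset as well.
  def idleProgressFixed(last: Progress): Progress =
    idleProgressBuggy(last).copy(sink = last.sink.copy(numOutputRows = 0L))
}
```

With a last executed batch that wrote 5 rows, the first variant would keep reporting 5 output rows on every idle update, while the second reports 0.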

Why are the changes needed?

Fixes a bug around incorrect metrics

Does this PR introduce any user-facing change?

Fixes a bug in the metrics

How was this patch tested?

New regression test

brkyvz · 2020-03-26T19:41:34Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala

      observedMetrics = new java.util.HashMap(observedMetrics.asJava))

-    if (hasNewData) {
+    if (hasExecuted) {


This was also incorrect for no new data micro batches

Nice finding. We hadn't noticed the bug because lastNoDataProgressEventTime is set to Long.MinValue, which makes the next no-new-data micro-batch update the progress immediately and so hides the bug. (If that's intentional, it's too subtle and we should have left a comment here.)

Maybe we should also rename lastNoDataProgressEventTime, since the fix changes its semantics?

And we may want to revisit whether we really intend to update progress immediately whenever a batch does not run right after a batch that did run.

oh, that's why I'm facing issues... I understand better now

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala

...ore/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryStatusAndProgressSuite.scala

SparkQA · 2020-03-26T21:40:40Z

Test build #120438 has finished for PR 28040 at commit 7def4de.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala

SparkQA · 2020-03-26T23:47:57Z

Test build #120442 has finished for PR 28040 at commit b57aa74.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

brkyvz · 2020-03-27T01:02:31Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

+    val clock = new StreamManualClock()
+
    testStream(aggWithWatermark)(
+      StartStream(Trigger.ProcessingTime("interval 1 second"), clock),


@tdas This test tests append mode output for no-data microbatches

brkyvz · 2020-03-27T01:03:00Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

      }

      def stateOperatorProgresses: Seq[StateOperatorProgress] = {
-        val operatorProgress = mutable.ArrayBuffer[StateOperatorProgress]()


I didn't find this code readable, so I removed the hack here.

HeartSaVioR · 2020-03-27T01:25:38Z

I raised a similar issue (a different bug, but the approach to resolving it would be similar) in #25987 in Oct. 2019. It didn't get much attention. Could we please revisit it as well? Thanks in advance.

SparkQA · 2020-03-27T04:57:44Z

Test build #120444 has finished for PR 28040 at commit 4655611.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

HeartSaVioR · 2020-03-27T08:10:35Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSinkSuite.scala


      addTimestamp(104, 123) // watermark = 90 before this, watermark = 123 - 10 = 113 after this
      check((100L, 105L) -> 2L)  // no-data-batch emits results on 100-105,
+      assert(query.lastProgress.sink.numOutputRows === 1)


The last progress here is for "no data & no run", for the reason I commented on earlier. That's why the test fails.

Actually, no. It seems FileStreamSink is a V1 sink and doesn't support output metrics.

The value is -1 rather than 0 when the sink doesn't support output metrics. As you can see from the error message in the Jenkins build, the value here is 0 instead of -1, because the patch overwrites the value with 0 when the batch hasn't run. So yes, the last progress here is for "no data & no run", though the new commit should fix this problem.

V1 suite

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:08.567Z", "batchId" : 0, "numInputRows" : 1, "inputRowsPerSecond" : 83.33333333333333, "processedRowsPerSecond" : 0.3835826620636747, "durationMs" : { "addBatch" : 2055, "getBatch" : 2, "latestOffset" : 0, "queryPlanning" : 449, "triggerExecution" : 2607, "walCommit" : 49 }, "eventTime" : { "avg" : "1970-01-01T00:01:40.000Z", "max" : "1970-01-01T00:01:40.000Z", "min" : "1970-01-01T00:01:40.000Z", "watermark" : "1970-01-01T00:00:00.000Z" }, "stateOperators" : [ { "numRowsTotal" : 1, "numRowsUpdated" : 1, "memoryUsedBytes" : 1400, "customMetrics" : { "loadedMapCacheHitCount" : 0, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 680 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : null, "endOffset" : 0, "numInputRows" : 1, "inputRowsPerSecond" : 83.33333333333333, "processedRowsPerSecond" : 0.3835826620636747 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : -1 } }

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:11.185Z", "batchId" : 1, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0, "durationMs" : { "addBatch" : 935, "getBatch" : 0, "latestOffset" : 0, "queryPlanning" : 52, "triggerExecution" : 1101, "walCommit" : 70 }, "eventTime" : { "watermark" : "1970-01-01T00:01:30.000Z" }, "stateOperators" : [ { "numRowsTotal" : 1, "numRowsUpdated" : 0, "memoryUsedBytes" : 2272, "customMetrics" : { "loadedMapCacheHitCount" : 10, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 720 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : 0, "endOffset" : 0, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : -1 } }

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:12.287Z", "batchId" : 2, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0, "durationMs" : { "latestOffset" : 0, "triggerExecution" : 0 }, "eventTime" : { "watermark" : "1970-01-01T00:01:30.000Z" }, "stateOperators" : [ { "numRowsTotal" : 1, "numRowsUpdated" : 0, "memoryUsedBytes" : 2272, "customMetrics" : { "loadedMapCacheHitCount" : 10, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 720 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : 0, "endOffset" : 0, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : 0 } }

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:13.066Z", "batchId" : 2, "numInputRows" : 2, "inputRowsPerSecond" : 153.84615384615384, "processedRowsPerSecond" : 3.2258064516129035, "durationMs" : { "addBatch" : 482, "getBatch" : 0, "latestOffset" : 0, "queryPlanning" : 50, "triggerExecution" : 620, "walCommit" : 44 }, "eventTime" : { "avg" : "1970-01-01T00:01:53.500Z", "max" : "1970-01-01T00:02:03.000Z", "min" : "1970-01-01T00:01:44.000Z", "watermark" : "1970-01-01T00:01:30.000Z" }, "stateOperators" : [ { "numRowsTotal" : 2, "numRowsUpdated" : 2, "memoryUsedBytes" : 2584, "customMetrics" : { "loadedMapCacheHitCount" : 20, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 920 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : 0, "endOffset" : 1, "numInputRows" : 2, "inputRowsPerSecond" : 153.84615384615384, "processedRowsPerSecond" : 3.2258064516129035 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : -1 } }

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:13.688Z", "batchId" : 3, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0, "durationMs" : { "addBatch" : 987, "getBatch" : 0, "latestOffset" : 0, "queryPlanning" : 43, "triggerExecution" : 1117, "walCommit" : 44 }, "eventTime" : { "watermark" : "1970-01-01T00:01:53.000Z" }, "stateOperators" : [ { "numRowsTotal" : 1, "numRowsUpdated" : 0, "memoryUsedBytes" : 2512, "customMetrics" : { "loadedMapCacheHitCount" : 30, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 720 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : 1, "endOffset" : 1, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : -1 } }

{ "id" : "1bbf91ac-0a24-4da7-bfe8-54ce1ac63e0f", "runId" : "ac738c58-a976-4c87-9547-b4a1ee1c2560", "name" : null, "timestamp" : "2020-03-28T23:33:14.806Z", "batchId" : 4, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0, "durationMs" : { "latestOffset" : 0, "triggerExecution" : 0 }, "eventTime" : { "watermark" : "1970-01-01T00:01:53.000Z" }, "stateOperators" : [ { "numRowsTotal" : 1, "numRowsUpdated" : 0, "memoryUsedBytes" : 2512, "customMetrics" : { "loadedMapCacheHitCount" : 30, "loadedMapCacheMissCount" : 0, "stateOnCurrentVersionSizeBytes" : 720 } } ], "sources" : [ { "description" : "MemoryStream[value#1L]", "startOffset" : 1, "endOffset" : 1, "numInputRows" : 0, "inputRowsPerSecond" : 0.0, "processedRowsPerSecond" : 0.0 } ], "sink" : { "description" : "FileSink[/private/var/folders/wn/3hpqx8015hjbmq43hmrw78z40000gn/T/stream.output-cf800c40-1e18-405e-b48f-71a08348a298]", "numOutputRows" : 0 } }

That said, we may not want to overwrite the value with 0 if it is negative. It would be odd for the value to be -1 because the sink doesn't support numOutputRows, yet sometimes show up as 0.
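A small sketch of that suggestion, assuming only what is stated in this thread (that -1 marks a sink without output metrics). This is not the code in the PR; `SinkMetricReset` and `resetForIdleProgress` are hypothetical names.

```scala
object SinkMetricReset {
  // -1 is the "output metrics not supported" value discussed above.
  val Unsupported: Long = -1L

  // Value to report for a progress update in which no batch ran:
  // keep a negative sentinel untouched, otherwise report zero rows.
  def resetForIdleProgress(lastNumOutputRows: Long): Long =
    if (lastNumOutputRows < 0L) lastNumOutputRows else 0L
}
```

With this rule, a sink that reports -1 for executed batches would also report -1 for "no data & no run" updates, instead of alternating between -1 and 0 as in the output above.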

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

brkyvz · 2020-03-27T17:38:31Z

@HeartSaVioR Thanks for bringing that PR to my attention. We should get that in as well! Would you like to take over both?

HeartSaVioR · 2020-03-27T22:44:24Z

I'm not sure I get it. My PR fixes a different bug, so while the two may conflict, each is valid on its own. Why not deal with this PR (or my PR) quickly, rebase, and then deal with the other? I think this PR is getting close to merge.

SparkQA · 2020-03-28T22:46:42Z

Test build #120539 has finished for PR 28040 at commit 62044dc.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala

HeartSaVioR · 2020-03-28T23:57:03Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 1 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 1 },
      AddData(inputData, 10, 12, 14),
+      AdvanceManualClock(1000L), // watermark = 5, runs no-data microbatch


This surprises me, although it's not directly related to this PR, so treat it as OFF-TOPIC.

Based on the test, it sounds like we need to wait for the next trigger interval to run the no-data microbatch, and that the no-data microbatch runs even when input is available; the input is then handled in the next trigger.

My expectation was that the no-data microbatch would be consolidated with the data microbatch if input is available. And ideally, the no-data microbatch should not have to wait for the next trigger interval.

I agree. I think this is because of manual clock synchronization: the next batch is already planned before the data appears.

HeartSaVioR · 2020-03-29T00:05:11Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

+      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 1 },
+      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 3 },
+      AssertOnQuery { _.lastProgress.sink.numOutputRows == 0 },
+      AdvanceManualClock(1000L), // runs batch with no new data and watermark progresses


This is also hard to understand (it takes two trigger intervals instead of one, ideally zero, to run the no-data microbatch), but yes, this is OFF-TOPIC.

This is actually not off-topic, because (i) not understanding this correctly can lead to flaky tests, and (ii) I was afraid that the fixes made in this PR had changed the semantic behavior of no-data batches. That is not the case; I tested it in this unit test myself. I think all the confusion starts from the fact that you don't need to advance the manual clock after StartStream to trigger the first batch, so the first AdvanceManualClock is not really necessary. What it actually does is advance the clock, which allows the 2nd batch to be triggered automatically as soon as the first batch finishes. This is what leads to the confusion about why the second batch does not pick up the new data: the next batch has already been unblocked (i.e., before AddData(10, 12, 14)) by the first AdvanceManualClock. This weird asynchrony despite using the manual clock makes the test incomprehensible and is also a perfect recipe for flakiness.

@brkyvz here is my proposed test. @HeartSaVioR please take a look and see whether this is more understandable.

    testStream(aggWithWatermark)(
      // batchId 0
      AddData(inputData, 15),
      StartStream(Trigger.ProcessingTime("interval 1 second"), clock),
      CheckAnswer(), // watermark = 0
      AssertOnQuery { _.stateNodes.size === 1 },
      AssertOnQuery { _.stateNodes.head.metrics("numOutputRows").value === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 1 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 1 },
      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },

      // batchId 1 without data
      AdvanceManualClock(1000L), // watermark = 5
      Execute { q =>             // wait for the no data batch to complete
        eventually(timeout(streamingTimeout)) { assert(q.lastProgress.batchId === 1) }
      },
      AssertOnQuery { _.stateNodes.head.metrics("numOutputRows").value === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 1 },
      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },

      // batchId 2 with data
      AddData(inputData, 10, 12, 14),
      AdvanceManualClock(1000L), // watermark = 5
      CheckAnswer(),
      AssertOnQuery { _.stateNodes.head.metrics("numOutputRows").value === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 1 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 2 },
      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },

      // batchId 3 with data
      AddData(inputData, 25),
      AdvanceManualClock(1000L), // watermark = 5
      CheckAnswer(),
      AssertOnQuery { _.stateNodes.head.metrics("numOutputRows").value === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 1 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 3 },
      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },

      // batchId 4 without data
      AdvanceManualClock(1000L), // watermark = 15
      Execute { q =>             // wait for the no data batch to complete
        eventually(timeout(streamingTimeout)) { assert(q.lastProgress.batchId === 4) }
      },
      AssertOnQuery { _.stateNodes.head.metrics("numOutputRows").value === 1 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsUpdated === 0 },
      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 2 },
      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 1 }
    )
  }

@tdas

> I think all the confusion is starting from the fact that you don't need to advance manual clock after StartStream to trigger the first batch.
> Rather what it is doing is advancing the clock thus allowing the 2nd batch to be automatically triggered as soon as the first batch finishes.
> This weird asynchronousness despite using the manual clock makes the test incomprehensible and is also a perfect recipe for flakiness.

Ah, nice finding. Great analysis. That's what I missed (and it's very confusing behavior, TBH). The proposal looks great and is easier to understand. I have comments on the new proposal, but since it is already reflected in the PR, I'll comment directly on the PR.

HeartSaVioR · 2020-03-29T00:09:53Z

Looks like the change had a side effect and the build failure is related. Could you please fix these tests as well?

brkyvz · 2020-03-29T03:09:44Z

I reverted the changes with respect to the empty progress. Seemed to be a bit more risky than I'd like, as I'd like to warmfix this into Spark 3.0

SparkQA · 2020-03-29T07:05:01Z

Test build #120544 has finished for PR 28040 at commit 5fbbf41.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

HeartSaVioR · 2020-03-29T08:32:09Z

Retest this, please

SparkQA · 2020-03-29T12:41:42Z

Test build #120553 has finished for PR 28040 at commit 5fbbf41.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HeartSaVioR · 2020-04-02T00:47:32Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },
      AddData(inputData, 10, 12, 14),
-      AdvanceManualClock(1000L), // watermark = 5, runs with the just added data
+      AdvanceManualClock(1000L), // watermark = 0, runs with the just added data


Let's just remove the watermark from the comment here, as you've done for the later AdvanceManualClock calls.

SparkQA · 2020-04-02T01:30:06Z

Test build #120702 has finished for PR 28040 at commit 68848d9.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

brkyvz · 2020-04-02T01:36:27Z

retest this please

SparkQA · 2020-04-02T03:33:33Z

Test build #120701 has finished for PR 28040 at commit d501127.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

HeartSaVioR · 2020-04-02T04:00:21Z

Looks like the flaky test is below (and this PR made it flaky), not in streaming aggregation.

org.apache.spark.sql.kafka010.KafkaSinkMicroBatchStreamingSuite.streaming - sink progress is produced

SparkQA · 2020-04-02T05:58:50Z

Test build #120703 has finished for PR 28040 at commit 68848d9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-04-03T05:19:27Z

Test build #120743 has finished for PR 28040 at commit 2c9ed55.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

brkyvz · 2020-04-04T20:15:20Z

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120743/testReport/ passed without flakiness

SparkQA · 2020-04-05T00:29:04Z

Test build #120814 has finished for PR 28040 at commit 786d921.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2020-04-07T07:54:00Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

+      AssertOnQuery { _.stateOperatorProgresses.head.numRowsTotal === 3 },
+      AssertOnQuery { _.lastExecutedBatch.sink.numOutputRows == 0 },
+      AdvanceManualClock(1000L), // runs batch with no new data and watermark progresses
+      CheckAnswer(), // watermark = 15, but nothing yet


I feel like this will be flaky. CheckAnswer() works reliably only when there is new data to process, because it waits for the new data's offset to be reported as processed. Here there is no new data in the no-data batch, so it's possible that this CheckAnswer won't wait for the no-data batch to finish before the last-progress checks start.

Instead, it's (probably) more reliable to use eventually, where you check that lastProgress has the expected batchId.
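To illustrate the idea of polling for the expected batchId instead of relying on CheckAnswer(), here is a dependency-free sketch. It is a stand-in for ScalaTest's `eventually`, not the helper used in the Spark test suites; `WaitFor.batchId` and its parameters are hypothetical.

```scala
object WaitFor {
  // Polls `current` until it returns `expected` or `timeoutMs` elapses.
  // Returns true if the expected batch id was observed in time.
  def batchId(expected: Long, timeoutMs: Long, pollMs: Long = 10L)(current: () => Long): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var seen = current()
    while (seen != expected && System.nanoTime() < deadline) {
      Thread.sleep(pollMs)
      seen = current()
    }
    seen == expected
  }
}
```

In a test, `current` would read something like the query's last reported batch id, and the metric assertions would run only after the wait succeeds, so they cannot race with a no-data batch that is still in flight.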

SparkQA · 2020-04-07T19:33:35Z

Test build #120931 has finished for PR 28040 at commit 68bf147.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

brkyvz · 2020-04-07T20:24:16Z

Unrelated flaky test: org.apache.spark.sql.hive.thriftserver.CliSuite.SPARK-11188 Analysis error reporting

brkyvz · 2020-04-07T20:24:22Z

retest this please

HeartSaVioR · 2020-04-07T22:50:31Z

sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

+
+      // batchId 1 without data
+      AdvanceManualClock(1000L), // watermark = 5
+      Execute { q =>             // wait for the no data batch to complete


(Good to have) It might be good to have a new operation that consolidates waiting for the "no data batch" and checking the answer (they follow the same pattern except for the desired batch ID).

Not mandatory to do it in this PR.

yeah, it'd be nice to provide an inbuilt function for it if this pattern is used more over time

tdas · 2020-04-08T00:06:50Z

LGTM for this PR. @brkyvz feel free to merge it.

brkyvz · 2020-04-08T00:17:01Z

Thanks @HeartSaVioR @tdas for the review. Merging to master/3.0. Let's jump on #25987 next.

### What changes were proposed in this pull request?

In Structured Streaming, we provide progress updates every 10 seconds when a stream doesn't have any new data upstream. When providing this progress though, we zero out the input information but not the output information. This PR fixes that bug.

### Why are the changes needed?

Fixes a bug around incorrect metrics

### Does this PR introduce any user-facing change?

Fixes a bug in the metrics

### How was this patch tested?

New regression test

Closes #28040 from brkyvz/sinkMetrics.

Lead-authored-by: Burak Yavuz <[email protected]>
Co-authored-by: Burak Yavuz <[email protected]>
Signed-off-by: Burak Yavuz <[email protected]>
(cherry picked from commit 8ab2a0c)
Signed-off-by: Burak Yavuz <[email protected]>

SparkQA · 2020-04-08T00:59:28Z

Test build #120932 has finished for PR 28040 at commit 68bf147.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.


jainshashank24 · 2020-09-06T06:02:28Z

Hi @HeartSaVioR, regarding the above PR, I don't understand the -1 value shown for numOutputRows with the ElasticSearch sink.
What does -1 represent in numOutputRows?
Does it mean the metric isn't supported for the ES sink, or is there a bug?

I can see that the sinkProgress description says:
"@param numOutputRows Number of rows written to the sink or -1 for Continuous Mode (temporarily) or Sink V1 (until decommissioned)."

So does the ElasticSearch sink fall under Sink V1?

Fix StreamingQuery output rows metric (7def4de)

brkyvz commented Mar 26, 2020

tdas reviewed Mar 26, 2020: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala (outdated)

tdas reviewed Mar 26, 2020: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala

tdas reviewed Mar 26, 2020: ...ore/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryStatusAndProgressSuite.scala

add more tests (b57aa74)

tdas reviewed Mar 26, 2020: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala

fix broken test (4655611)

brkyvz commented Mar 27, 2020

HeartSaVioR reviewed Mar 27, 2020: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala

brkyvz added 2 commits March 28, 2020 14:21: Fix tests (6e1f90f), rename progress variable (62044dc)

HeartSaVioR reviewed Mar 28, 2020: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala

HeartSaVioR reviewed Mar 28, 2020

HeartSaVioR reviewed Mar 29, 2020

revert changes (5fbbf41)

fix comment (68848d9)

HeartSaVioR reviewed Apr 2, 2020

brkyvz added 2 commits April 2, 2020 17:58: Change test to avoid no-data microbatches (1cf1a3c), make test more robust (2c9ed55)

brkyvz added 2 commits April 4, 2020 13:14: remove 50 retries (3822aed), Update StreamingAggregationSuite.scala (786d921)

brkyvz changed the title from "[DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric" to "[SPARK-31278][SS] Fix StreamingQuery output rows metric" on Apr 4, 2020

tdas reviewed Apr 7, 2020

brkyvz added 2 commits April 7, 2020 11:31: Save (9bd4445), Merge branch 'sinkMetrics' of github.com:brkyvz/spark into sinkMetrics (68bf147)

HeartSaVioR reviewed Apr 7, 2020

asfgit closed this in 8ab2a0c Apr 8, 2020

HeartSaVioR mentioned this pull request Apr 29, 2020

[SPARK-31593][SS] Remove unnecessary streaming query progress update #28391

Closed
