[SPARK-23887][SS] continuous query progress reporting #24537
Test build #105155 has finished for PR 24537 at commit
Test build #105187 has finished for PR 24537 at commit
reportTimeTaken("walCommit", epoch) {
  synchronized {
    // Record offsets before updating `committedOffsets`
    recordTriggerOffsets(from = committedOffsets, to = availableOffsets, epoch)
availableOffsets never seems to be updated in continuous processing; maybe we should take this into account when reporting metrics.
  currentTriggerEndOffsets.remove(earliestEpochId)
}

earliestEpochId += 1
It looks like earliestEpochId is incremented even when no data is removed from the metrics maps; this may mean some old entries never get a chance to be removed.
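The concern above can be illustrated with a minimal standalone sketch. This is not the PR's code: `EpochMetrics`, `record`, `retainedEpochs`, and `earliest` are hypothetical names, and plain mutable maps stand in for the reporter's per-epoch metrics maps. The idea is to move the earliest-epoch cursor only past epochs whose entries were actually dropped, so that no retained entry is skipped.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the per-epoch metrics maps kept by a progress reporter.
final class EpochMetrics(maxRetained: Int) {
  private val startOffsets = mutable.Map.empty[Long, String]
  private val endOffsets = mutable.Map.empty[Long, String]
  private var earliestEpochId: Long = 0L

  def record(epoch: Long, start: String, end: String): Unit = {
    startOffsets(epoch) = start
    endOffsets(epoch) = end
    evict()
  }

  // Drop the oldest epochs while over the limit. The cursor advances only
  // when an epoch's entries were really removed from both maps.
  private def evict(): Unit = {
    while (startOffsets.size > maxRetained) {
      val oldest = startOffsets.keys.min
      startOffsets.remove(oldest)
      endOffsets.remove(oldest)
      earliestEpochId = oldest + 1
    }
  }

  def retainedEpochs: Seq[Long] = startOffsets.keys.toSeq.sorted
  def earliest: Long = earliestEpochId
}
```

With `maxRetained = 2`, recording epochs 0 through 4 leaves only epochs 3 and 4 in the maps, and the cursor points at 3.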
val currentTriggerStartTimestamp = currentDurationsMs(epochId)._1
val currentTriggerEndTimestamp = triggerClock.getTimeMillis()

val executionStats = extractExecutionStats(hasNewData, Some(epochStats))
Since the original extractExecutionStats method collects information from the SQL metrics (e.g. state operator metrics and watermark), which for now are reported only when a task ends, is there a plan to handle this part of the data?
We're closing this PR because it hasn't been updated in a while. If you'd like to revive this PR, please reopen it!
What changes were proposed in this pull request?
Enable query progress reporting in continuous mode.
MicroBatchProgressReporter and ContinuousProgressReporter
queryPlanning and walCommit. In particular, queryPlanning is static and won't change from epoch to epoch; it happens only once, at the beginning of the query.
How was this patch tested?
Updated existing unit tests.
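
As a rough illustration of the queryPlanning and walCommit durations described in the PR summary, the sketch below keeps one duration that is measured once and reused for every epoch, plus durations recorded per epoch. It is an assumption-laden sketch, not the PR's ContinuousProgressReporter: the class `EpochDurations`, the `reportStatic` method, and the injected `clock` function are hypothetical, and only the `reportTimeTaken(name, epoch) { ... }` call shape is taken from the diff excerpt in the review.

```scala
import scala.collection.mutable

// Hypothetical duration tracker; `clock` returns the current time in milliseconds.
final class EpochDurations(clock: () => Long) {
  // Durations measured once for the whole query (e.g. "queryPlanning").
  private val staticMs = mutable.Map.empty[String, Long]
  // Durations measured separately for each epoch (e.g. "walCommit").
  private val perEpochMs = mutable.Map.empty[Long, mutable.Map[String, Long]]

  // Run `body`, then hand the elapsed time to `store`, even if `body` throws.
  private def timed[T](body: => T)(store: Long => Unit): T = {
    val start = clock()
    try body finally store(clock() - start)
  }

  def reportStatic[T](name: String)(body: => T): T =
    timed(body)(ms => staticMs.update(name, ms))

  def reportTimeTaken[T](name: String, epoch: Long)(body: => T): T =
    timed(body) { ms =>
      val durations = perEpochMs.getOrElseUpdate(epoch, mutable.Map.empty[String, Long])
      durations.update(name, durations.getOrElse(name, 0L) + ms)
    }

  // The static durations are repeated unchanged in every epoch's report.
  def durationMs(epoch: Long): Map[String, Long] =
    staticMs.toMap ++ perEpochMs.getOrElse(epoch, mutable.Map.empty[String, Long])
}
```

Injecting the clock keeps the sketch testable with a fake time source; a real reporter would presumably read its trigger clock instead.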