-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Remove code for collecting SplitCompletedEvent on workers #26436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove code for collecting SplitCompletedEvent on workers #26436
Conversation
7838114 to
43012b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
43012b4 to
c6e1991
Compare
wendigo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is backward-incompatible change but since these events were not published for over a year now and no one until now complained, I'm fine with removing the leftovers and recommending to use the QueryCompletedEvent as a source of this information instead.
c6e1991 to
5d52e74
Compare
dain
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me
JFYI we're not breaking SPI compatiblity yet to make the transition easier. Its marked for removal and will be fully deleted in a future release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR removes support for SplitCompletedEvent collection and event listener notifications throughout the Trino codebase. The removal is driven by the fact that split completion events are no longer being populated after a previous change, and detailed source stage metrics are already available through QueryStatistics#getOperatorSummaries. The collection of individual split events is deemed unnecessary, especially for data lake connectors that generate thousands of splits per query.
- Deprecates and removes
SplitCompletedEvent,SplitStatistics, andSplitFailureInfoclasses - Removes
EventListener#splitCompletedmethod implementation across all event listener plugins - Eliminates
SplitMonitorclass and related infrastructure for tracking split completion events
Reviewed Changes
Copilot reviewed 42 out of 42 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| core/trino-spi/src/main/java/io/trino/spi/eventlistener/*.java | Deprecates split-related event classes and modifies EventListener interface |
| core/trino-main/src/main/java/io/trino/event/SplitMonitor.java | Removes entire SplitMonitor class |
| core/trino-main/src/main/java/io/trino/execution/*.java | Removes split monitoring integration from task execution |
| plugin/trino-kafka-event-listener/src/main/java/io/trino/plugin/eventlistener/kafka/*.java | Removes split completion event support from Kafka event listener |
| plugin/trino-http-event-listener/src/main/java/io/trino/plugin/httpquery/*.java | Removes split completion event support from HTTP event listener |
| testing/trino-tests/src/test/java/io/trino/execution/*.java | Removes split event testing infrastructure |
| docs/src/main/sphinx/admin/*.md | Updates documentation to remove split event configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Description
After #22879 SplitCompletedEvent are no longer being populated.
We already get detailed source stage metrics through
io.trino.spi.eventlistener.QueryStatistics#getOperatorSummaries. These also contain histograms for scheduled and CPU time which are useful for assessing skews in processing of splits.The collection of individual split events seems overkill and unnecessary to maintain at this point, especially with data lake connectors where we're taking about thousands of splits per query.
This PR removes support for
io.trino.spi.eventlistener.EventListener#splitCompletedaltogether and makes it clear that users need to rely on the information already present inio.trino.spi.eventlistener.EventListener#queryCompletedinstead. Since the coordinator generates splits and and knows when they are finished, it can provide additional metrics to EventListener in future, if needed, rather than having all workers send events.Additional context and related issues
Supercedes #26425
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: