@raunaqmorarka
Member

@raunaqmorarka raunaqmorarka commented Aug 18, 2025

Description

After #22879, SplitCompletedEvent events are no longer populated.
We already get detailed source stage metrics through io.trino.spi.eventlistener.QueryStatistics#getOperatorSummaries. These also contain histograms for scheduled and CPU time which are useful for assessing skews in processing of splits.
The collection of individual split events seems like overkill and unnecessary to maintain at this point, especially with data lake connectors where we're talking about thousands of splits per query.
This PR removes support for io.trino.spi.eventlistener.EventListener#splitCompleted altogether and makes it clear that users need to rely on the information already present in io.trino.spi.eventlistener.EventListener#queryCompleted instead. Since the coordinator generates splits and knows when they are finished, it can provide additional metrics to EventListener in the future, if needed, rather than having all workers send events.
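
For listener authors moving off splitCompleted, the approach described above could look roughly like the sketch below. It is a minimal, self-contained illustration and not actual plugin code: `Statistics`, `CompletedEvent`, and `Listener` are hypothetical stand-ins for the SPI's QueryStatistics, QueryCompletedEvent, and EventListener so that it compiles without the Trino SPI on the classpath, and the summary strings are placeholders. It assumes the operator summaries arrive as a list of opaque strings and does not try to parse them.

```java
import java.util.ArrayList;
import java.util.List;

class SplitEventMigrationSketch {
    // Hypothetical stand-in for io.trino.spi.eventlistener.QueryStatistics.
    interface Statistics {
        List<String> getOperatorSummaries();
    }

    // Hypothetical stand-in for io.trino.spi.eventlistener.QueryCompletedEvent.
    interface CompletedEvent {
        Statistics getStatistics();
    }

    // Hypothetical stand-in for io.trino.spi.eventlistener.EventListener.
    // Note there is no splitCompleted callback to override any more.
    interface Listener {
        default void queryCompleted(CompletedEvent event) {}
    }

    // Collects per-operator summaries once per query instead of once per split.
    static class OperatorSummaryListener implements Listener {
        final List<String> collected = new ArrayList<>();

        @Override
        public void queryCompleted(CompletedEvent event) {
            // Treated as opaque strings here; a real listener would forward or parse them.
            collected.addAll(event.getStatistics().getOperatorSummaries());
        }
    }

    public static void main(String[] args) {
        // Placeholder summaries; real values come from the engine.
        Statistics stats = () -> List.of("<operator summary 1>", "<operator summary 2>");
        CompletedEvent event = () -> stats;

        OperatorSummaryListener listener = new OperatorSummaryListener();
        listener.queryCompleted(event);

        System.out.println("collected " + listener.collected.size() + " operator summaries");
    }
}
```

The point of the sketch is only that one queryCompleted callback per query replaces the per-split callbacks, so anything previously derived from split events has to be read from the query-level statistics.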

Additional context and related issues

Supersedes #26425

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Event listener
* Remove support for `io.trino.spi.eventlistener.EventListener#splitCompleted`. ({issue}`26436`)

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@raunaqmorarka raunaqmorarka requested a review from Copilot August 18, 2025 15:48
@raunaqmorarka raunaqmorarka force-pushed the remove-split-completed branch from 43012b4 to c6e1991 Compare August 18, 2025 15:52

Contributor

@wendigo wendigo left a comment

This is a backward-incompatible change, but since these events have not been published for over a year and no one has complained until now, I'm fine with removing the leftovers and recommending QueryCompletedEvent as the source of this information instead.

@raunaqmorarka raunaqmorarka force-pushed the remove-split-completed branch from c6e1991 to 5d52e74 Compare August 18, 2025 16:26
Member

@dain dain left a comment

Seems reasonable to me

@raunaqmorarka
Member Author

This is a backward-incompatible change, but since these events have not been published for over a year and no one has complained until now, I'm fine with removing the leftovers and recommending QueryCompletedEvent as the source of this information instead.

JFYI, we're not breaking SPI compatibility yet, to make the transition easier. It's marked for removal and will be fully deleted in a future release.

@wendigo
Contributor

wendigo commented Aug 18, 2025

@raunaqmorarka :shipit:

@raunaqmorarka raunaqmorarka requested a review from Copilot August 19, 2025 04:53

Copilot AI left a comment

Pull Request Overview

This PR removes support for SplitCompletedEvent collection and event listener notifications throughout the Trino codebase. The removal is driven by the fact that split completion events are no longer being populated after a previous change, and detailed source stage metrics are already available through QueryStatistics#getOperatorSummaries. The collection of individual split events is deemed unnecessary, especially for data lake connectors that generate thousands of splits per query.

  • Deprecates and removes SplitCompletedEvent, SplitStatistics, and SplitFailureInfo classes
  • Removes EventListener#splitCompleted method implementation across all event listener plugins
  • Eliminates SplitMonitor class and related infrastructure for tracking split completion events

Reviewed Changes

Copilot reviewed 42 out of 42 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `core/trino-spi/src/main/java/io/trino/spi/eventlistener/*.java` | Deprecates split-related event classes and modifies EventListener interface |
| `core/trino-main/src/main/java/io/trino/event/SplitMonitor.java` | Removes entire SplitMonitor class |
| `core/trino-main/src/main/java/io/trino/execution/*.java` | Removes split monitoring integration from task execution |
| `plugin/trino-kafka-event-listener/src/main/java/io/trino/plugin/eventlistener/kafka/*.java` | Removes split completion event support from Kafka event listener |
| `plugin/trino-http-event-listener/src/main/java/io/trino/plugin/httpquery/*.java` | Removes split completion event support from HTTP event listener |
| `testing/trino-tests/src/test/java/io/trino/execution/*.java` | Removes split event testing infrastructure |
| `docs/src/main/sphinx/admin/*.md` | Updates documentation to remove split event configuration |

@raunaqmorarka raunaqmorarka merged commit 83c53c2 into trinodb:master Aug 19, 2025
192 of 193 checks passed
@raunaqmorarka raunaqmorarka deleted the remove-split-completed branch August 19, 2025 04:55
@github-actions github-actions bot added this to the 477 milestone Aug 19, 2025