Skip to content

Conversation

@dillitz
Copy link
Contributor

@dillitz dillitz commented Oct 10, 2023

What changes were proposed in this pull request?

This PR adds an optional ExecuteHolder to SparkConnectPlanner.
This allows plugins to access the ExecuteHolder to for example create a QueryPlanningTracker without the need to create a new CommandPlugin interface like proposed in #42984.

Why are the changes needed?

There is currently no way to track queries executed by a CommandPlugin. For this, the plugin needs access to the ExecuteHolder.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Adjusted existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

Comment on lines 2476 to +2478
def process(
command: proto.Command,
responseObserver: StreamObserver[ExecutePlanResponse],
executeHolder: ExecuteHolder): Unit = {
responseObserver: StreamObserver[ExecutePlanResponse]): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, this shouldn't be an issue because it's not really a public API. but it's not backwards compatible.

This is particularly interesting because you went through some length to fix this problem in the constructor though.

@hvanhovell Do we consider the SparkConenct planner to be public? Probably, not correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, between Spark 3.4 and Spark 3.5, these interfaces also changed.
For example #41618 made it take a SessionHolder instead of SparkSession, so that various parameters don't have to be passed to it lossely. This PR adding ExecuteHolder to it is in a very similar spirit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think we should. In the end that will hamper our ability to evolve the interface.

class SparkConnectPlanner(val sessionHolder: SessionHolder) extends Logging {
class SparkConnectPlanner(
val sessionHolder: SessionHolder,
val executeHolderOpt: Option[ExecuteHolder] = None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just pass in the ExecuteHolder, and create a getter for SessionHolder? IMO it is a bit weird to create this structure for just a single unit test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's used in AnalyzePlan that doesn't have an ExecuteHolder

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense.

Copy link
Contributor

@hvanhovell hvanhovell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - we can address the constructor comment in a follow-up.

@hvanhovell
Copy link
Contributor

Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants