Pull request: Closed. Changes from 6 commits.
@@ -19,9 +19,9 @@ package org.apache.spark.sql.catalyst

import scala.collection.JavaConverters._

import org.apache.spark.internal.Logging
import org.apache.spark.util.BoundedPriorityQueue


Contributor: no need to remove this line

Contributor Author: thanks, updated

/**
* A simple utility for tracking runtime and associated stats in query planning.
*
@@ -90,7 +90,7 @@ object QueryPlanningTracker {
}


class QueryPlanningTracker {
class QueryPlanningTracker extends Logging {
Contributor: This is for third-party tools to track the query status... If we want Spark to show these metrics out of the box, I don't think a log is the right place. The Web UI is probably better, but we need some design.

Contributor Author: It seems like a good idea; I'll try to fix it.

Contributor Author (@caican00, Mar 14, 2022): @cloud-fan Hi, could you help review this patch again? Thanks.

import QueryPlanningTracker._

@@ -120,6 +120,24 @@ class QueryPlanningTracker {
ret
}

/**
 * Log the time spent in each phase of query planning, plus the total.
 */
def logTimeSpent(): Unit = {
var totalTimeSpent = 0L
val timeSpentSummary = new StringBuilder()
Contributor: Maybe StringBuilder is ok

Contributor Author: It seems so. thanks, updated

Seq(QueryPlanningTracker.PARSING, QueryPlanningTracker.ANALYSIS,
QueryPlanningTracker.OPTIMIZATION, QueryPlanningTracker.PLANNING).foreach { phase =>
val duration = phasesMap.getOrDefault(phase, new PhaseSummary(-1, -1)).durationMs
timeSpentSummary.append(s"phase: $phase, timeSpent: $duration ms\n")
totalTimeSpent += duration
}
logInfo(
s"""Query planning time spent:\n ${timeSpentSummary.toString}
|Total time spent: $totalTimeSpent ms.
""".stripMargin)
}
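The loop above walks a fixed sequence of phases, treats a phase missing from the map as zero duration, and accumulates a total. A standalone sketch of that summary-building logic (note: `PhaseSummary` here is a simplified stand-in for Spark's internal class, and the phase names are illustrative string keys, not Spark's actual constants):

```scala
// Hedged, self-contained sketch of how logTimeSpent assembles its summary.
object TimeSpentSketch {
  // Simplified stand-in for Spark's PhaseSummary: duration is end minus start.
  final case class PhaseSummary(startTimeMs: Long, endTimeMs: Long) {
    def durationMs: Long = endTimeMs - startTimeMs
  }

  // Fixed phase order, mirroring the Seq(...) in the diff.
  val phases: Seq[String] = Seq("parsing", "analysis", "optimization", "planning")

  // Builds the per-phase lines and the running total; phases that were never
  // tracked contribute a zero duration rather than being skipped.
  def summarize(tracked: Map[String, PhaseSummary]): (String, Long) = {
    val sb = new StringBuilder()
    var total = 0L
    phases.foreach { phase =>
      val duration = tracked.get(phase).map(_.durationMs).getOrElse(0L)
      sb.append(s"phase: $phase, timeSpent: $duration ms\n")
      total += duration
    }
    (sb.toString, total)
  }
}
```

Using a fixed `Seq` of phase names (rather than iterating the map) keeps the log output in a stable, readable order and makes untracked phases visible as `0 ms` lines.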

/**
* Record a specific invocation of a rule.
*
@@ -164,11 +164,13 @@ class QueryExecution(
// We need to materialize the optimizedPlan here, before tracking the planning phase, to ensure
// that the optimization time is not counted as part of the planning phase.
assertOptimized()
executePhase(QueryPlanningTracker.PLANNING) {
val plan = executePhase(QueryPlanningTracker.PLANNING) {
// clone the plan to avoid sharing the plan instance between different stages like analyzing,
// optimizing and planning.
QueryExecution.prepareForExecution(preparations, sparkPlan.clone())
}
tracker.logTimeSpent()
plan
}
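The change above captures the value produced inside `executePhase` in a `val`, logs after the timed block ends (so logging itself is not counted), and then returns the value. A minimal sketch of that "time a block but keep its result" pattern, with illustrative names not taken from Spark:

```scala
// Hedged sketch: measure a block's wall-clock time while preserving its result,
// so the caller can log the duration and still return the value.
object PhaseTiming {
  final case class Timed[T](result: T, durationMs: Long)

  // Evaluates `body` once, recording elapsed time around it.
  def timed[T](body: => T): Timed[T] = {
    val start = System.nanoTime()
    val result = body
    val durationMs = (System.nanoTime() - start) / 1000000L
    Timed(result, durationMs)
  }

  def main(args: Array[String]): Unit = {
    val t = timed { (1 to 1000).sum }
    // Logging happens outside the measured block, as in the diff above.
    println(s"phase: planning, timeSpent: ${t.durationMs} ms")
    assert(t.result == 500500)
  }
}
```

Binding the result first and logging afterward matters: if the log call sat inside the timed block, the logging cost would inflate the reported planning time.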

/**