Skip to content

Conversation

@ajwerner
Copy link
Contributor

@ajwerner ajwerner commented May 2, 2019

Before this PR, statement counters were incremented before executing a SQL
statement. This is problematic because it means that the counters include
statements which fail during execution. This changes the logic to increment the
counters when statements complete successfully.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

@ajwerner ajwerner requested review from a team and andreimatei May 2, 2019 14:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner
Copy link
Contributor Author

ajwerner commented May 2, 2019

@piyush-singh I know we talked about this in the past, is there an issue I can link this to?

@piyush-singh
Copy link

#36126

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno bro, there's nothing in the metadata of those counters that suggests anything about "success".
I think we should have total counters and error counters, no?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

@ajwerner
Copy link
Contributor Author

ajwerner commented May 2, 2019

I dunno bro, there's nothing in the metadata of those counters that suggests anything about "success".
I think we should have total counters and error counters, no?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

Sure, it's just weird when you have a cluster where most of your requests are timing out and the topmost chart looks like your throughput is going up but in fact its going down. We don't have an easy way to subtract two time series from each other for display in the admin ui (a limitation of the tsdb I suppose).

How about I add two more metrics, failure and success for each statement type and then propose migrating the admin UI to the success metric?

@andreimatei
Copy link
Contributor

Sure, it's just weird when you have a cluster where most of your requests are timing out and the topmost chart looks like your throughput is going up but in fact its going down.

This is only true for benchmark clients that spam the server with as many queries as they can, not for real apps where throughput presumably stays flat.

How about I add two more metrics, failure and success for each statement type and then propose migrating the admin UI to the success metric?

I think what I'd do is simply copy the errors metric to the overview metrics dashboard, just below SQL Queries.

@ajwerner
Copy link
Contributor Author

ajwerner commented May 2, 2019

I tried to do some digging on what others do. From a cursory glance I couldn't figure out the MySQL metric semantics. Here's what I dug out of some public grafana dashboard templates:

Cassandra: Uses count from their latency histogram which implies success
Postgres: Shows transaction commits and rollbacks and rows read and rows returned, nothing about number of statements.

I'm going to update this diff to track all 3 things and we can have a separate conversation with @piyush-singh about the right thing to display. I'll admit that this will collect redundant data but if having a few extra counters breaks the bank we've got bigger problems. Reasonable?

@ajwerner
Copy link
Contributor Author

ajwerner commented May 2, 2019

I'm realizing now that this also interacts with telemetry. Let me sync up with @piyush-singh before moving forward with this.

@ajwerner ajwerner force-pushed the ajwerner/counters-only-on-success branch from b4e68ba to af087a6 Compare May 8, 2019 21:00
@ajwerner ajwerner requested a review from a team May 8, 2019 21:00
@ajwerner ajwerner changed the title sql: only increment statement counters upon success sql: track and display counters for successfully executed statements May 8, 2019
@ajwerner
Copy link
Contributor Author

ajwerner commented May 8, 2019

Still need to fix the testing, not yet ready for a look

@ajwerner ajwerner force-pushed the ajwerner/counters-only-on-success branch from af087a6 to 8ba3188 Compare May 9, 2019 22:26
@ajwerner ajwerner requested a review from a team May 9, 2019 22:26
@ajwerner ajwerner force-pushed the ajwerner/counters-only-on-success branch from 8ba3188 to a8d6e8e Compare May 10, 2019 13:09
@ajwerner
Copy link
Contributor Author

This is RFAL. I attempted to do more programatic metric meta creation but ultimately reverted it in favor of just making another copy.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/conn_executor_exec.go, line 66 at r1 (raw file):

func (ex *connExecutor) execStmt(
	ctx context.Context, stmt Statement, res RestrictedCommandResult, pinfo *tree.PlaceholderInfo,
) (ev fsm.Event, payload fsm.EventPayload, err error) {

revert this


pkg/sql/conn_executor_exec.go, line 1305 at r1 (raw file):

}

// maybeIncrementExecutedStmtCounter checks if err and payload to determine if

s/checks if/checks


pkg/sql/conn_executor_exec.go, line 1308 at r1 (raw file):

// an error occurred and if not, increments the appropriate statement counter
// for stmt's type.
func (ex *connExecutor) maybeIncrementExecutedStmtCounter(

only one of the callers passes an err. And anything that deals with two possible errors is confusing. I'd remove the err from this function and deal with it in that one caller.
And then I won't nit about commenting `maybe..(.., nil /* err */)

@ajwerner ajwerner force-pushed the ajwerner/counters-only-on-success branch from a8d6e8e to bc798dd Compare May 13, 2019 22:43
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! I moved the conditional logic into the caller

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/sql/conn_executor_exec.go, line 66 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

revert this

Done.


pkg/sql/conn_executor_exec.go, line 1305 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

s/checks if/checks

Changed the logic.


pkg/sql/conn_executor_exec.go, line 1308 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

only one of the callers passes an err. And anything that deals with two possible errors is confusing. I'd remove the err from this function and deal with it in that one caller.
And then I won't nit about commenting `maybe..(.., nil /* err */)

I moved the conditional logic up to the callers.

Before this PR, statement counters were incremented before executing a SQL
statement. This is problematic because it means that the counters include
statements which fail during execution. This commit adds an additional set
of statement counters which track successfully executed statements which use
the old metric name.

The reason the new behavior uses the old name is to retain historical data
in the UI for these charts after an upgrade. This does unfortunately mean that
the definition of this metric will change in telemetry where the old definition
will have a new name. This is both probably okay from an analytics perspective
(and maybe even preferable) and can be mitigated with some fancy SQL.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.
@ajwerner ajwerner force-pushed the ajwerner/counters-only-on-success branch from bc798dd to b291909 Compare May 13, 2019 23:43
@ajwerner
Copy link
Contributor Author

I added some unit tests (certainly not as exhaustive as it could be) if you want take a look before I merge it

@ajwerner
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 23, 2019

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request May 23, 2019
37264: sql: track and display counters for successfully executed statements r=ajwerner a=ajwerner

Before this PR, statement counters were incremented before executing a SQL
statement.  This is problematic because it means that the counters include
statements which fail during execution. This changes the logic to increment the
counters when statements complete successfully.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

Co-authored-by: Andrew Werner <[email protected]>
@craig
Copy link
Contributor

craig bot commented May 23, 2019

Build succeeded

@craig craig bot merged commit b291909 into cockroachdb:master May 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants