Skip to content

Healthcheck: new simpler implementation#6155

Merged
sougou merged 39 commits intovitessio:masterfrom
planetscale:ds-new-healthcheck
May 27, 2020
Merged

Healthcheck: new simpler implementation#6155
sougou merged 39 commits intovitessio:masterfrom
planetscale:ds-new-healthcheck

Conversation

@deepthi
Copy link
Copy Markdown
Collaborator

@deepthi deepthi commented May 7, 2020

Fixes #5750
There is now a new gateway implementation called tabletgateway which uses the new healthcheck.
Setting vtgate's flag -gateway_implementation=tabletgateway will use the new gateway with the new healthcheck.
The existing healthcheck has been renamed to LegacyHealthCheck. It continues to be used by all other current users.

@deepthi deepthi requested a review from sougou as a code owner May 7, 2020 05:45
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close looks unused.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function is unreachable (because Gateway does not have a Close).
But I see it being covered in tests.

Should Gateway export the function? Something to think about.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vtgate seems to be missing a servenv.OnClose. That would be the logical place to call Close on gateway / healthcheck.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gateway already has a Close through the QueryService interface.

@sougou
Copy link
Copy Markdown
Contributor

sougou commented May 16, 2020

This is from my initial skimming. I'm still working on the review.

Copy link
Copy Markdown
Contributor

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I glanced at topo watcher and stats cache. I assume they didn't change much.

deepthi added 26 commits May 19, 2020 22:21
Signed-off-by: deepthi <deepthi@planetscale.com>
…hcheck

Signed-off-by: deepthi <deepthi@planetscale.com>
…on params

Signed-off-by: deepthi <deepthi@planetscale.com>
…of tabletStatsCache, refactor to encapsulate more behavior that belongs in healthcheck

Signed-off-by: deepthi <deepthi@planetscale.com>
…healthcheck tests

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…he. rename tabletStats to tabletHealth

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…le tabletType change in healthcheck

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…ecorder interface. Move cellsAliases cache into TopologyWatcher

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…thData

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
implement healthy tablet sorting by replica lag, notify buffer when
master changes.

Signed-off-by: deepthi <deepthi@planetscale.com>
… get basic unit test working

Signed-off-by: deepthi <deepthi@planetscale.com>
… TabletHealth struct that is returned to gateway. Unit tests

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
… tests

Signed-off-by: deepthi <deepthi@planetscale.com>
… on gateway_implementation

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
deepthi added 4 commits May 19, 2020 22:21
…ch results in incomplete error information

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…HealthCheck, move checkConn and finalizeConn to tabletHealthCheck, rename deleteConn -> deleteTablet

Signed-off-by: deepthi <deepthi@planetscale.com>
@deepthi deepthi force-pushed the ds-new-healthcheck branch from 10475cf to e0fb77e Compare May 20, 2020 05:21
deepthi added 3 commits May 21, 2020 21:34
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Copy link
Copy Markdown
Contributor

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. Couple more changes.

Signed-off-by: Andres Taylor <andres@planetscale.com>
Copy link
Copy Markdown
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executor test should use new healthCheck. This will unblock from adding new funtionality (replication tx, reserve connection) using this new healthcheck.

This also means that with legacyhealthcheck additional functionality will not work. So, should we not go ahead with breaking change and remove the legacy heath check all together?

Comment on lines +93 to +96
_ = vtg.Gateway().Close(context.Background())
if legacyHealthCheck != nil {
_ = legacyHealthCheck.Close()
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can change the function call
"_ = vtg.Gateway().Close(context.Background())" -> "vtg.Gateway().Close(context.Background())"
"_ = legacyHealthCheck.Close()" -> "legacyHealthCheck.Close()"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring the return value produces warnings.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still ignored :)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think it's clearer using the underscore to mark that we are aware of the returned value, but we are ignoring it. Makes it clear that it's not by mistake

if !ts.Serving {
state = "NOT_SERVING"
if *GatewayImplementation == GatewayImplementationDiscovery {
stats := e.scatterConn.GetLegacyHealthCheckCacheStatus()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this method call the scatterConn struct contains legacyHealthCheck discovery.LegacyHealthCheck. The HealthCheck and LegacyHealthCheck can have an interface implementation that can return the rows back. This can avoid keeping legacyhealthcheck inside scatterconn

Copy link
Copy Markdown
Collaborator Author

@deepthi deepthi May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scatterConn already had a reference to healthcheck in the legacy implementation. We are not changing that in this iteration because we want to minimize how much we touch the legacy implementation. However, I do see that we can simplify this so that there is less code repetition in executor.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually.. unless we make LegacyTabletsCacheStatusList and TabletsCacheStatusList implement the same interface, I don't see how to avoid the duplicated code.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I leave it up to you.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is that leaving in ugly if-else blocks will force us to clean it up soon :)

…abletHealthCheck with TabletHealth, incorporate review comments to make code more readable, start using modern techniques in tests

Signed-off-by: deepthi <deepthi@planetscale.com>
@deepthi
Copy link
Copy Markdown
Collaborator Author

deepthi commented May 25, 2020

Executor test should use new healthCheck. This will unblock from adding new funtionality (replication tx, reserve connection) using this new healthcheck.

We can do that, but I would rather not have that block this PR. It can be done separately.

This also means that with legacyhealthcheck additional functionality will not work. So, should we not go ahead with breaking change and remove the legacy heath check all together?

We discussed this. There are a few reasons to keep the legacy health check.

  • If things break, there will be a fallback
  • There are other things besides vtgate that use the legacy healthcheck. We need to clean those up to go to a pure discovery (vs. health) model.

We did decide to make the new healthcheck the default, so I have committed that change.

deepthi added 4 commits May 25, 2020 15:09
…ateway

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…ad of thread-unsafe healthByAlias from cacheStatusMap

Signed-off-by: deepthi <deepthi@planetscale.com>
…eck also, so move it to replicationlag.go

Signed-off-by: deepthi <deepthi@planetscale.com>
@sougou
Copy link
Copy Markdown
Contributor

sougou commented May 27, 2020

🎉

@sougou sougou merged commit bcbee47 into vitessio:master May 27, 2020
@deepthi deepthi deleted the ds-new-healthcheck branch July 6, 2020 19:00
@deepthi deepthi added this to the v7.0 milestone Jul 17, 2020
@harshit-gangal harshit-gangal restored the ds-new-healthcheck branch May 7, 2021 17:13
@frouioui frouioui mentioned this pull request Mar 9, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RFC: rewrite vtgate's healthcheck

4 participants