Skip to content

Add option to GetSchema to only send the row count and data length over the wire#6985

Merged
deepthi merged 5 commits intovitessio:masterfrom
tinyspeck:am_get_schema_sizes_only
Nov 29, 2020
Merged

Add option to GetSchema to only send the row count and data length over the wire#6985
deepthi merged 5 commits intovitessio:masterfrom
tinyspeck:am_get_schema_sizes_only

Conversation

@ajm188
Copy link
Copy Markdown
Contributor

@ajm188 ajm188 commented Nov 2, 2020

If you are attempting to get aggregate table sizes across a keyspace, this requires one call to GetSchema per shard, which result in the caller to vtctl needing to json2.Unmarshal the same info (create table statement, column info) N times, when we know it's not going to change. This flag will allow a user to make one GetSchema call to get the schema, and then N-1 GetSchema -table_sizes_only calls for the rest of the shards, and results in much less data transfer and much faster (3-4x faster by some metrics I took) response times.

Closes #6987

…er the wire

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@ajm188
Copy link
Copy Markdown
Contributor Author

ajm188 commented Nov 2, 2020

A comment from @ameetkotian:

think we need to handle the case when both tableNamesOnly and tableSizesOnly are specified. The logic should be clear about what is returned. Now, it looks like only table names are returned.

I think for now I'm just going to make the help text for -table_sizes_only say that this is ignored if -table_names_only is passed, but I'm open to other suggestions.

Signed-off-by: Andrew Mason <amason@slack-corp.com>
Copy link
Copy Markdown
Collaborator

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks fine, but I'm debating whether you should be adding a test case. It will have to be an endtoend test, I think.
@vitessio/ps-vitess any opinions?

Andrew Mason added 3 commits November 27, 2020 11:09
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Copy link
Copy Markdown
Collaborator

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@deepthi deepthi merged commit 27c8a86 into vitessio:master Nov 29, 2020
@ajm188 ajm188 deleted the am_get_schema_sizes_only branch November 29, 2020 23:55
ajm188 pushed a commit to tinyspeck/vitess that referenced this pull request Dec 1, 2020
…er the wire (vitessio#6985)

* Add option to GetSchema to only send the row count and data length over the wire

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Update help text to clarify behavior

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Add endtoend test for vtctl GetSchema

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Alias the proto imports

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Format imports to match vtctl.go

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@askdba askdba added this to the v9.0 milestone Dec 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better support for pulling schema sizes from across a keyspace

3 participants