*: add 'endpoint hashkv' command #8351
etcdctl/README.md
Outdated
+------------------------+------------+
| ENDPOINT | HASH |
+------------------------+------------+
| http://127.0.0.1:2379 | 1084519789 |
extra space here.
@gyuho Thanks for getting this done SO FAST!
@@ -28,6 +28,12 @@ Sometimes an etcd cluster will possibly have v3 data which should not be overwri
ETCDCTL_API=3 etcdctl get "" --from-key --keys-only --limit 1 | wc -l
```

In case v2 data were migrated with TTLs, ensure that stores are consistent by checking the hashes:
s/were/was
for i := 0; i < 3; i++ {
	cli := clus.Client(i)

	hresp, err := cli.HashKV(context.Background(), clus.Members[0].GRPCAddr(), 0)
s/[0]/[i]
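To illustrate the `s/[0]/[i]` fix above: the loop is meant to hash every member, so the member being hashed has to follow the loop index rather than stay pinned to member 0. The sketch below is self-contained and does not use the etcd integration package. `fakeMember`, `hashKV`, and `checkConsistent` are hypothetical names, and the CRC32 over sorted keys is only a stand-in, not the hash the server computes.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// fakeMember is a hypothetical stand-in for one cluster member. It holds a
// plain map instead of a real MVCC store and is not part of etcd.
type fakeMember struct {
	addr string
	kv   map[string]string
}

// hashKV returns a CRC32 over the member's keys and values in sorted key
// order, so members with equal contents always produce equal hashes.
func (m *fakeMember) hashKV() uint32 {
	keys := make([]string, 0, len(m.kv))
	for k := range m.kv {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := crc32.New(crc32.MakeTable(crc32.Castagnoli))
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
		h.Write([]byte(m.kv[k]))
		h.Write([]byte{0})
	}
	return h.Sum32()
}

// checkConsistent hashes member i on iteration i (never a fixed member) and
// reports the first member whose hash differs from member 0's.
func checkConsistent(members []*fakeMember) error {
	var want uint32
	for i, m := range members {
		got := m.hashKV()
		if i == 0 {
			want = got
			continue
		}
		if got != want {
			return fmt.Errorf("member %d (%s): hash %d, want %d", i, m.addr, got, want)
		}
	}
	return nil
}

func main() {
	members := []*fakeMember{
		{addr: "m0", kv: map[string]string{"foo": "bar"}},
		{addr: "m1", kv: map[string]string{"foo": "bar"}},
		{addr: "m2", kv: map[string]string{"foo": "bar"}},
	}
	fmt.Println(checkConsistent(members)) // prints <nil>

	// A diverged member is only caught because each member is hashed in turn.
	members[2].kv["foo"] = "baz"
	fmt.Println(checkConsistent(members))
}
```

With the address pinned to member 0, the diverged third member in the second call would never be hashed and the check would pass vacuously.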
var hv uint32
for i := 0; i < 3; i++ {
	cli := clus.Client(i)
can get rid of the unreliable `time.Sleep`: `if _, err := cli.Get(context.TODO(), "foo"); err != nil { t.Fatal(err) }`
clientv3/maintenance.go
Outdated
@@ -50,6 +51,11 @@ type Maintenance interface {
	// Status gets the status of the endpoint.
	Status(ctx context.Context, endpoint string) (*StatusResponse, error)

	// HashKV returns the state of backend database at the time of the RPC.
s/the state of/a hash of the/
s/backend database/KV state/? it's not the entire backend...
clientv3/maintenance.go
Outdated
@@ -50,6 +51,11 @@ type Maintenance interface {
	// Status gets the status of the endpoint.
	Status(ctx context.Context, endpoint string) (*StatusResponse, error)

	// HashKV returns the state of backend database at the time of the RPC.
	// If revision is non-zero, hash is computed only in key bucket and keys at or below
// If revision is zero, the hash is computed on all keys. If the revision is non-zero, the hash is computed on all keys at or below the given revision.
I wonder whether the comment for the maintenance HashKV should be the same as the one in the RPC definition:
HashKV computes the hash of all MVCC keys up to a given revision.
https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto#L187
clientv3 Compact and the rpc.proto Compact have different comments too; the difference here seems to be similar. Although to be closer to clientv3 Compact it should talk about KV history instead of a "backend database".
Duplicated comments (/anything) can easily get out of sync and I think there are too many exceptions to using the RPC comment directly to justify tooling to keep it synced up.
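The revision semantics proposed in this thread (zero means all keys, non-zero means keys at or below the given revision) can be sketched with a toy model. This is an assumption-laden illustration, not etcd's implementation: `revEntry` and `hashUpTo` are hypothetical names, the history is a plain slice, and the CRC32 encoding is chosen only to keep the example short.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// revEntry is a hypothetical record of one write: the revision at which it
// happened plus the key and value written. It is not etcd's storage format.
type revEntry struct {
	rev   int64
	key   string
	value string
}

// hashUpTo hashes entries in slice order. rev == 0 means "hash everything";
// a non-zero rev limits the hash to entries at or below that revision.
func hashUpTo(history []revEntry, rev int64) uint32 {
	h := crc32.New(crc32.MakeTable(crc32.Castagnoli))
	var buf [8]byte
	for _, e := range history {
		if rev != 0 && e.rev > rev {
			continue // entry is newer than the requested revision
		}
		binary.BigEndian.PutUint64(buf[:], uint64(e.rev))
		h.Write(buf[:])
		h.Write([]byte(e.key))
		h.Write([]byte{0}) // separator between key and value
		h.Write([]byte(e.value))
		h.Write([]byte{0})
	}
	return h.Sum32()
}

func main() {
	history := []revEntry{
		{rev: 1, key: "foo", value: "a"},
		{rev: 2, key: "bar", value: "b"},
		{rev: 3, key: "foo", value: "c"},
	}
	// Revision 0 and the latest revision cover the same entries.
	fmt.Println(hashUpTo(history, 0) == hashUpTo(history, 3)) // prints true
	// Revision 2 ignores the write at revision 3.
	fmt.Println(hashUpTo(history, 2) == hashUpTo(history[:2], 0)) // prints true
}
```

The point of the bounded form is that two members at different latest revisions can still be compared by hashing both at a revision they have both reached.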
etcdctl/README.md
Outdated
@@ -641,6 +641,49 @@ Get the status for all endpoints in the cluster associated with the default endp
+------------------------+------------------+----------------+---------+-----------+-----------+------------+
```

### ENDPOINT HASH

ENDPOINT HASH fetches the backend state of each endpoint.
fetches the hash of the key-value store of an endpoint.
etcdctl/ctlv3/command/ep_command.go
Outdated
		Short: "Prints out backend database states specified in `--endpoints` flag",
		Run:   epHashCommandFunc,
	}
	hc.PersistentFlags().Int64Var(&epHashRev, "rev", 0, "if non-zero, hash keys at or below the given revision")
"maximum revision to hash (default: all revisions)"
etcdctl/ctlv3/command/ep_command.go
Outdated
	display.EndpointHash(hashList)

	if err != nil {
		os.Exit(ExitError)
ExitWithError?
c0491b0 to 518a6f7
cc99ec8 to 33282b5
looks ok in general
In case v2 data was migrated with TTLs, ensure that stores are consistent by checking the hashes:

```sh
ETCDCTL_API=3 etcdctl endpoint hash --cluster
Can this be `hashkv`? The naming should be consistent with the underlying RPC / client call if possible (e.g., what if etcdctl ever gets support for the old `hash` rpc)
Given that the failure mode for inconsistent data is "member is completely broken", I would recommend increasing the warnings given here.
"You must ensure all members have consistent data before migrating, otherwise some members will not be able to correctly serve queries for new data."
This really should also explain what to do if someone has frequently expiring keys - for instance, someone with keys expiring every second will have an extremely high chance of having inconsistent data. Should they keep retrying?
etcdctl/README.md
Outdated
##### Simple format

Prints a humanized table of each endpoint URL and backend hash.
s/backend/KV history/?
etcdctl/ctlv3/command/ep_command.go
Outdated
func newEpHashCommand() *cobra.Command {
	hc := &cobra.Command{
		Use:   "hash",
		Short: "Prints out backend database states specified in `--endpoints` flag",
"Prints the KV history hash for each endpoint in --endpoints"?
80b2412 to 1ee4267
For etcd v3.3+, run `ETCDCTL_API=3 etcdctl endpoint hashkv --cluster` to ensure key-value stores are consistent post migration.

**Warn**: When v2 store has expiring TTL keys and migrate command intends to preserve TTLs, it is possible that inconsistent data is migrated. TODO?
s/it is possible that inconsistent data is migrated. TODO/migration may be inconsistent with the last committed v2 state when run on any member with a raft index less than the last leader's raft index./?
Signed-off-by: Gyu-Ho Lee <[email protected]>
lgtm thanks
Fix #8348.
Address #8305.