Doc for upgrading to v3.6 from v3.5 #967

Open
wants to merge 8 commits into
base: main

@shivamgcodes shivamgcodes commented Feb 28, 2025

Created the draft doc for upgrading to etcd version 3.6 from etcd version 3.5

Did not yet put the deprecated flags (if any)
Fixes #963

@k8s-ci-robot

Hi @shivamgcodes. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ivanvc
Member

ivanvc commented Mar 1, 2025

/ok-to-test

@ivanvc
Member

ivanvc commented Mar 1, 2025

Thanks for your pull request, @shivamgcodes! I would like to suggest iterating on the progress of this document. Other pull requests are failing due to the file not existing yet, and with every release candidate release that we've been doing, we've had to wipe out the link to this page.

So, to speed up the review and the merge, I think we should trim this pull request down to the basics (i.e., like what I proposed in #966).

Did not yet put the deprecated flags (if any)

Please refer initially to https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md#deprecations (and etcd-io/etcd#19492).

Thanks again :)

@shivamgcodes
Author

Sorry, I'm a bit confused. Should I trim down the PR to only include the core changes as outlined in #966 and then add the deprecated flags according to the linked changelog? Or is there something else you'd like me to adjust?
Thanks for clarifying!

Signed-off-by: Shivam Gupta [email protected]

Signed-off-by: shivamgcodes <[email protected]>
@shivamgcodes
Author

Fixed the linting issues.

Member

@jmhbnz jmhbnz left a comment

Thanks for the work on this @shivamgcodes, I think it's a great start that we can iterate on once merged to finalise the details for each section.


#### Downgrade

If all members have been upgraded to v3.6, the cluster will be upgraded to v3.6, and downgrade from this completed state is **not possible**. If any single member is still v3.5, however, the cluster and its operations remain "v3.5", and it is possible from this mixed cluster state to return to using a v3.5 etcd binary on all members.
Member

Think we need to refresh this given the recent work on downgrade support. We can do this as a follow-up.

Author

Thanks for pointing this out. I’ll look into it and follow up.
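
For reference while this section is refreshed, the mixed-version rule in the quoted paragraph (the cluster keeps operating at the lowest version present among its members) can be sketched as below. This is only an illustration, not part of the proposed doc: `effective_cluster_version` is a hypothetical helper name, and the member versions are assumed to have been collected by hand, for example from the VERSION column of `etcdctl endpoint status --cluster -w table`. It says nothing about whether a downgrade is supported.

```shell
# Hypothetical helper (not an etcd command): given the server versions of
# all members, print the lowest major.minor among them. Per the paragraph
# above, a mixed cluster keeps operating at that version.
# usage: effective_cluster_version VERSION...
effective_cluster_version() {
  printf '%s\n' "$@" |
    cut -d. -f1,2 |                  # keep major.minor only
    sort -t. -k1,1n -k2,2n |         # numeric sort on major, then minor
    head -n 1                        # lowest version wins
}

# Example call, with assumed member versions:
#   effective_cluster_version 3.6.0 3.5.17 3.6.0
```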

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmhbnz, shivamgcodes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Co-authored-by: James Blair <[email protected]>
Signed-off-by: shivamgcodes <[email protected]>
@ahrtr
Member

ahrtr commented Mar 6, 2025

I see the title still has "draft", please remove the "draft" if it's ready to review. Please also squash the commits.

@shivamgcodes
Author

Got it, I'm on it. I have exams until March 8th, so I'll need until around March 10th to complete this.

@siyuanfoundation
Contributor

I think that's all. I'll follow up with the embed.Config breaking changes.


**NOTE:** When [migrating from v2 with no v3 data](https://github.com/etcd-io/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.

Highlighted breaking changes in 3.5.

This should be 3.6?

This should also capture the breaking changes introduced in v3.6. For example, there have been many flag migrations in v3.6, and we need to call that out here.

You could refer to #965.


### Upgrade checklists

**NOTE:** When [migrating from v2 with no v3 data](https://github.com/etcd-io/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.

Is this still relevant? If we feel this is still a concern, I think this could also link to the migration guide, which shows how etcd v2 users can migrate data to etcd v3: https://etcd.io/docs/v3.6/tutorials/how-to-migrate/

If all members have been upgraded to v3.6, the cluster will be upgraded to v3.6, and downgrade from this completed state is **not possible**. If any single member is still v3.5, however, the cluster and its operations remain "v3.5", and it is possible from this mixed cluster state to return to using a v3.5 etcd binary on all members.

Please [download the snapshot backup](../../op-guide/maintenance/#snapshot-backup) to make downgrading the cluster possible even after it has been completely upgraded.

nit: I think we could make this clearer, e.g. "Before upgrading your etcd cluster, please create a snapshot backup of your etcd cluster. If you need to downgrade the cluster to 3.5 after a complete upgrade, you can use this snapshot to restore an etcd instance to its 3.5 state."
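
As a rough illustration of the "create a snapshot backup before upgrading" step suggested here, a minimal sketch follows. `backup_before_upgrade`, the `ETCDCTL` override, the default endpoint, and the default file name are all assumptions made for this example; only `etcdctl snapshot save` itself is the real command. Restoring from the snapshot is a separate step and is not shown.

```shell
# Hypothetical wrapper: save a snapshot from a single endpoint before
# starting the upgrade. The defaults below are assumptions for this
# sketch; adjust them to your cluster.
# usage: backup_before_upgrade [ENDPOINT] [OUTPUT_FILE]
backup_before_upgrade() {
  endpoint="${1:-http://127.0.0.1:2379}"
  out="${2:-etcd-v3.5-backup.db}"
  # ETCDCTL lets the caller point at a specific etcdctl binary.
  "${ETCDCTL:-etcdctl}" --endpoints="$endpoint" snapshot save "$out"
}

# Example call (endpoint and file name are placeholders):
#   backup_before_upgrade http://10.0.0.1:2379 pre-upgrade.db
```

The snapshot is taken from one endpoint only, so it makes sense to point it at a healthy member while the whole cluster is still running v3.5.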

```diff
-etcd-old --name s1 \
+etcd-new --name s1 \
 --data-dir /tmp/etcd/s1 \
```

```diff
-etcd-old --name ${name} \
+etcd-new --name ${name} \
  --data-dir /path/to/${name}.etcd \
..
```

instead?

@k8s-ci-robot k8s-ci-robot requested a review from ahrtr March 8, 2025 14:24

#### Step 3: stop one existing etcd server

@ahrtr / @siyuanfoundation -- do you think the upgrade guide would benefit from also adding this step here: "If the server to be stopped is the leader, you can avoid some downtime by move-leader to another server before stopping this server."?

Member

"If the server to be stopped is the leader, you can avoid some downtime by move-leader to another server before stopping this server."

Yes, it's nice to have, but not mandatory. It's also better to upgrade the leader last (similar to https://github.com/ahrtr/etcd-defrag), otherwise you will need to move-leader multiple times. But again it isn't mandatory.
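
The "upgrade the leader last" ordering mentioned above could be sketched roughly as follows. `upgrade_order` is a hypothetical helper, and the member names are placeholders; the current leader is assumed to have been identified beforehand (for example from the IS LEADER column of `etcdctl endpoint status -w table`). If the leader has to be stopped earlier anyway, `etcdctl move-leader` with the target member ID is the alternative discussed in this thread.

```shell
# Hypothetical helper: print member names in upgrade order, with the
# current leader moved to the end so that leadership does not need to be
# transferred repeatedly during a rolling upgrade.
# usage: upgrade_order LEADER MEMBER...
upgrade_order() {
  leader="$1"
  shift
  for m in "$@"; do
    # Emit every non-leader member in its original order.
    [ "$m" = "$leader" ] || printf '%s\n' "$m"
  done
  # The leader always goes last (it is printed even if it was not listed).
  printf '%s\n' "$leader"
}

# Example call with placeholder member names, s2 being the leader:
#   upgrade_order s2 s1 s2 s3
```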

@shivamgcodes shivamgcodes changed the title draft doc for upgrading to v3.6 from v3.5 doc for upgrading to v3.6 from v3.5 Mar 8, 2025
@shivamgcodes shivamgcodes changed the title doc for upgrading to v3.6 from v3.5 Doc for upgrading to v3.6 from v3.5 Mar 8, 2025
Successfully merging this pull request may close these issues.

Create v3.6 upgrade guide
7 participants