Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

signpost: cancel-upgrade behavior misleading #949

Closed
etungsten opened this issue Jun 19, 2020 · 3 comments · Fixed by #950
Closed

signpost: cancel-upgrade behavior misleading #949

etungsten opened this issue Jun 19, 2020 · 3 comments · Fixed by #950
Labels
type/bug Something isn't working type/documentation Documentation update/creation

Comments

@etungsten
Copy link
Contributor

etungsten commented Jun 19, 2020

Image I'm using:
aws-k8s-1.15 v0.3.4

What I expected to happen:

According to the usage information for the signpost commands:

    mark-inactive-valid     Marks the inactive partition as having a valid image
    upgrade-to-inactive     Sets the inactive partitions as new upgrade partitions if marked valid
    cancel-upgrade          Reverse upgrade-to-inactive

If I call mark-inactive-valid and upgrade-to-inactive to set the inactive partition as the "next-to-boot" and then call cancel-upgrade to revert upgrade-to-inactive, I should be able to call upgrade_to_inactive successfully afterwards to restore the inactive partition as next-to-boot without having to call mark-inactive-valid again.

What actually happened:
cancel-upgrade wipes all the priority bits, including number of tries_left (validity), so a subsequent upgrade-to-inactive would fail because the inactive partition is no longer marked as valid.

How to reproduce the problem:
Initial partition table

bash-5.0# signpost status        
OS disk: /dev/nvme0n1
Set A:   boot=/dev/nvme0n1p2 root=/dev/nvme0n1p3 hash=/dev/nvme0n1p4 priority=2 tries_left=0 successful=true
Set B:   boot=/dev/nvme0n1p6 root=/dev/nvme0n1p7 hash=/dev/nvme0n1p8 priority=0 tries_left=0 successful=false
Active:  Set A
Next:    Set A

Call mark-inactive-valid to set number of tries left for the inactive partition to 1 (thereby marking it valid for boot)

bash-5.0# signpost mark-inactive-valid
bash-5.0# signpost status             
OS disk: /dev/nvme0n1
Set A:   boot=/dev/nvme0n1p2 root=/dev/nvme0n1p3 hash=/dev/nvme0n1p4 priority=2 tries_left=0 successful=true
Set B:   boot=/dev/nvme0n1p6 root=/dev/nvme0n1p7 hash=/dev/nvme0n1p8 priority=0 tries_left=1 successful=false
Active:  Set A
Next:    Set A

Call upgrade-to-inactive to set inactive partition as next to boot.

bash-5.0# signpost upgrade-to-inactive
bash-5.0# signpost status             
OS disk: /dev/nvme0n1
Set A:   boot=/dev/nvme0n1p2 root=/dev/nvme0n1p3 hash=/dev/nvme0n1p4 priority=1 tries_left=0 successful=true
Set B:   boot=/dev/nvme0n1p6 root=/dev/nvme0n1p7 hash=/dev/nvme0n1p8 priority=2 tries_left=1 successful=false
Active:  Set A
Next:    Set B

Call cancel-upgrade to rollback upgrade-to-inactive, but it rolls back both upgrade-to-inactive AND mark-inactive-valid (tries_left is no longer set to 1) so subsequent upgrade-to-inactive calls no longer work

bash-5.0# signpost cancel-upgrade     
bash-5.0# signpost status        
OS disk: /dev/nvme0n1
Set A:   boot=/dev/nvme0n1p2 root=/dev/nvme0n1p3 hash=/dev/nvme0n1p4 priority=2 tries_left=0 successful=true
Set B:   boot=/dev/nvme0n1p6 root=/dev/nvme0n1p7 hash=/dev/nvme0n1p8 priority=0 tries_left=0 successful=false
Active:  Set A
Next:    Set A
bash-5.0# signpost upgrade-to-inactive
Inactive partition /dev/nvme0n1 has not been marked valid for upgrade
@etungsten etungsten added the type/bug Something isn't working label Jun 19, 2020
@etungsten etungsten changed the title signpost: upgrade-to-inactive and cancel-upgrade behavior signpost: cancel-upgrade behavior misleading Jun 19, 2020
@etungsten etungsten mentioned this issue Jun 19, 2020
@etungsten
Copy link
Contributor Author

Proposed solution is to change cancel-upgrade so that it preserves the tries_left (validity) of the inactive partition.

@etungsten etungsten added the type/documentation Documentation update/creation label Jun 19, 2020
@iliana
Copy link
Contributor

iliana commented Jun 19, 2020

Proposed solution seems fine to me.

Cross-referencing #649: signpost's idea of workflows is really really complicated and we should just move that logic into updog.

@etungsten
Copy link
Contributor Author

I will fix this issue first with #950 to unblock #942 (comment). Then follow up with a more comprehensive PR to address #649.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working type/documentation Documentation update/creation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants