This issue was moved to a discussion.

atomic container updates #12259

Closed
RobertBerger opened this issue Nov 10, 2021 · 10 comments

@RobertBerger

Do you provide atomic container updates with podman?

If not, how do you suggest this could be done?

@rhatdan
Member

rhatdan commented Nov 10, 2021

I am not sure what you mean here. We do have auto-updates.

https://fedoramagazine.org/auto-updating-podman-containers-with-systemd/

@RobertBerger
Author

Maybe I should mention that I am talking about embedded Linux systems. You might want to have a look here [1], but let me try to explain what I mean. Think about a headless system in the field which gets over-the-air software updates. There is no user available to fix things manually. The software update process needs to be flawless and atomic and must not leave the device in a non-working state.

Classic package managers, such as those for .deb, .rpm, and .ipk, are non-atomic. This means that if, for example, a power failure occurs while an update is in progress, you can end up with a non-working/non-recoverable system.

One solution is to have an A/B partition scheme, where you write the update to partition B while keeping partition A intact. This way, you can fall back to a "known good" version in case an update goes wrong. Switching between the two partitions is typically done "atomically" during the boot process by the boot loader.
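The fallback behaviour described above can be sketched as a small state machine. This is only an illustration under assumptions: the names `BootState`, `stage_update`, `select_slot`, and `confirm` are hypothetical, the retry counter is an assumed policy, and a real boot loader would keep this state in its own persistent storage rather than in a Python object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BootState:
    active: str = "A"              # slot that is known to work
    pending: Optional[str] = None  # slot holding an update not yet confirmed
    tries_left: int = 0            # remaining boot attempts for the pending slot


def other(slot: str) -> str:
    return "B" if slot == "A" else "A"


def stage_update(state: BootState, max_tries: int = 3) -> None:
    """Call only after the new image was fully written to the inactive slot."""
    state.pending = other(state.active)
    state.tries_left = max_tries


def select_slot(state: BootState) -> str:
    """Decision a boot loader would take on every boot (hypothetical policy)."""
    if state.pending is not None and state.tries_left > 0:
        state.tries_left -= 1
        return state.pending
    # No update staged, or the update never confirmed itself: fall back.
    state.pending = None
    return state.active


def confirm(state: BootState) -> None:
    """Called by the freshly booted system once it considers itself healthy."""
    if state.pending is not None:
        state.active = state.pending
        state.pending = None
        state.tries_left = 0
```

The key property is that the known good slot is never modified until the new slot has confirmed itself, so an interrupted or broken update ends with the device booting the old slot again.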

With containers, we could have something like incremental updates, along with the "problem" that only parts of those updates may have been downloaded when a power failure occurs.

  1. One issue seems to be file system corruption, but I guess this can be fixed by a transactional file system and something which takes care of wear leveling in case flash is used.

  2. The other "problem" is: can the container runtime recover from a partially written (possibly corrupt) update?
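For the second point, one generic way to avoid ever exposing a partially written file is the write-to-temporary-then-rename pattern. The sketch below illustrates that pattern only; it is not a description of how Podman or its storage library writes image layers, and `atomic_write` is a hypothetical helper name. It assumes a POSIX system where renaming within one directory is atomic.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # push file contents to stable storage
        os.replace(tmp, path)         # atomic rename over the old file
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)          # persist the directory entry as well
        finally:
            os.close(dir_fd)
    except BaseException:
        # Clean up the temporary file if the rename never happened.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

If power is lost before the rename, the old file is still in place and only a stray temporary file remains; how well this holds in practice still depends on the file system and the flash layer underneath.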

I am thinking about a scheme, where I have the A/B update for some small basic Embedded Linux system, but where applications are delivered in containers. This would allow for a more modern development paradigm (microservices) but I need to make sure that a failed container update does not leave the system in a non-recoverable state.

Not sure, but maybe something like this:
#8005

or this:
#7157

or this:
https://andreiclinciu.net/podman-container-readlink-no-such-file

I also saw this:
https://www.tenable.com/plugins/nessus/143877
"Add --no-sync-log option to instruct conmon to not sync the logs of the containers upon shutting down. This feature fixes a regression where we unconditionally dropped the log sync. It is possible the container logs could be corrupted on a sudden power-off. If you need container logs to remain in consistent state after a sudden shutdown, please update from v2.0.19 to v2.0.20"

[1] https://elinux.org/images/3/31/Comparison_of_Linux_Software_Update_Technologies.pdf

@rhatdan
Member

rhatdan commented Nov 11, 2021

Sounds interesting, @vrothberg WDYT?

@vrothberg
Member

Pulling in @giuseppe since he has been working in this domain as well.

My gut feeling is that auto-updates and the recent fixes and enhancements to detect corrupt images are sufficient. But it would be a healthy exercise to go through all scenarios in detail. I am not sure about the exact limitations of detecting storage corruption.

Since Podman 3.4, auto-updates also support rollbacks, which plays nicely into the described use cases: https://www.redhat.com/sysadmin/podman-auto-updates-rollbacks
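The general update-then-rollback idea can be sketched as follows. This is purely illustrative and does not call Podman or reproduce its implementation: `update_with_rollback` and the `restart` callback are hypothetical names, and the image references are made-up examples.

```python
from typing import Callable


def update_with_rollback(current: str, candidate: str,
                         restart: Callable[[str], bool]) -> str:
    """Restart the service on `candidate`; fall back to `current` on failure.

    `restart(image)` is a hypothetical callback that returns True when the
    service came up successfully on the given image.
    """
    if restart(candidate):
        return candidate
    # The new image failed to start: go back to the image that was running.
    if not restart(current):
        raise RuntimeError("rollback failed: neither image could be started")
    return current
```

The function returns the image that ends up running, so the caller can record whether the update was applied or reverted.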

@RobertBerger
Author

@vrothberg Which version of podman do I need at least to run some tests? This is what's currently on offer with meta-virtualization [1]

SRCREV = "6e8de00bb224f9931d7402648f0177e7357ed079"
SRC_URI = " \
    git://github.com/containers/libpod.git;branch=v3.4;protocol=https \
"

I will use a btrfs partition which will hold the containers.

Probably, with your help, we can come up with some test cases. I'm willing to run some experiments as time permits.

[1] https://git.yoctoproject.org/cgit/cgit.cgi/meta-virtualization/tree/recipes-containers/podman/podman_git.bb#n20

@vrothberg
Member

@RobertBerger, you seem to be on Podman 3.4 which is the latest version. We can't get any fresher :-)

@RobertBerger
Author

@vrothberg OK cool. I am currently baking something else, but as soon as I find some time I'll build an image with this and we can give it a try ;)

@vrothberg
Member

Thank you, @RobertBerger! I'd love to put our heads together on this issue and collaborate.

@RobertBerger
Author

@vrothberg Thanks! This seems to be a pretty common issue but, as far as I know, it only has "not so open" solutions or rather complicated ones.

@vrothberg
Member

Heads up: I am converting the issue into a discussion.

containers locked and limited conversation to collaborators Nov 17, 2021
