Backport systemd fixes for udevd skipping events (and others) #2999
This doesn't appear to be a backport?
Changes look fine and testing looks good.
I'm reviewing on my phone, but this looks pretty good as far as I can tell.
The other thing to check is to verify that there are no fuzz warnings from rpm during the patch process, since that can indicate that some of the context didn't quite match.
Just exit 1 in %prep after patches are applied and inspect for warnings.
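A rough way to automate that inspection, assuming the build log carries GNU patch's hunk messages (which report inexact context matches as "with fuzz N"). The function name and the sample log are made up for illustration and are not part of this PR:

```python
import re

# GNU patch reports a hunk whose context did not match exactly as
# "Hunk #N succeeded at LINE with fuzz F".
FUZZ_RE = re.compile(r"^Hunk #(\d+) succeeded at (\d+) with fuzz (\d+)", re.MULTILINE)


def find_fuzz_warnings(build_log: str) -> list:
    """Return a (hunk, line, fuzz) tuple for every fuzzy hunk in the log text."""
    return [tuple(int(g) for g in m.groups()) for m in FUZZ_RE.finditer(build_log)]


# Hypothetical log excerpt, not taken from an actual build of this package.
sample_log = """\
patching file src/udev/udevd.c
Hunk #1 succeeded at 120 (offset 3 lines).
Hunk #2 succeeded at 245 with fuzz 2 (offset 3 lines).
"""

fuzzy = find_fuzz_warnings(sample_log)
```

An empty result would correspond to the "no warnings" outcome; an offset-only hunk is not counted, since an offset alone does not mean the context failed to match.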
packages/systemd/0010-udev-initialize-list-pointers-in-event_queue_assume_.patch
+ /* If this is a block device and the device is locked currently via the BSD advisory locks,
+ * someone else is using it exclusively. We don't run our udev rules now to not interfere.
+ * Instead of processing the event, we requeue the event and will try again after a delay.
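For readers unfamiliar with the mechanism in that comment, here is a minimal sketch of "check the BSD advisory lock, requeue if held", using Python's fcntl.flock on a temporary file as a stand-in for a block device. The helper names and the list-based queue are assumptions for illustration; the actual udev code is C and is structured differently.

```python
import fcntl
import os
import tempfile


def is_locked(path: str) -> bool:
    """Return True if some other open file description holds an exclusive flock() on path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # A non-blocking shared lock fails immediately if an exclusive lock is held.
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


def handle_event(path: str, requeue: list) -> str:
    """Hypothetical handler: process the event, or requeue it while the device is locked."""
    if is_locked(path):
        requeue.append(path)  # try again later instead of running rules now
        return "requeued"
    return "processed"


with tempfile.NamedTemporaryFile() as dev:  # stand-in for a block device node
    holder = os.open(dev.name, os.O_RDONLY)
    fcntl.flock(holder, fcntl.LOCK_EX)  # plays the role of another tool holding the lock
    pending = []
    first = handle_event(dev.name, pending)
    os.close(holder)  # closing the descriptor releases the lock
    second = handle_event(dev.name, pending)
```

The point of the sketch is that the event is kept for a later retry rather than dropped while the lock is held.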
Need to confirm via testing that this doesn't interfere with the previous fix to the fallback logic, since that relies on continued udev processing to generate the special block-device-typed UUID symlinks.
My recollection is that all the symlinks are created at the end, so we shouldn't see the case where the fallback unit starts execution as soon as the partuuid link exists, but before the typed link exists.
No warnings.
Push above merges the custom |
Push above adds comments to indicate which version of systemd the patches are backported from.
LGTM! Nice work!
Patch0007: 0007-udev-only-ignore-ENOENT-or-friends-which-suggest-the.patch
Patch0008: 0008-udev-split-worker_lock_block_device-into-two.patch
Patch0009: 0009-udev-assume-block-device-is-not-locked-when-a-new-ev.patch
# From v252:
Nice!
Looks good.
Push above addresses an issue where boot hangs for 90 seconds before proceeding when the fallback data partition is used and the preferred data partition is absent. This was due to I removed the dependency and also added measures to ensure
Testing
Built
Launched and cycled through 1000
Launched and cycled through 900
Built
Launched and cycled through 1000
Launched and cycled through 100
Launched 500
The local filesystem is expected to be ready before local-fs.target is reached.
Signed-off-by: Ben Cressey <[email protected]>
We don't need to wait for the 'repart' units, since this causes boot to hang until the repart units time out waiting for a potentially non-existent data partition. 'systemd-repart' and 'systemd-makefs' both take a lock on the block device before operating on it, so repart is guaranteed to finish before makefs can create the filesystem.
Co-authored-by: Erikson Tung <[email protected]>
Push above fixes up the commit messages to be more precise. |
Updates look good to me.
💸
We explicitly stop and mask any repart-data service that has a start job waiting on a non-existent data partition after 'makefs' finishes successfully. This prevents awkward start job timeout messages from being printed to the console.
Backports the following, with some minor edits:
* systemd/systemd@2d40f02
* systemd/systemd#22717
* systemd/systemd@400e3d2
* systemd/systemd@4f294ff
* systemd/systemd@c02fb80
This fixes an issue with kernel uevents getting skipped by udev when the data partition block device is locked by 'systemd-makefs' or 'systemd-repart'.
Push above adds masking in addition to stopping the |
Issue number:
Resolves #2980 (comment)
Description of changes:
Testing done:
Built an aws-k8s-1.22 arm64 AMI, then launched and cycled through 2000 a1.medium instances. All boots were successful and nodes joined the cluster successfully.
Launched and cycled through 400 m6g.medium instances. All boots were successful and nodes joined the cluster successfully.
Launched and cycled through 400 m6g.large instances. All boots were successful and nodes joined the cluster successfully.
Built an aws-k8s-1.22 x86_64 AMI, then launched and cycled through 100 m1.small instances. All boots were successful and nodes joined the cluster successfully.
Launched and cycled through 400 m5.large instances. All boots were successful and nodes joined the cluster successfully.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.