Set NVMe IO timeouts according to AWS recommendations #2820
Issue number: Closes #2758
Description of changes: Amazon EC2 instances based on the Nitro hypervisor expose EBS volumes via NVMe. That does not mean that EBS volumes behave like a directly attached disk in every way: responses to IO requests may be delayed for reasons outside of the controller's influence. Hence, the AWS documentation recommends disabling IO request timeouts for EBS volumes attached via NVMe.
On Linux, the IO request timeout cannot be disabled completely. Setting it to the maximum value of an unsigned 32-bit integer gets us close enough: since the timeout is expected to be given in milliseconds, requests will time out after a little more than 49 days have passed.
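The "a little more than 49 days" figure can be checked with a few lines of arithmetic. This is only an illustrative Python sketch; the names `U32_MAX`, `MS_PER_DAY` and `timeout_days` are made up for the example and do not appear in the change itself.

```python
# Largest value of an unsigned 32-bit integer, used here as the timeout.
U32_MAX = 2**32 - 1  # 4294967295

# Milliseconds in one day.
MS_PER_DAY = 24 * 60 * 60 * 1000


def timeout_days(timeout_ms: int) -> float:
    """Convert a timeout given in milliseconds into days."""
    return timeout_ms / MS_PER_DAY


print(f"{U32_MAX} ms is about {timeout_days(U32_MAX):.2f} days")
# prints "4294967295 ms is about 49.71 days"
```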
Note that this configures timeouts via the block layer. At the time of writing, the AWS documentation recommends setting the timeout as a module parameter for the NVMe driver (where the expected unit is seconds, not milliseconds). However, since the switch to blk-mq, that module parameter value goes unused for regular IO requests during normal operation.
Testing done: I built the aws-k8s-1.25 variant for x86_64 and launched a c5d.large EC2 instance using it. Alongside its two EBS volumes, this instance type also offers a directly attached NVMe disk (ephemeral storage). I verified that the increased IO timeout only applies to EBS volumes:
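One way to run a check of this kind is to read each NVMe block device's `queue/io_timeout` attribute from sysfs next to the model string of its controller. The following Python sketch illustrates the idea; it is not the set of commands used for this PR, and it shows no output from the test instance. The helper name `nvme_io_timeouts` is hypothetical, and the sketch assumes that the `device/model` attribute is enough to tell EBS volumes apart from instance storage.

```python
from pathlib import Path


def nvme_io_timeouts(sysfs_block="/sys/block"):
    """Return {device name: (model, timeout in ms)} for NVMe block devices.

    Hypothetical helper: it only reads sysfs attributes and changes nothing.
    """
    result = {}
    for dev in sorted(Path(sysfs_block).glob("nvme*")):
        timeout_file = dev / "queue" / "io_timeout"  # block-layer timeout, in ms
        model_file = dev / "device" / "model"        # controller model string
        if not timeout_file.is_file():
            continue
        model = model_file.read_text().strip() if model_file.is_file() else "unknown"
        result[dev.name] = (model, int(timeout_file.read_text().strip()))
    return result


if __name__ == "__main__":
    for name, (model, timeout_ms) in nvme_io_timeouts().items():
        print(f"{name}: {model}: {timeout_ms} ms")
```

Run on an instance, this lists one line per NVMe device, so a device whose timeout was left at the kernel default stands out from those matched by the udev rules.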
Eagle-eyed reviewers will note that the timeout reported for EBS volumes differs slightly from the one specified in the udev rules. This is because the timeout value is clamped to the range of a signed 32-bit integer before being converted into a unit based on timer ticks, and is then treated as the longest possible timeout in that new unit.
Reviewer notes:
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.