Bug 1927047: Handling packet sizes greater than pod MTU #559
openshift-merge-robot merged 1 commit into openshift:master
With OVN-Kubernetes, the pod MTU is set to 100 bytes less than the physical network MTU. This becomes a problem when something outside the cluster accesses a pod directly (local or shared gateway mode) or through a service (shared gateway mode) with a packet larger than the pod MTU. OVS drops the packet because it will not re-fragment to the lower pod MTU when sending the packet to the pod.

This change solves the problem by checking the size of packets destined for OVN. Packets larger than pod MTU + 12 bytes (eth overhead) are forwarded to the kernel. The kernel has a route to the pods via the mp0 interface, whose MTU is 100 bytes below the physical network MTU, so the kernel automatically sends ICMP needs frag back to the client.

For services, we add back the PREROUTING iptables rule that DNATs nodeport traffic to the cluster IP service. Incoming nodeport packets that are too large are forwarded into the kernel and then DNAT'ed to the cluster IP. After DNAT, the kernel looks up the routing table and finds that the cluster IP route carries the same reduced MTU (physical MTU minus 100), which triggers the kernel to send ICMP needs frag back to the client as well.

Signed-off-by: Tim Rozet <trozet@redhat.com>
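As a rough illustration of the size check described above, here is a minimal Python sketch. It is not the code from this PR: the function names `pod_mtu` and `next_hop` are hypothetical, and the only values taken from the description are the 100-byte pod MTU offset and the 12-byte overhead. The 1500-byte physical MTU in the usage loop is an assumed example value.

```python
POD_MTU_OFFSET = 100  # pod MTU = physical MTU - 100, per the description
ETH_OVERHEAD = 12     # overhead figure quoted in the description


def pod_mtu(physical_mtu: int) -> int:
    """MTU assigned to pods for a given physical network MTU."""
    return physical_mtu - POD_MTU_OFFSET


def next_hop(packet_len: int, physical_mtu: int) -> str:
    """Decide where a packet destined towards OVN is sent.

    Hypothetical model of the check: packets above the threshold go to
    the kernel (which replies with ICMP needs frag); the rest go to OVN.
    """
    threshold = pod_mtu(physical_mtu) + ETH_OVERHEAD
    return "kernel" if packet_len > threshold else "ovn"


if __name__ == "__main__":
    # Assumed example: a 1500-byte physical MTU gives a 1400-byte pod MTU.
    for size in (1400, 1412, 1413, 1500):
        print(size, next_hop(size, physical_mtu=1500))
```

With these assumptions the threshold is 1412 bytes, so a packet of exactly 1412 bytes still goes to OVN and anything larger is handed to the kernel.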
@trozet: This pull request references Bugzilla bug 1927047, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
No GitHub users were found matching the public email listed for the QA contact in Bugzilla (anusaxen@redhat.com), skipping review request.
/lgtm

[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: astoycos, trozet

/retest Please review the full test history for this PR and help us cut down flakes.
/override ci/prow/e2e-metal-ipi-ovn-dualstack

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-metal-ipi-ovn-dualstack
/override ci/prow/e2e-metal-ipi-ovn-dualstack

@trozet: Overrode contexts on behalf of trozet: ci/prow/e2e-metal-ipi-ovn-dualstack
@trozet: An error was encountered checking the state of a related pull request at ovn-kubernetes/ovn-kubernetes#2225 for bug 1927047 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.
Full error message:
Get "http://ghproxy/repos/ovn-org/ovn-kubernetes/pulls/2225": failed to get installation id for org ovn-org: the github app is not installed in organization ovn-org
Please contact an administrator to resolve this issue.