Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cross-Node service access when AntreaProxy is disabled #2318

Merged
merged 1 commit into from
Jun 29, 2021

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Jun 28, 2021

When AntreaProxy is disabled, if the reply traffic of a connection that
has been processed by iptables/ipvs rules (of kube-proxy) is received
from the tunnel interface, its destination MAC would be rewritten twice
because it would have both gatewayCTMark and macRewriteMark set. The
latter rewriting would overwrite the former one and would cause the
packets to be delivered to the destination Pod directly without doing
reversed NAT in the host netns.

This patch fixes it by making the pipeline rewrite the destination MAC
as most once. It moves the gatewayCTMark related MAC rewriting flow to
l3ForwardingTable, to make L3 forwarding decision in same table
uniformly. It also simplifies the two gatewayCTMark related flows by
matching the direction of traffic which ensures the flow doesn't apply
to traffic from the gateway interface.

e2e tests for cross-Node service access are added to cover this scenario
with AntreaProxy disabled.

Fixes #2319

Need to backport to 0.13, 1.0, and 1.1.

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-all

@codecov-commenter
Copy link

codecov-commenter commented Jun 28, 2021

Codecov Report

Merging #2318 (c637f57) into main (2d19196) will increase coverage by 3.42%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2318      +/-   ##
==========================================
+ Coverage   62.04%   65.47%   +3.42%     
==========================================
  Files         281      281              
  Lines       21746    21735      -11     
==========================================
+ Hits        13493    14231     +738     
+ Misses       6849     6057     -792     
- Partials     1404     1447      +43     
Flag Coverage Δ
e2e-tests 55.57% <100.00%> (?)
kind-e2e-tests 52.41% <100.00%> (-0.01%) ⬇️
unit-tests 41.63% <0.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/openflow/client.go 68.22% <ø> (+10.24%) ⬆️
pkg/agent/openflow/pipeline.go 81.86% <100.00%> (+9.42%) ⬆️
pkg/agent/flowexporter/connections/connections.go 80.48% <0.00%> (-7.32%) ⬇️
pkg/agent/flowexporter/exporter/exporter.go 78.89% <0.00%> (-0.69%) ⬇️
pkg/ovs/openflow/ofctrl_bridge.go 50.00% <0.00%> (+0.34%) ⬆️
pkg/agent/openflow/network_policy.go 76.41% <0.00%> (+0.59%) ⬆️
...ntroller/networkpolicy/networkpolicy_controller.go 70.09% <0.00%> (+0.64%) ⬆️
pkg/agent/cniserver/pod_configuration.go 54.68% <0.00%> (+0.78%) ⬆️
pkg/controller/egress/store/egressgroup.go 1.38% <0.00%> (+1.38%) ⬆️
pkg/ovs/openflow/ofctrl_flow.go 47.36% <0.00%> (+1.75%) ⬆️
... and 44 more

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-all

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-whole-conformance

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit description:

It moves the gatewayCTMark related mac rewritten flow to
l3ForwardingTable

"mac rewritten flow" -> "MAC rewriting flow" or "MAC rewrite flow".

// kube-proxy is used). In this case the destination IP address of the reply traffic is the Pod which initiated the
// connection to the Service (no SNAT). We need to make sure that these packets are sent back through the gateway
// so that the source IP can be rewritten (Service backend IP -> Service ClusterIP).
// 2) when hair-pinning is involved, i.e. for connections between 2 local Pods belonging to this Node and for which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for connections between 2 local Pods belonging to this Node and for which NAT Bis performed"

How about: "connections between 2 local Pods on the Node, for which NAT is performed"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will update after running all CI tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the first sentence is almost same as the one in ovs-pipeline, unified them. Not sure if you mean removing the leading "for" or not. The latest version is "when hair-pinning is involved, i.e. for connections between 2 local Pods, for which NAT is performed." Let me know if it works for you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the leading "for", but we could keep it too.

@jianjuns
Copy link
Contributor

@tnqn : question - is it a regression (hope it was not introduced by my SNAT changes..)? Otherwise, we did not find it earlier.

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

@tnqn : question - is it a regression (hope it was not introduced by my SNAT changes..)? Otherwise, we did not find it earlier.

It was a regression introduced since 0.13.0. I described detail and why it was not caught by CI tests in #2319. I guess it's related to 53e407e.

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-all

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-ipv6-all

@tnqn
Copy link
Member Author

tnqn commented Jun 28, 2021

/test-all-ipv6

antoninbas
antoninbas previously approved these changes Jun 28, 2021
jianjuns
jianjuns previously approved these changes Jun 29, 2021
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you revise "gatewayCTMark related mac rewritten flow" in the commit message to "gatewayCTMark related MAC rewrite flow" before merging?

When AntreaProxy is disabled, if the reply traffic of a connection that
has been processed by iptables/ipvs rules (of kube-proxy) is received
from the tunnel interface, its destination MAC would be rewritten twice
because it would have both gatewayCTMark and macRewriteMark set. The
latter rewriting would overwrite the former one and would cause the
packets to be delivered to the destination Pod directly without doing
reversed NAT in the host netns.

This patch fixes it by making the pipeline rewrite the destination MAC
as most once. It moves the gatewayCTMark related MAC rewriting flow to
l3ForwardingTable, to make L3 forwarding decision in same table
uniformly. It also simplifies the two gatewayCTMark related flows by
matching the direction of traffic which ensures the flow doesn't apply
to traffic from the gateway interface.

Signed-off-by: Quan Tian <[email protected]>
@tnqn
Copy link
Member Author

tnqn commented Jun 29, 2021

Would you revise "gatewayCTMark related mac rewritten flow" in the commit message to "gatewayCTMark related MAC rewrite flow" before merging?

Removed leading "for" and corrected the commit message, thanks for reminding.

@tnqn
Copy link
Member Author

tnqn commented Jun 29, 2021

/test-all

@tnqn
Copy link
Member Author

tnqn commented Jun 29, 2021

/test-e2e

@tnqn
Copy link
Member Author

tnqn commented Jun 29, 2021

/skip-e2e

e2e once succeeded and the latest failure was for "TestFlowAggregator" which has been tracked in #2283

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot access ClusterIP service if the endpoint is on another Node when AntreaProxy is disabled
4 participants