tcplbl3dsrha fail #502

Closed
nitzan-tz opened this issue Jan 21, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@nitzan-tz

Describe the bug
I am running the tcplbl3dsrha cicd scenario and it fails, but the tcplbl3dsr scenario works.

To Reproduce
Run the tcplbl3dsrha config.sh

Expected behavior
Hosts should be up and the VIP should answer

Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Kernel Version: 5.15.0-91
  • LoxiLB Version: v0.9.0

Additional context

  1. The validate script tries to reach ep1-3 (56.56.56.1...), but r1 doesn't have a route to them (only the llb has one).
  2. It looks like ICMP redirects have to be disabled on the hosts.
  3. The llb receives a SYN to port 8080, but I then see this packet caught in a routing loop.
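
A minimal sketch of point 2, assuming the scenario's containers are reachable through docker exec and are privileged enough to write sysctls. The container name llb1 and interface name vlan11 are taken from names used in this thread, and the helper functions run and disable_icmp_redirects are hypothetical. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Turn off sending and accepting ICMP redirects inside one container.
# $1 = container name (assumption: llb1), $2 = interface (assumption: vlan11)
disable_icmp_redirects() {
    container="$1"
    iface="$2"
    # send_redirects stays active if either "all" or the interface has it set,
    # so clear it in every scope rather than only in "all".
    for scope in all default "$iface"; do
        run docker exec "$container" sysctl -w "net.ipv4.conf.${scope}.send_redirects=0"
        run docker exec "$container" sysctl -w "net.ipv4.conf.${scope}.accept_redirects=0"
    done
}
```

Calling disable_icmp_redirects llb1 vlan11 prints six sysctl commands; setting DRY_RUN=0 would execute them. This only suppresses the redirect messages and does not by itself remove the routing loop.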

Thanks

Nitzan

@nitzan-tz nitzan-tz added the bug Something isn't working label Jan 21, 2024
UltraInstinct14 added a commit that referenced this issue Jan 23, 2024
PR : gh-502 L3-DSR mode issues fixes among others
@TrekkieCoder
Collaborator

tcplbl3dsrha was a work in progress, but the cicd scenario has been updated following this report. As you correctly pointed out, ep3 was not being set up properly. And yes, there was no need to check connectivity to ep1, ep2 and ep3, so the validation now skips those and checks VIP connectivity directly. Some code fixes were also needed to fully fix this scenario.

Please update to the latest images/scripts and give it a try. Thanks!

@nitzan-tz
Author

Hi,

I cloned the latest branch and deleted the docker image before running the config again.

root@554d7e3a98d5:/# loxicmd version 
v0.9.0 2024_01_23-main-3ecac9f

I still see a few issues:

  1. A ping from the user container to the VIP receives an ICMP redirect, and then there is a routing loop. Disabling ICMP redirects for the docker container might be a good idea:
root@554d7e3a98d5:/# tcpdump -nni vlan11  icmp 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan11, link-type EN10MB (Ethernet), capture size 262144 bytes
18:23:28.893097 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
18:23:28.893175 IP 11.11.11.2 > 1.1.1.1: ICMP redirect 20.20.20.1 to host 11.11.11.254, length 92
18:23:28.893183 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
18:23:28.893205 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
18:23:28.893225 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
18:23:28.893234 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
18:23:28.893244 IP 1.1.1.1 > 20.20.20.1: ICMP echo request, id 13, seq 1, length 64
  2. For application traffic there is a loop, as you can see in the attached image. It looks like loxilb doesn't pick up the packet.

image

  3. ECMP is not enabled on r1, so it has only one path. Adding "maximum-paths 4" solved it:
4144d1e7872d# sh ip route 20.20.20.1/32
Routing entry for 20.20.20.1/32
  Known via "bgp", distance 20, metric 0, best
  Last update 00:00:15 ago
  * 11.11.11.2, via vlan11

4144d1e7872d# conf t
4144d1e7872d(config)# router bgp 65001 
4144d1e7872d(config-router)# maximum-paths 4
4144d1e7872d(config-router)# end
4144d1e7872d# sh ip route 20.20.20.1/32
Routing entry for 20.20.20.1/32
  Known via "bgp", distance 20, metric 0, best
  Last update 00:00:02 ago
  * 11.11.11.2, via vlan11
  * 11.11.11.1, via vlan11
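
The interactive session above could also be applied non-interactively. This is a sketch only, assuming r1 is a docker container named r1 that runs FRR and provides vtysh; the helpers run and enable_bgp_multipath are hypothetical. With DRY_RUN=1 (the default) the command is only printed, and the printed line does not preserve shell quoting.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Enable BGP multipath on a router container through vtysh.
# $1 = container (assumption: r1), $2 = local AS number, $3 = maximum paths
enable_bgp_multipath() {
    container="$1"
    asn="$2"
    paths="$3"
    # Each -c passes one command to vtysh, mirroring the session above.
    run docker exec "$container" vtysh \
        -c "configure terminal" \
        -c "router bgp $asn" \
        -c "maximum-paths $paths"
}
```

Calling enable_bgp_multipath r1 65001 4 mirrors the commands typed above. The change is made to the running configuration only, so the scenario's router config would still need the same line to make it permanent.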

Thanks

Nitzan

@UltraInstinct14
Contributor

UltraInstinct14 commented Jan 24, 2024

By default, loxilb serves only the "VIP+ServicePort" combination; all other traffic is ignored. So a ping to 20.20.20.1 is routed via some default route, which creates the problem you mentioned. You can manually run "ip addr add 20.20.20.1/32 dev lo" on llb1 and llb2 and it should be fine.
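
The manual workaround can be scripted for both load balancers. This is a sketch, assuming llb1 and llb2 are docker container names that accept docker exec; the helpers run and add_vip_to_loopback are hypothetical. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Add the VIP as a /32 on the loopback of every listed container.
# $1 = VIP address, remaining arguments = container names
add_vip_to_loopback() {
    vip="$1"
    shift
    for container in "$@"; do
        run docker exec "$container" ip addr add "${vip}/32" dev lo
    done
}
```

Calling add_vip_to_loopback 20.20.20.1 llb1 llb2 prints one ip command per container; with DRY_RUN=0 it would apply them.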

@inhogog2
Collaborator

Hi @nitzan-tz,
As @UltraInstinct14 mentioned, traffic that matches no defined rule appears to have caused the loop.
We confirmed that the loop disappears if you either add an IP address for 20.20.20.1 or add an ICMP rule for 20.20.20.1, for example:
loxicmd create lb 20.20.20.1 --icmp --endpoints=56.56.56.1:1,57.57.57.1:1,58.58.58.1:1 --mode dsr --select hash

Thank you very much for your feedback. For traffic that matches no rule, we will consider disabling ICMP redirects in the next release.

@UltraInstinct14
Copy link
Contributor

We could add a mode where loxilb simply blackholes all untrusted traffic: if a TCP rule is available, only that traffic is allowed, and all other streams are blackholed.

@TrekkieCoder
Collaborator

The original issue is considered fixed. The suggestions will be taken up as enhancements in a future release.
