
Why does total throughput increase when the delay on one of the paths increases? #381

Closed
mrconter1 opened this issue Mar 29, 2023 · 8 comments

@mrconter1

Hi!

So I have the following topology in Mininet:

[image: Mininet topology]

You find the whole setup here.

Right now I am running experiments comparing single path throughput (mptcp disabled and h1-eth1 disabled) with multipath throughput (where I vary the parameters on both paths):

[image: table of throughput results]

The colored cell values are the percentage throughput difference between single-path mode and multipath mode. 100% means equal throughput and 125%, for example, means that MPTCPv1 was 25% better.

What I am wondering, though, is why the throughput with MPTCPv1 sometimes seems to improve when the delay increases. I have selected one example of this in the image. The vertical axis shows the current parameters for h1-eth0 and the horizontal axis the current parameters for h1-eth1.

h1-eth0 (vertical axis)            | h1-eth1 at 2 Mbps (horizontal axis), delay:
Transfer Size | Mbps | Delay (ms)  |  1 ms | 10 ms | 25 ms | 50 ms | 100 ms | 300 ms
1             | 100  | 10          |  15.7 |  26.6 |  99.1 |  99.3 |   99.2 |   99.0

So what can explain this increase? My intuition says that the performance should become worse as the delay increases on the other path. Sample size is 5 runs per cell. Perhaps something is wrong with my setup? Otherwise all values seem reasonable.

Does anyone here have any ideas?

Regards, Rasmus

@mrconter1 (Author)

Hm.. I have one potential explanation for this. Could it be that MPTCPv1 prioritizes the path with the smallest RTT (no matter what)? The problem seems to stop when the 2 Mbps path has an RTT larger than the 100 Mbps path's...

@matttbe (Member)

matttbe commented Mar 29, 2023

Hi,

Thank you for sharing the tests you are doing.

Before replying to your questions about performance, please note that some work is needed on the scheduler side to improve some corner cases, see #350. (Any help is welcome.)

Also, just to avoid confusion, MPTCPv1 is about the protocol, not the implementation: https://github.com/multipath-tcp/mptcp_net-next/wiki#upstream-vs-out-of-tree-implementations

So I have the following topology in Mininet:

  • Which kernel version are you using?
  • What's the default route? (via eth0 I suppose)
  • What TCP buffer sizes are you using? (net.ipv4.tcp_*mem sysctls; see the sketch after this list)
  • What kind of transfer is done? (which app (+ parameters), and from/to whom)
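A sketch of how this information could be collected on h1, assuming standard uname, iproute2 and sysctl tooling:

uname -r                                                        # kernel version
ip route show default                                           # default route
sysctl net.ipv4.tcp_mem net.ipv4.tcp_rmem net.ipv4.tcp_wmem     # TCP buffer sizes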

The colored cell values are the percentage throughput difference between single-path mode and multipath mode. 100% means equal throughput and 125%, for example, means that MPTCPv1 was 25% better.

On the image you shared, all numbers are < 101%: so in your setup, MPTCP on multiple paths is never better than TCP on a single path?

Hm.. I have one potential explanation for this. Could it be that MPTCPv1 prioritizes the path with the smallest RTT (no matter what)? The problem seems to stop when the 2 Mbps path has an RTT larger than the 100 Mbps path's...

Yes, the current packet scheduler is likely impacted by paths with a low RTT but also a lower bandwidth. It is supposed to counter that, but some optimisations are missing, probably linked to #332 (and #345), especially for small file transfers I guess.

@matttbe (Member)

matttbe commented Mar 29, 2023

I forgot to say that, to understand such issues, analysing packet traces from at least the sender will be needed (and might take a bit of time).
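For reference, a minimal capture sketch on the sender (interface names are taken from the topology above; snaplen and file names are just examples):

tcpdump -i h1-eth0 -s 150 -w h1-eth0.pcap &    # capture packet headers on the 100 Mbps path
tcpdump -i h1-eth1 -s 150 -w h1-eth1.pcap &    # capture packet headers on the 2 Mbps path
# run the transfer, then stop tcpdump and inspect the traces (e.g. with tshark/Wireshark)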

@mrconter1 (Author)

mrconter1 commented Mar 29, 2023

Thank you for the quick reply!

Which kernel version are you using?

I am using 6.3.0-rc2+.

What's the default route? (via eth0 I suppose)

I don't really know... I didn't know you could set a default route. I just create subflows on both of the interfaces like this:

[image: subflow creation commands]
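(The screenshot is not reproduced here. For reference, a typical iproute2 way of creating such subflow endpoints is sketched below; the address, device and limits are assumptions, not values read from the image.)

ip mptcp limits set subflow 2 add_addr_accepted 2     # allow additional subflows
ip mptcp endpoint add 10.0.1.1 dev h1-eth1 subflow    # hypothetical address on the second interface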

If you mean which route single-path TCP uses, it is h1-eth0, yes. In this case single-path TCP should be 100 Mbps.

What TCP buffers sizes are you using?

net.ipv4.tcp_mem = 85887	114519	171774

What kind of transfer is done (app (+ parameters) and from/to who)

The client is asking the server for N bytes (in this case 1 Mb) and the server replies with that many bytes. It is an ordinary TCP/MPTCP connection as far as I know. You can find the client and server scripts here. I set the receive window size parameter in the client:

client_socket.recv(15 * 1024)

and also set some system parameters:

[image: sysctl settings]

On the image you shared, all numbers are < 101%: so in your setup, MPTCP on multiple paths is never better than TCP on a single path?

No, you are just seeing part of the results. The improvement becomes much larger (several hundred percent) when the file transfer sizes are bigger (> 10 Mb) and when the h1-eth1 path has a larger bandwidth (it's only 2 Mbps and 5 Mbps now). In the specific image I sent before, the "maximum" improvement would only be around 105%.

Yes, the current packet scheduler is likely impacted by paths with a low RTT but also a lower bandwidth. It is supposed to counter that, but some optimisations are missing, probably linked to #332 (and #345), especially for small file transfers I guess. To understand such issues, analysing packet traces from at least the sender will be needed (and might take a bit of time).

I understand.


I ran some more experiments testing that area of the results specifically (but adding more delay steps):

[image: table of additional throughput results]

Those results seem to indicate that the total throughput gradually improves as the h1-eth1 delay (~RTT/2) approaches the delay on h1-eth0 (10 ms, 20 ms and 10 ms in this case). After a while the transfer seems to move almost completely over to the h1-eth0 path, because the delay on h1-eth1 becomes too large compared to h1-eth0, which means that the throughput increases drastically.
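For reference, one way to check which path actually carries the traffic (a sketch, assuming iproute2 with MPTCP support is available inside the Mininet hosts):

ip -s link show h1-eth0      # per-interface byte/packet counters, before and after a run
ip -s link show h1-eth1
nstat -az | grep -i MPTcp    # MPTCP-level counters
ip mptcp monitor             # live MPTCP subflow/address events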

Perhaps that makes the problem a bit clearer? Do you have any comment?

The most important question right now is whether this has anything to do with my setup or whether it is due to the protocol. If it is just a consequence of how the current implementation works, I don't think it is that big of a deal. But I will have to see what my supervisor says.

@mrconter1 (Author)

mrconter1 commented Mar 29, 2023

Hm... But one thing I still don't understand is why the:

  • h1-eth0 path
    • 100 Mbps
    • 10 ms delay
  • h1-eth1 path
    • 2 Mbps
    • 20 ms delay

performance (24.2%) would be better than the:

  • h1-eth0 path
    • 100 Mbps
    • 10 ms delay
  • h1-eth1 path
    • 2 Mbps
    • 10 ms delay

performance (14.8%).

It's a bit weird as well; the delay difference between the paths is larger in the 24.2% example...

@matttbe (Member)

matttbe commented Mar 30, 2023

Thank you for the quick reply!

Which kernel version are you using?

I am using 6.3.0-rc2+.

Good, the dev version built from our export branch then?

What's the default route? (via eth0 I suppose)

I don't really know... I didn't know you could set a default route. I just create subflows on both of the interfaces like this:

The default route is not linked to MPTCP but to the kernel routing rules.

ip route get 10.0.2.2      # which route (and interface) is used to reach this destination
ip route show default      # the configured default route

If you mean which route single-path TCP uses, it is h1-eth0, yes. In this case single-path TCP should be 100 Mbps.

I guess it is h1-eth0, the first interface you created, if you didn't explicitly change the routes.

What TCP buffers sizes are you using?

net.ipv4.tcp_mem = 85887	114519	171774

Can you also share net.ipv4.tcp_rmem and net.ipv4.tcp_wmem? (Even if I guess you didn't modify them.)

In short:

  • tcp_mem can be increased if you need to handle a lot of connections in parallel.
  • tcp_[rw]mem can be increased if the buffers are not big enough for the BDP. If you work with high delay and/or high throughput, increasing them can be useful to increase the speed (a worked example is sketched below).
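As a worked example (the numbers are an illustration, not values from this thread): at 100 Mbps with a 300 ms RTT, the BDP is 100e6 / 8 * 0.3 ≈ 3.75 MB, so the maximum rmem/wmem values should be at least that for a single path to stay busy:

sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem           # current min/default/max, in bytes
sysctl -w net.ipv4.tcp_rmem="4096 131072 8388608"    # hypothetical larger maximum (8 MB)
sysctl -w net.ipv4.tcp_wmem="4096 16384 8388608"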

What kind of transfer is done (app (+ parameters) and from/to who)

The client is asking the server for N bytes (in this case 1 Mb) and the server replies with that many bytes. It is an ordinary TCP/MPTCP connection as far as I know. You can find the client and server scripts here. I set the receive window size parameter in the client:

client_socket.recv(15 * 1024)

That's the userspace buffer you set here, right? You don't change the socket buffers (setsockopt(SO_SNDBUF / SO_RCVBUF)), I hope? (I didn't see that when looking quickly.)

and also set some system parameters:

(please share code and not images)

Note that the LIA congestion control is not implemented (and not recommended in most cases anyway). Please use the same congestion control as for TCP to have a fair comparison.

(but I guess the default one is used as lia doesn't exist)
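For reference, a sketch of how to check and align the congestion control between the TCP and MPTCP runs (cubic is just the usual default, an assumption here):

sysctl net.ipv4.tcp_congestion_control              # algorithm currently in use
sysctl net.ipv4.tcp_available_congestion_control    # algorithms available on this kernel
sysctl -w net.ipv4.tcp_congestion_control=cubic     # use the same one for both the TCP and MPTCP runs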

On the image you shared, all numbers are < 101%: so in your setup, MPTCP on multiple paths is never better than TCP on a single path?

No, you are just seeing part of the results. The improvement becomes much larger (several hundred percent) when the file transfer sizes are bigger (> 10 Mb) and when the h1-eth1 path has a larger bandwidth (it's only 2 Mbps and 5 Mbps now). In the specific image I sent before, the "maximum" improvement would only be around 105%.

OK thanks. It would be nice if you could share the full results at the end. It could be useful to have a reference to see what's important to improve first.

Yes, the current packet scheduler is likely impacted by paths with a low RTT but also a lower bandwidth. It is supposed to counter that, but some optimisations are missing, probably linked to #332 (and #345), especially for small file transfers I guess. To understand such issues, analysing packet traces from at least the sender will be needed (and might take a bit of time).

I understand.

I ran some more experiments testing that area of the results specifically (but adding more delay steps):

[image: table of additional throughput results]

Those results seem to indicate that the total throughput gradually improves as the h1-eth1 delay (~RTT/2) approaches the delay on h1-eth0 (10 ms, 20 ms and 10 ms in this case). After a while the transfer seems to move almost completely over to the h1-eth0 path, because the delay on h1-eth1 becomes too large compared to h1-eth0, which means that the throughput increases drastically.

Perhaps that makes the problem a bit clearer? Do you have any comment?

Yes, it does. It clearly shows that the current packet scheduler is impacted by low-throughput paths with a different latency (it should not be). (A bit similar to #307.)

The most important question right now is whether this has anything to do with my setup or whether it is due to the protocol. If it is just a consequence of how the current implementation works, I don't think it is that big of a deal. But I will have to see what my supervisor says.

Clearly, there is room for improvement in the implementation; it is not an issue with the protocol. The out-of-tree implementation supported such cases and had been tweaked over the years to do so. We still need to apply these techniques (+ maybe some shared in scientific papers only) to the upstream implementation (we mainly need someone with time to do that, I think :) ).

@matttbe (Member)

matttbe commented Apr 5, 2023

Hi @mrconter1

I'm not sure what the situation is here: did I reply to all your questions?

I think the issues you saw will be fixed by tickets #332 (and #345).
Can I then close this ticket here?

@matttbe matttbe self-assigned this Apr 5, 2023
@matttbe (Member)

matttbe commented Jul 12, 2023

I suggest closing this old ticket. Feel free to re-open it and provide answers to my previous questions if you still have the issue.

@matttbe matttbe closed this as completed Jul 12, 2023