tox-bootstrapd goes into an infinite loop #2332

Open
emdee-is opened this issue Sep 26, 2022 · 9 comments
Labels
bug Bug fix for the user, not a fix to a build script network Network P1 High priority security Security

@emdee-is

emdee-is commented Sep 26, 2022

With LOG=TRACE I can see that tox-bootstrapd goes into an infinite loop if you give it a tcp_port packet shorter than the expected 128 bytes.

If you compile and run other/bootstrap_daemon/src/tox-bootstrapd.c, and in bootstrap_node_info.py change

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

to

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

and run it against the BS daemon, it fills the TRACE logs with:

TCP_server.c:1055(do_incoming) handling incoming TCP connection 0
TCP_common.c:203(read_TCP_packet) recv buffer has 82 bytes, but requested 128 bytes
TCP_server.c:375(read_connection_handshake) connection handshake is not ready yet

I'm guessing that the cause is network.c#524, where the MSG_NOSIGNAL makes it loop forever, even after the client has died. So you also get lots of TCP sockets left dangling in CLOSE_WAIT.
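
For context on the CLOSE_WAIT observation: a socket sits in CLOSE_WAIT when the peer has closed the connection and the local application has not yet closed its own end. The sketch below is not toxcore's code; it is a hypothetical Python illustration (the name `read_handshake_step` and the return values are made up here) of a non-blocking handshake reader that treats an empty `recv` as "peer is gone", so the caller can close the socket instead of retrying forever.

```python
import socket

HANDSHAKE_SIZE = 128  # size the TRACE log above says is requested


def read_handshake_step(conn, buf, want=HANDSHAKE_SIZE):
    """Do one non-blocking read step towards a `want`-byte handshake.

    `conn` must be a non-blocking socket and `buf` a bytearray that the
    caller keeps between calls. Returns "ready", "waiting" or "closed".
    """
    if len(buf) >= want:
        return "ready"
    try:
        chunk = conn.recv(want - len(buf))
    except BlockingIOError:
        # Nothing to read right now; the peer may still send more.
        return "waiting"
    if chunk == b"":
        # recv() returning no bytes means the peer closed its end.
        # The caller should close `conn` now, otherwise the socket
        # stays in CLOSE_WAIT.
        return "closed"
    buf.extend(chunk)
    return "ready" if len(buf) >= want else "waiting"
```

A caller would keep one `bytearray` per connection, call this on each iteration, and call `conn.close()` as soon as it sees `"closed"`.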

@Green-Sky
Member

Green-Sky commented Sep 27, 2022

I did some testing on the suspicion that file descriptor starvation might occur.
After a little testing it did occur, but only with ulimit -n set to 256 or lower, since my system seems to auto-close extra connections beyond that.
Additionally, most bootstrap nodes will be running with the recommended 32k fd limit anyway (5530e41).
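
For anyone repeating this test, the limit that actually applies to a process can be read from inside Python as well as with `ulimit -n`. A minimal sketch, assuming a Unix-like system (the `resource` module is not available on Windows); the helper name `fd_limits` is made up for illustration:

```python
import resource


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = fd_limits()
print(f"soft fd limit: {soft}, hard fd limit: {hard}")
```

The soft limit is the one a process runs into first; it corresponds to the value `ulimit -n` reports in the shell that started the process.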

related note from the issue:

So you also get lots of tcp sockets left dangling in CLOSE_WAIT.

@emdee-is
Author

emdee-is commented Sep 27, 2022

The ulimit -n on the machine I was testing on was 1048576

I would tag this with Security, as it is at least a DoS attack on bootstrapd.

@emdee-is
Author

I'm guessing here, but could this be related to the similar, hard-to-reproduce TCP error in #2352?

@emdee-is
Author

I did see dangling sockets in CLOSE_WAIT.

"tcp sockets that are left open and not closed only cause a little extra cpu and a file descriptor "leak"" except that if your file limit is small it could bring down the machine. Some Unixes have a limit of 1024!
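
To put a number on the dangling sockets, the CLOSE_WAIT count can be read from the kernel's socket table. This is a Linux-only sketch (it parses `/proc/net/tcp` and `/proc/net/tcp6`, where state `08` is CLOSE_WAIT) and it counts sockets for the whole host, not only those owned by tox-bootstrapd. The function names are made up for illustration.

```python
CLOSE_WAIT = "08"  # Linux state code for CLOSE_WAIT in /proc/net/tcp


def count_close_wait(proc_net_tcp_text):
    """Count CLOSE_WAIT entries in the text of /proc/net/tcp or /proc/net/tcp6."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == CLOSE_WAIT:
            count += 1
    return count


def count_close_wait_on_host():
    """Sum CLOSE_WAIT sockets over IPv4 and IPv6 on this machine."""
    total = 0
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as f:
                total += count_close_wait(f.read())
        except OSError:
            # File missing (non-Linux system, or IPv6 disabled): skip it.
            pass
    return total
```

Calling `count_close_wait_on_host()` before and after a probe run shows whether the number keeps growing.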

@iphydf iphydf added this to the v0.2.x milestone Nov 13, 2023
@emdee-is
Author

emdee-is commented Dec 6, 2023

It would be nice to have a test for this in the testsuite. It's easy enough to create a tcp_port packet shorter than the expected 128 bytes in Python and send it to tox-bootstrapd to see if it goes into an infinite loop.
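
A rough sketch of such a probe, under stated assumptions: the payload is zero-filled filler rather than a valid Tox packet, the default of 82 bytes just mirrors the TRACE log at the top of this issue, and `send_short_packet` is a made-up name. It does not by itself detect the loop; that still means watching the daemon's log, CPU use, or socket states afterwards.

```python
import socket

HANDSHAKE_SIZE = 128  # size the TRACE log says the server requests


def send_short_packet(host, port, size=82, timeout=5.0):
    """Connect over TCP, send `size` filler bytes, then close the connection.

    Returns the number of bytes sent. `size` must be shorter than the
    128-byte handshake so that the server is left waiting for the rest.
    """
    if not 0 < size < HANDSHAKE_SIZE:
        raise ValueError("size must be between 1 and 127")
    payload = bytes(size)  # zero-filled filler, not a real handshake
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
    # Leaving the `with` block closes the socket, i.e. the client "dies".
    return len(payload)
```

For example, `send_short_packet("127.0.0.1", 33445)` against a locally running daemon; the port here is an assumption and has to match one of the TCP relay ports in the daemon's configuration.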

@iphydf iphydf added bug Bug fix for the user, not a fix to a build script P1 High priority network Network security Security labels Dec 7, 2023
@iphydf
Member

iphydf commented Dec 7, 2023

Agreed. Can you write the python script? I can hook it up to CI for automated tests.

@emdee-is
Author

emdee-is commented Feb 1, 2024

@iphydf I'd be happy to write the script, but a quick and dirty one is the one-line change mentioned in #2332 (comment) to c-toxcore/other/fun/bootstrap_node_info.py.
It just does the UDP probe over TCP, and the daemon goes into the infinite loop.

I assume that means any bsdaemon is DOSable.

@Green-Sky if sockets are left dangling it may always exceed whatever the ulimit is - it's just a matter of time.

@emdee-is
Author

emdee-is commented Feb 1, 2024

@iphydf to put this in context: if you make the one-line change in bootstrap_node_info.py to trigger and confirm this condition, I'm sure you'll find the cause soon enough. If it gets fixed, AND #2331 is done so I can have a simple way of getting a version response back over TCP, then I can write a fancier version of bootstrap_node_info.py that works over TCP, including behind Tor.

These are both needed if you are "interested in making tox in tor (e.g. hidden node) work." #2331 (comment) which I am, and the fancier script could go in the testsuite.

If you trigger and confirm this condition I'd be interested to know what your thoughts are on why the current tests didn't pick it up. In another issue I came to the conclusion that the current proxy test is a noop. #2469 (comment) Now the IPv6 issue is solved, the current proxy test could never work behind a firewall because the ctest would not have access to clearnet to run its proxy.

@emdee-is
Author

emdee-is commented Feb 3, 2024

@iphydf If you are "interested in making tox in tor (e.g. hidden node) work."
you'll also need to break the hardcoded BSnodes out of ctest if you want to use ctest over Tor #2467

Or you'll need a test runner that is proxy-aware with runtime-settable timeouts: https://git.plastiras.org/emdee/toxygen_wrapper/src/branch/main/src/tox_wrapper/tests/tests_wrapper.py Testing over Tor is a different beast: you can have order-of-magnitude timing changes from week to week.
