-
Notifications
You must be signed in to change notification settings - Fork 20
Fix leaks / update error handling #178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* modernise imports * introduce `raises` for most `async` functions * remove lots of spurious memory allocations and zeroing * check decompression max size before allocating memory (!) * simplify error handling when reading frames * avoid blocking connection for too long just to send errors * don't swallow original error when trying to send error to client
Better error handling: status-im/nim-websock#178
Better error handling: status-im/nim-websock#178
Better error handling: status-im/nim-websock#178
introduces a little error in the encoding too :)
48923b6 to
b0a9060
Compare
websock/http/server.nim
Outdated
| stream.close() | ||
| raise exc | ||
|
|
||
| await stream.readHttpRequest(server.headersTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! I have "accidentally" tested this particular change (while trying to fix another issue) and it makes nwaku fail with "too many files open" if I stress-test it locally with 1500+ garbage connections:
DBG 2025-11-13 19:38:40.349-03:00 About to accept incoming connection topics="libp2p switch" tid=197343 file=switch.nim:270
DBG 2025-11-13 19:38:40.349-03:00 Too many files opened topics="libp2p wstransport" tid=197343 file=wstransport.nim:332 description="[EMFILE] Too many open files in the process"
DBG 2025-11-13 19:38:40.349-03:00 Unable to get a connection topics="libp2p switch" tid=197343 file=switch.nim:291
The problem is the stream.close() on error being removed. And this is how I fixed it (should probably have been just stream.close()):
try:
trace "Got new request", isTls = server.secure
return await stream.readHttpRequest(server.headersTimeout)
except CatchableError as exc:
await stream.closeWait()
raise exc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, thanks! e73768d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have the setup to run that test on this branch btw? I wasn't looking for that issue specifically, but it looks like a good test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generating a dummy self-signed cert (self-signed.crt and self-signed.key via openssl req -x509 -newkey rsa:2048 -nodes -sha256 -days 365 -keyout self-signed.key -out self-signed.crt -subj "/CN=localhost") on the nwaku root and then simply running: ./build/wakunode2 --log-level=DEBUG --listen-address=0.0.0.0 --websocket-secure-support=true --websocket-port=8000 --websocket-secure-cert-path=./self-signed.crt --websocket-secure-key-path=./self-signed.key. That enables the nwaku secure websocket endpoint. Then I just hammer it with that stress-test I linked. Before make wakunode2 I just manually go to the vendor/nim-websock, fetch/pull and check out your branch to reproduce the issue. Then I put the fix above on top and the FD exhaustion goes away. Also, I just used nwaku master/head for this.
|
this warning is pre-existing - ie it's a problem with supporting multiple nim versions. it will need a more comprehensive change to fix |
| except CatchableError as exc: | ||
| # Can't hold up the accept loop | ||
| stream.close() | ||
| await stream.readHttpRequest(server.headersTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing handshakeTimeout with headersTimeout for the parsing timeout in nim-websock won't play well with current nim-libp2p since the HttpHeadersTimeout default is 120.seconds.
To override the 120-second default, nim-libp2p passes a 3-second "DefaultHeadersTimeout" handshakeTimeout to the HTTP server. (The handoff to nim-websock is actually here and here)
I think it would be safer if we fix nim-libp2p first, for example by making it send correct values for both headersTimeout and handshakeTimeout, fulfilling the current nim-websock contract. Then nim-websock can later ignore one of these values. (Or we can just go ahead and apply this fix here, then fix nim-libp2p. In any case, I just wanted to mention this)
I'm currently fixing a head-of-line blocking issue that will interact with these changes (I'll be basing my PR(s) off of this PR to minimize conflicts in any case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e2a1a1d - might as well deprecate it fully to avoid confusion as to how they work, it was somewhat subtle and redundant - one could argue that sendError should have a configurable timeout the way this PR shapes the code but I suspect there will be one more round of refactoring neede on this API anyway.
Better error handling: status-im/nim-websock#178
raisesfor mostasyncfunctions