properly close connections #128
Conversation
Basically, for streams that have a parent stream, close should mean roughly "request the underlying stream to close"; the underlying stream then closes and notifies its child streams by firing its close event, "rippling" the close back up the chain, after which the stream cleans up its resources. This way, no matter what caused the stream to close, the cleanup runs at the right time. So there are two data flows here: from parent to child and from child to parent. What's confusing is that inside the parent-to-child handler, the child-to-parent path is invoked, creating a loop. This loop complicates matters, because now you have to introduce state (which might take on unexpected values in out-of-order async execution) to handle it.
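To make the two flows and the loop concrete, here is a minimal sketch of the scheme being described; Stream, fireClose and requestClose are made-up names for illustration, not the actual nim-libp2p API:

type
  Stream = ref object
    name: string
    closed: bool
    parent: Stream
    children: seq[Stream]

proc cleanup(s: Stream) =
  echo s.name, ": cleaning up resources"

proc requestClose(s: Stream)     # forward declaration

proc fireClose(s: Stream) =
  # parent-to-child: the stream actually closed; fire its close event
  if s.closed: return            # the state flag that breaks the loop
  s.closed = true
  for child in s.children:
    child.requestClose()         # the child-to-parent path is invoked
                                 # inside the parent-to-child handler
  s.cleanup()                    # cleanup runs once, whatever the cause

proc requestClose(s: Stream) =
  # child-to-parent: "request the underlying stream to close"
  s.fireClose()                  # close this stream and its children
  if s.parent != nil:
    s.parent.requestClose()      # then ripple through the parent chain

Without the closed flag, fireClose and requestClose would recurse into each other forever; the flag breaks the loop, but it is exactly the kind of state that can take on unexpected values under out-of-order async execution.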
Yes,
It looks like that, however the way
Ideally yes, but I'm not sure that's possible. Basically, there is only one place where we want to notify the child/parent streams that the stream is being closed, which is when the close actually happens. Notifying of the close and then disposing of resources separately actually complicates matters, because now you have to handle an essentially atomic operation in several places. I'm not arguing that this is perfect, but right now it seems that tweaking it won't necessarily solve these issues; this requires a refactor that, as you say, will simplify all of this machinery (wip & wip2), and it's next on my priority list. The biggest culprit of this leak seems to be https://github.com/status-im/nim-libp2p/pull/128/files#diff-54deac2904add92d29f284d9eb946cc7R33-R40: connections were being stored in the transport's internal table and never disposed of. That table is probably not needed anymore; it's a leftover of an early design assumption that no longer holds.
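For reference, a sketch of the leak pattern being pointed at, with made-up names (Transport, Connection, handleClose): the transport keeps every connection in a table, and since nothing ever deletes the entries, closed connections stay reachable and are never collected.

import std/tables

type
  Connection = ref object
    id: int
    closed: bool

  Transport = ref object
    connections: Table[int, Connection]

proc close(c: Connection) =
  c.closed = true

proc handleClose(t: Transport, c: Connection) =
  # the fix amounts to dropping the table's reference once the
  # connection closes (or removing the table entirely)
  c.close()
  t.connections.del(c.id)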
Well, I guess there are two schools of thought here.

One is where "close" means disconnecting from the parent stream: it would mean removing the connection to the parent's close event handler, as well as cancelling any pending reads and writes, so that the child stream is entirely disconnected from the parent. The parent stream is then notified and can do what it pleases (keep reading, close itself, etc.) because the child no longer cares.

The other is that "close" is a notification to the parent stream to "start" closing, but on the child stream nothing else happens: no cleanup, no notification, no event to the children of the child stream; all of this is delayed until the parent stream has carried out the close request.

I know there is a flag in there to break the loop, but the loop can also be broken "structurally" by either of the two methods above (see the sketch below). True, there's little practical distinction, but it's worth thinking about what "principle" close should operate by, and how one ensures that there are no race conditions, multiple events, etc.
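A rough sketch of the first school, to make "structurally" concrete; CloseHandler, subscribe and unsubscribe are illustrative names, not the actual nim-libp2p API:

type
  CloseHandler = proc () {.closure.}
  Stream = ref object
    parent: Stream
    onClose: seq[CloseHandler]     # subscribers to this stream's close

proc subscribe(s: Stream, h: CloseHandler) =
  s.onClose.add(h)

proc unsubscribe(s: Stream, h: CloseHandler) =
  var kept: seq[CloseHandler]
  for existing in s.onClose:
    if existing != h:              # proc equality: same proc and env
      kept.add(existing)
  s.onClose = kept

proc close(child: Stream, myHandler: CloseHandler) =
  # 1. disconnect from the parent's close event first
  if child.parent != nil:
    child.parent.unsubscribe(myHandler)
  # 2. cancel any pending reads and writes, release buffers, etc.
  # 3. only now notify our own subscribers; the parent can no longer
  #    re-enter this child, so no loop-breaking flag is required
  for h in child.onClose:
    h()
  child.onClose = @[]

The design choice is that detaching happens before notifying, so re-entry is impossible by construction rather than guarded by state.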
This is the close method on the connection - https://github.com/status-im/nim-libp2p/blob/master/libp2p/connection.nim#L110-L117. I think this does what you describe in the above paragraph.
I'm not entirely sure what the advantage of this approach is?
I think this is the ideal, and all the close-event cruft is a consequence of the current approach, which needs simplifying. I'm not sure how to fix this structurally right now without significant refactoring (wip & wip2), tho.
It's simpler: you don't have to handle close failures, which makes closing more predictable and less risky from a race point of view.
@@ -154,5 +153,10 @@ method newStream*(m: Mplex,

method close*(m: Mplex) {.async, gcsafe.} =
  trace "closing mplex muxer"
  if not m.connection.closed():
    await m.connection.close()

  await allFutures(@[allFutures(toSeq(m.remote.values).mapIt(it.reset())),
Might help to merge this one (#125) first, but anyway it can be done the opposite way too.
This is probably more urgent since it fixes some pretty egregious mem leaks. I'd rather get this in first to see how it behaves in NBC.
if not isNil(sconn) and not sconn.closed:
  asyncCheck sconn.close()
result = newConnection(newBufferStream(writeHandler))
asyncCheck readLoop(sconn, result)
I kinda wanna remove this too and track the future, but I guess that can be done in another PR.
This needs to happen in a more focused refactor; we'll do that after we get the current implementation more stable.
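A hedged sketch of what "tracking the future" could look like, instead of fire-and-forget asyncCheck: keep the readLoop future on the connection so close can await it (or cancel it) for deterministic teardown. TrackedConn is a made-up type, and this uses std asyncdispatch for illustration; nim-libp2p itself uses chronos, which exposes a similar surface.

import std/asyncdispatch

type
  TrackedConn = ref object
    readFut: Future[void]          # the tracked read-loop future

proc readLoop(c: TrackedConn) {.async.} =
  # stand-in for the real read loop
  await sleepAsync(10)

proc start(c: TrackedConn) =
  c.readFut = c.readLoop()         # track it instead of asyncCheck

proc close(c: TrackedConn) {.async.} =
  # wait for the loop to finish before releasing resources, so nothing
  # keeps reading from a connection that is already being torn down
  if c.readFut != nil and not c.readFut.finished:
    await c.readFut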