fixed ipc leak #277
Conversation
thanks @gnunicorn for debugging the issue!
self.flag = !self.flag;
|
|
||
match b.poll()? {
Am I right that this whole match can be replaced with a simple `b.poll()` call? :)
Just pointing out that this doesn't come into effect until Parity's Cargo.toml branch references are upgraded too, as they don't point to master (I suspect a branch is cut whenever there is a release?).
@gnunicorn I know, we need to backport these fixes to that branch
hold on with merging this PR, tests are randomly failing. Most likely because
dvdplm
left a comment
Looks good (and gnunicorn's bug report was amazing!) but I get randomly failing tests in ipc: sometimes all is green, but most of the time there's at least one failure. Can you take a look at that?
})
.filter_map(|x| x)
.select(receiver.map_err(|e| {
.select_both(receiver.map_err(|e| {
I for one would appreciate a comment as to why we're using `select_both` here – doesn't seem super-obvious! ;)
So it seems like tests are failing because every request uses its own, new event loop, and the order of event loop initialization is not guaranteed. So sometimes tests fail because the server hasn't started yet, or it started and was already dropped (because nothing has ownership of the event loop).
Hold your beer - it seems this doesn't fix openethereum/parity-ethereum#8618. Please see openethereum/parity-ethereum#8618 (comment)

🍺
debris
left a comment
ipc server tests are just terrible, contain multiple bugs, and should be rewritten
"Response does not exactly match the expected response",
);
);
server.close();
test issue no. 1
- we were dropping the server before we even connected to it. everything was working only because of the leak
let _server = builder.start(path).expect("Server must run with no issues");
thread::sleep(::std::time::Duration::from_millis(50));
let server = builder.start(path).expect("Server must run with no issues");
thread::sleep(::std::time::Duration::from_millis(5000));
test issue no. 2 (not solved)
- server is not started synchronously, therefore we cannot assume that it's already available at the time of making the request. fixing this issue is actually quite complex, so I added only this workaround
type Error = S1::Error;

fn poll(&mut self) -> Poll<Option<S1::Item>, S1::Error> {
    let (a, b) = if self.flag {
eating a virtual dispatch on every call to poll doesn't seem great to me.
(a macro `poll_inner!(a, b)` would probably end up being less verbose in the end)
it is just a modified version of `select` from the futures library: https://docs.rs/futures/0.1.21/src/futures/stream/select.rs.html#11-15
Async::NotReady => (),
};

self.flag = !self.flag;
is this second negation of `self.flag` intended?
@@ -332,13 +339,15 @@ mod tests {
This test still occasionally fails here. Running `for i in {1..20}; do cargo test; done` it fails 2-3 times. Running it by itself (`for i in {1..20}; do cargo test req_parallel; done`) makes things worse: about a fourth fails.
let server = run(path);
thread::sleep(::std::time::Duration::from_millis(1000));
thread::sleep(Duration::from_millis(100));
let (stop_signal, stop_receiver) = mpsc::channel(400);
How about we parametrize the number of threads/loops so we can do `mpsc::channel(TEST_THRDS * TEST_ITERS);`?
})
);
}
thread::sleep(Duration::from_millis(100));
Just curious: why do we still need to nap?
this issue is super annoying. it always works on my machine now. But it looks like the server is not yet ready when the tests start

I still get failures, but only using 1.26.1. No failures on nightly.
[target.'cfg(not(windows))'.dev-dependencies]
tokio-uds = "0.1"
tokio-uds = "0.2"
Updating the package helped me diagnose the test issues
.select(receiver.map_err(|e| {
// we use `select_with_weak` here, instead of `select`, to close the stream
// as soon as the ipc pipe is closed
.select_with_weak(receiver.map_err(|e| {
Maintains the stream as long as the connection exists. The outgoing stream may be closed earlier.
Ok(())
});
start_signal.send(Ok(())).expect("Cannot fail since receiver never dropped before receiving");
start_signal was triggered before the server was fully set up. After fixing the order of initialisation, we no longer need `thread::sleep` in tests.
* fixed ipc connection leak, closes #275
* fixed indentation
* fixed broken pipe issue in tests
* empirical tests fixes
* fix tests
* fix tests
* fix tests
* move ipc start_signal.send after the incoming.for_each
* log ipc traces on travis
* keep writer in memory as long as possible
* select_with_weak
* remove redundant thread::sleep
* test session end
* fixed race condition in test_session_end
closes #275
closes openethereum/parity-ethereum#8774
openethereum/parity-ethereum#8618

The leak existed because we were merging the incoming message responses stream with the outgoing messages stream. If either of them still existed, we were not dropping the connection.
To fix the issue, we finish the merged stream as soon as one of the merged streams ends.