Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@gterzian
Copy link
Contributor

FIX #2200

@gterzian gterzian added the A3-in_progress Pull request is in progress. No review needed at this stage. label Apr 10, 2019
@gterzian
Copy link
Contributor Author

gterzian commented Apr 10, 2019

@tomaka One issue I ran into is that one can't keep exactly track of the "last message" for a given peer, since some messages come with a PeerId, and others with a IncomingIndex.

So for example if a Reject(index) is received, how can we make sure that this didn't come after a Connect(peer_id) message for that specific peer?

Would there be a way to figure out one from the other? Perhaps we should add the PeerId to Reject and Accept messages?

@tomaka
Copy link
Contributor

tomaka commented Apr 10, 2019

Will properly review later, but I don't think we should change the API because of tests.
Why not keep a hashmap that associates incoming indices with peerids?

Also, since it's pretty large and only deals with the public API, I suggest we move the code that this PR adds into the tests/ directory instead.

@gterzian
Copy link
Contributor Author

Why not keep a hashmap that associates incoming indices with peerids?

When a message comes in with an indice, how to associate it with a peer id?

I suggest we move the code that this PR adds into the tests/ directory instead

Ok

@tomaka
Copy link
Contributor

tomaka commented Apr 10, 2019

Oh I didn't realize you were only dumping a lot of messages into the peerset then reading the results.
That's not a good enough coverage in my opinion. We should simulate an actual network and react to what the peerset sends to us and in a random way. It really is far from easy to do unfortunately.

@gterzian
Copy link
Contributor Author

gterzian commented Apr 11, 2019

Ok so re the index vs id, they can be associated when setting things up.

We should simulate an actual network and react to what the peerset sends to us and in a random way.

Why do we need a network setup? It seems easier to test the peerset as a blackbox by randomly using the API, even in ways that currently isn't used as part of an actually running network.

For example, from the current test failure it appears to be quite easy to have an Accept message following a Connect message for a peer by doing a dropped followed by an incoming. So perhaps that is not the intended usage of the API, and "it will never happen". Yet now we have the information that it can happen, we can then either change our expectation, and say that it's OK, or change the peerset so that the expectation holds even in that specific case.

@tomaka
Copy link
Contributor

tomaka commented Apr 15, 2019

Why do we need a network setup?

By "simulate an actual network", I mean "use the API to make the peerset believe that we have an actual network".

it appears to be quite easy to have an Accept message following a Connect message for a peer by doing a dropped followed by an incoming.

"Connect -> dropped -> incoming -> Accept" is indeed a valid API usage.
Unfortunately figuring out what is correct and incorrect is quite hard, both in terms of code and in terms of intuition.

@gterzian gterzian added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 16, 2019
@gterzian gterzian requested a review from tomaka April 16, 2019 15:38
@gterzian gterzian added A3-in_progress Pull request is in progress. No review needed at this stage. and removed A0-please_review Pull request needs code review. labels Apr 16, 2019
@gterzian
Copy link
Contributor Author

gterzian commented Apr 16, 2019

Ok I've added some checks that can actually express an expectations of a sequence of actions like "Connect -> dropped -> incoming -> Accept".

It seems that " dropped -> incoming -> Accept" holds so far, but that serie comes out often without first having received a Connect message. If we relax to dropped -> incoming -> Accept the tests pass...

"use the API to make the peerset believe that we have an actual network"

If we focus more on using the API in a way that an actual network would, are we not going to exclude weird unexpected cases? The "random" input could be adapted to not be completely random and match what would otherwise be expected...

@tomaka tomaka mentioned this pull request Apr 30, 2019
@gavofyork
Copy link
Member

@tomaka @gterzian how is this looking? will it be in a mergeable state any time soon?

@gterzian
Copy link
Contributor Author

gterzian commented May 6, 2019

Currently tests are failing on the assertions of "Connect -> dropped -> incoming -> Accept" always being received in that order. If we reduce this to only "dropped -> incoming -> Accept", they would pass.

Other then that technicality, we need to decide on an overall approach.

I am in favor of the approach taken here, which is to send completely random messages to the peer set as input, and then make assertions on the output we get from the peerset, ensuring that no matter the sequence of input, certain invariant hold on the output.

@tomaka I think you wanted to take a different approach? Do you have suggestions on steps that could be taken in that direction?

@tomaka
Copy link
Contributor

tomaka commented May 6, 2019

I think you wanted to take a different approach?

I'm fine with the approach in this PR, but I think it's insufficient. We should adjust the messages we send to the PSM based on what it is sending back to us.

I opened #2470 with that other approach, but we could merge both.

@gterzian
Copy link
Contributor Author

gterzian commented May 6, 2019

Ok, I've taken a look and I think it wouldn't hurt to have both tests run side by side.

I've also fixed a logical error in the assertions on this one, it now passes.

The folder introduced here is named test, while the other PR introduces a tests, but I'm hopeful that's a difference we can overcome...

(Some(drop), Some(incoming)) => {
assert!(drop < incoming);
println!("Last messages: {:?}", last_messages);
assert!(last_messages.contains(&Message::Connect(peer_id.clone())));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the Connect message has already been removed from the VecDeque above, this was a faulty assertion...

@tomaka
Copy link
Contributor

tomaka commented May 6, 2019

The folder introduced here is named test, while the other PR introduces a tests, but I'm hopeful that's a difference we can overcome...

I went for an external test since we test the API surface of the crate, as opposed to an internal API.

@gterzian
Copy link
Contributor Author

gterzian commented May 6, 2019

Ok, I'd be happy to rebase after the other PR merges, and just add this test to the same file.

Fixed a couple of bugs in this test. Perhaps you could also take a look that the main assertions, about the order of operations, make sense?

I'll look more into the other PR, and see if I can dig up the reason for the current failure...

@tomaka
Copy link
Contributor

tomaka commented May 6, 2019

I'll look more into the other PR, and see if I can dig up the reason for the current failure...

It's entirely possible that there's an issue in the PSM. It wouldn't be a bad idea to test this PR (and mine as well) against #2440.

@gavofyork gavofyork added the A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label May 14, 2019
@gavofyork
Copy link
Member

@gterzian this looks stale - should it be closed?

@gterzian
Copy link
Contributor Author

Ok, closing because I think #2470 gives us the coverage we want...

@gterzian gterzian closed this May 16, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A3-in_progress Pull request is in progress. No review needed at this stage. A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fuzz the peerset manager to test coherency

3 participants