This repository was archived by the owner on Aug 2, 2021. It is now read-only.

Swarm rather stable: update syncing#1336

Merged
nonsense merged 16 commits into swarm-rather-stable from swarm-rather-stable-update-syncing
Apr 30, 2019

Conversation

Member

@janos janos commented Apr 10, 2019

This is an experiment for changing update syncing functionality.

@janos janos marked this pull request as ready for review April 11, 2019 08:21
@janos janos requested a review from zelig April 16, 2019 14:46
Member Author

janos commented Apr 16, 2019

@zelig could you check the latest changes in this PR?

Member

@zelig zelig left a comment


excellent job @janos
approved pending mending/replacing the skipped test (or do you want another PR?)

Comment thread swarm/network/stream/peer.go Outdated
// (with po peerPO) needs to be subscribed after kademlia neighbourhood depth
// change from prevDepth to newDepth. Max argument limits the number of
// proximity order bins. Returned values are slices of integers which represent
// proximity order bins, the firs one to which additional subscriptions need to
Member

firs -> first

Comment thread swarm/network/stream/peer_test.go Outdated
},
{
po: 4, prevDepth: 0, newDepth: 4, // 0-16 -> 4-16
quitBins: intRange(0, 4),
Member

odd one out using intRange

}

func (p *Peer) runUpdateSyncing() error {
timer := time.NewTimer(p.streamer.syncUpdateDelay)
Member

why do you need this?
we used the delay globally in order to wait until healthy; not sure what the purpose is here

Member Author

The purpose is to avoid very frequent updates when multiple peers are added or removed in a short period of time, such as on swarm start or when there are connectivity issues. I think that this delay can be lowered, but I am still for keeping it.

Comment thread swarm/network/kademlia_test.go Outdated
t.Run("no new peers", func(t *testing.T) {
k := newTestKademlia(t, "00000000")

c, u := k.SubscribeToNeighbourhoodDepthChange()
Member

more verbose varnames @frncmx?
like changeC and unsubscribe?

Comment thread swarm/network/stream/peer.go Outdated
}
}

func (p *Peer) runUpdateSyncing() error {
Member

maybe consider commenting these next 4 functions

Member Author

janos commented Apr 17, 2019

@zelig thanks for the review. I think that it would be better to fix the test in this PR, and I'll do it.

@janos janos changed the title [WIP] Swarm rather stable: update syncing Swarm rather stable: update syncing Apr 18, 2019
@janos janos requested a review from zelig April 18, 2019 15:01
Comment thread swarm/network/stream/peer_test.go Outdated
func waitForSubscriptions(t *testing.T, r *Registry, ids ...enode.ID) {
t.Helper()

for reties := 0; reties < 100; reties++ {
Member

reties -> retries

Member Author

Thanks, corrected.

Contributor

@holisticode holisticode left a comment


Basically great work. I think it improves a lot on the problems we had with updateSyncing.

Please just make sure we actually have a test case for the original trigger to refactor this code (new peer enters nearest neighborhood without depth change and all subscriptions are made).

Also, there is an open question whether the TestPureRetrieval test, added in the subs-by-peer branch (a precursor to this PR), should be included here or added as a separate PR.

}
waitForSubscriptions(t, pivotRegistry, ids...)

newDepth := pivotKademlia.NeighbourhoodDepth()
Contributor

I think we should add a check here even if the depth did not change.

We started all this because we had situations where if a new peer was added and it did not change the depth, then no new subscriptions were added to this new node.

Is this tested somewhere? Should it be tested here or somewhere else?

Member Author

Yes, good point. I will see to adding this test case.

Member Author

@janos janos left a comment


Thanks Fabio, the TestPureRetrieval test is great, and it is probably better for it to have its own PR.


Comment thread swarm/network/stream/peer_test.go Outdated
// nodes proximities from the pivot node
nodeProximities := make(map[string]int)
for _, id := range ids[1:] {
nodeProximities[id.String()] = chunk.Proximity(pivotKademlia.BaseAddr(), id.Bytes())
Contributor

id is enode.ID, so id.Bytes() is not a bzz address. It appears that those proximities are not correct as a result.

Member Author

Yes, you are right, and thanks for the PR.

Comment thread swarm/network/stream/peer.go Outdated
return subBins, quitBins
}

// subs returns the range to which proximity order bins syncing
Contributor

The comment seems to be wrong; at least the function name is different.

Comment thread swarm/network/stream/peer_test.go Outdated
}
// add new nodes to sync subscriptions check
for _, id := range ids {
nodeProximities[id.String()] = chunk.Proximity(pivotKademlia.BaseAddr(), id.Bytes())
Contributor

Same comment about id.Bytes() here. I don't think this is an overlay bzz address, but rather an enode.ID.

Comment thread swarm/network/stream/peer.go Outdated
}

kad := p.streamer.delivery.kad
po := chunk.Proximity(network.NewAddr(p.Node()).Over(), kad.BaseAddr())
Contributor

Having looked at network.NewAddr, I doubt this is used everywhere in Swarm, because our bzz addrs are different from enode.IDs, whereas this function is defined as:

func NewAddr(node *enode.Node) *BzzAddr {
	return &BzzAddr{OAddr: node.ID().Bytes(), UAddr: []byte(node.String())}
}

Contributor

I might be wrong, but it seems to me that network.NewAddr is used only in tests and simulations, and in production Swarm the BzzAddr is generated differently. See Swarm.Start where we actually override the Underlay Address, and we keep the previously set Overlay Address:

	// update uaddr to correct enode
	newaddr := s.bzz.UpdateLocalAddr([]byte(srv.Self().String()))
	log.Info("Updated bzz local addr", "oaddr", fmt.Sprintf("%x", newaddr.OAddr), "uaddr", fmt.Sprintf("%s", newaddr.UAddr))

Contributor

handleChunkDelivery also appears to be calculating proximity incorrectly based on enode.ID and not based on its BzzAddr:

	case *ChunkDeliveryMsgRetrieval:
		msg = (*ChunkDeliveryMsg)(r)
		peerPO := chunk.Proximity(sp.ID().Bytes(), msg.Addr)

Member Author

Good find, thanks for the fix in your PR.

Comment thread swarm/network/stream/stream.go Outdated
r.peersMu.Lock()
r.peers[peer.ID()] = peer
metrics.GetOrRegisterGauge("registry.peers", nil).Update(int64(len(r.peers)))

Member

this should not be hidden in setPeer but should explicitly be part of the Run function

Member Author

Yes, I agree, it is correct to have it in the Run function. Changed.

// to create new ones or quit existing ones based on the new neighbourhood depth
// and if peer enters or leaves nearest neighbourhood by using
// syncSubscriptionsDiff and updateSyncSubscriptions functions.
func (p *Peer) runUpdateSyncing() {
Member

This function should be called from Registry.Run, where BzzAddr is available from BzzPeer; the remote overlay address should simply be passed as an argument to this function.

Member Author

@janos janos Apr 29, 2019


In the latest change, BzzPeer is encapsulated by the stream Peer, so BzzAddr is available.

@janos janos requested a review from zelig April 29, 2019 09:49
Contributor

@nonsense nonsense left a comment


lgtm

@nonsense nonsense merged commit 9c5be52 into swarm-rather-stable Apr 30, 2019
@acud acud deleted the swarm-rather-stable-update-syncing branch May 5, 2019 18:37
nonsense pushed a commit that referenced this pull request May 10, 2019
nonsense pushed a commit that referenced this pull request May 10, 2019
