This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

Speed up sessions Round #1 #27

Merged: 5 commits merged into master from feat/speed-up-sessions on Dec 22, 2018

Conversation

@hannahhoward (Contributor) commented Dec 4, 2018

Goals

This is an initial PR to speed up sessions and reduce dupes. It is not the ONLY improvement I'm working on, but it makes significant progress. It matches Jeremy's PR on the dupe benchmarks (for the most part) while matching master on speed (for the most part). In addition, it shows impressive results running `ipfs ls` on Wikipedia. See results in the comments.

Implementation

This PR contains three main improvements:

  1. The session peer manager is slightly smarter:
    a. It returns a maximum of 32 peers (there is rarely a reason to ask more at once).
    b. It loosely prioritizes the best peers. The strategy is simple: if you receive a block from a peer first, assume that's a good peer and put it at the front of the list for your next want request. This is not ordering peers by latency; it just assumes that whoever responds is a good peer to ask next.

  2. The split between the broadcast want limit and the targeted want limit for sessions is copied over from Jeremy's code. Basically, only 4 wants are broadcast at once (previously 16), but once the session has peers, targeted requests go up to 32 wants.

  3. Requests for a series of blocks are split among peers, similarly to Jeremy's PR. However, where Jeremy's PR kept the splitting logic right in the Session, this PR moves it out to a separate SessionRequestSplitter module. The strategy pursued is slightly different, and simply attempts to minimize dupes without slowing down the overall request speed:
    a. If dupes rise above a certain threshold, increase the split factor.
    b. If they fall below a certain threshold, decrease the split factor (to increase speed, just in case).
    c. To make this work, dupes are now tracked on a per-session basis (a list of past wants is kept to determine what counts as a dupe).
    Honestly, there are better ways to do this, but it is functional and works well (though it vacillates between extremes, because there's minimal throttling of these adjustments).
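To illustrate point 1b, here is a minimal Go sketch of the move-to-front heuristic, assuming a plain string slice of peer IDs; `moveToFront` is a hypothetical name, not the PR's actual session peer manager API.

```go
package main

import "fmt"

// moveToFront is an illustrative sketch (not the PR's actual code) of the
// "whoever responded last is probably good" heuristic: the peer that most
// recently delivered a block is promoted to the head of the optimized-peer
// list, so the next want request asks it first. Peers are plain strings here.
func moveToFront(peers []string, p string) []string {
	for i, existing := range peers {
		if existing == p {
			// Shift everything before p one slot right, then put p first.
			copy(peers[1:i+1], peers[:i])
			peers[0] = p
			return peers
		}
	}
	// A peer we haven't seen before goes straight to the front.
	return append([]string{p}, peers...)
}

func main() {
	peers := []string{"peerA", "peerB", "peerC"}
	peers = moveToFront(peers, "peerB")
	fmt.Println(peers) // [peerB peerA peerC]
}
```

Note this deliberately is not latency ordering: it only remembers who responded most recently, which is cheap and needs no timing data.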
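As a rough sketch of points 2 and 3 above, here is a self-contained Go version of the want-budget switch and the dupe-driven split-factor adjustment. All names and threshold values are illustrative assumptions; the PR's actual constants and identifiers live in the session and sessionrequestsplitter code and may differ.

```go
package main

import "fmt"

// Illustrative constants only -- the PR's real values may differ.
const (
	broadcastLiveWantsLimit = 4  // wants broadcast before the session has peers
	targetedLiveWantsLimit  = 32 // wants sent on targeted requests once peers are known

	highDupeThreshold = 0.4 // raise the split factor above this dupe ratio
	lowDupeThreshold  = 0.2 // lower it below this ratio, trading dupes for speed
	minSplitFactor    = 1
	maxSplitFactor    = 16
)

// liveWantsBudget mirrors improvement 2: a small broadcast budget,
// a larger targeted budget.
func liveWantsBudget(havePeers bool) int {
	if havePeers {
		return targetedLiveWantsLimit
	}
	return broadcastLiveWantsLimit
}

// adjustSplitFactor mirrors improvement 3: nudge the number of groups a
// want list is split across, based on the fraction of received blocks
// that were duplicates.
func adjustSplitFactor(current int, dupeRatio float64) int {
	switch {
	case dupeRatio > highDupeThreshold && current < maxSplitFactor:
		return current + 1
	case dupeRatio < lowDupeThreshold && current > minSplitFactor:
		return current - 1
	default:
		return current
	}
}

// splitWants distributes wanted keys round-robin into splitFactor groups,
// each of which would be sent to a different subset of peers.
func splitWants(wants []string, splitFactor int) [][]string {
	groups := make([][]string, splitFactor)
	for i, w := range wants {
		groups[i%splitFactor] = append(groups[i%splitFactor], w)
	}
	return groups
}

func main() {
	sf := adjustSplitFactor(2, 0.5) // many dupes: split across more groups
	fmt.Println(sf)                 // 3
	fmt.Println(splitWants([]string{"a", "b", "c", "d"}, sf))
	fmt.Println(liveWantsBudget(false), liveWantsBudget(true)) // 4 32
}
```

With no throttling on `adjustSplitFactor`, repeated calls can oscillate between the extremes, which matches the vacillation noted above.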

Next

Once this PR is merged, there are three further optimizations:

  1. Peer optimization should be tracked more closely by monitoring actual latency on requests.
  2. Request splitting should be smarter about reducing dupes, throttle its adjustments, and possibly become more "fine-grained" (fewer adjustments, or smaller ones) the longer a session is open, because by that point you know more about what the network looks like for whatever you're requesting.
  3. Request splitting should also take into account how many requests a given node already has in flight, so that we don't overload the pipe to a given peer. This first needs: Add bandwidth limitations to benchmarks #40

For discussion

This is not an "end goal" PR for session speedup, but a first step and a framework for further optimizations (basically, improvements to the peer manager and the request splitter can now proceed independently). I actually have progress on some of the next steps above, but I feel that should go in a separate PR, both so the number of changes isn't overwhelming and because I want to make sure this approach feels OK to people.

This also includes some minor changes to the benchmarks, primarily to put the saved JSON file in a folder that isn't included in git commits.

@ghost ghost assigned hannahhoward Dec 4, 2018
@ghost ghost added the status/in-progress In progress label Dec 4, 2018
@hannahhoward commented Dec 4, 2018:

| Name | Dups-Master | Dups-Jeremy | Dups-New | MsgSent-Master | MsgSent-Jeremy | MsgSent-New | MsgRecd-Master | MsgRecd-Jeremy | MsgRecd-New | Time-Master | Time-Jeremy | Time-New |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BenchmarkDups2Nodes/AllToAll-OneAtATime | 1 | 1 | 1 | 199 | 200 | 200 | 101 | 101 | 101 | 2s 202ms | 2s 209ms | 2s 214ms |
| BenchmarkDups2Nodes/AllToAll-BigBatch | 16 | 4 | 4 | 85 | 138 | 141 | 60 | 85 | 64 | 159ms | 548ms | 94ms |
| BenchmarkDups2Nodes/Overlap1-OneAtATime | 0 | 0 | 0 | 251 | 249 | 248 | 100 | 100 | 100 | 2s 766ms | 2s 788ms | 2s 768ms |
| BenchmarkDups2Nodes/Overlap2-BatchBy10 | 34 | 32 | 32 | 131 | 132 | 203 | 20 | 24 | 22 | 221ms | 857ms | 836ms |
| BenchmarkDups2Nodes/Overlap3-OneAtATime | 34 | 32 | 34 | 394 | 392 | 379 | 134 | 132 | 134 | 2s 764ms | 4s 474ms | 2s 759ms |
| BenchmarkDups2Nodes/Overlap3-BatchBy10 | 34 | 32 | 32 | 129 | 129 | 208 | 20 | 25 | 22 | 218ms | 857ms | 840ms |
| BenchmarkDups2Nodes/Overlap3-AllConcurrent | 34 | 29 | 21 | 194 | 203 | 291 | 99 | 111 | 90 | 150ms | 1s 179ms | 705ms |
| BenchmarkDups2Nodes/Overlap3-BigBatch | 31 | 2 | 24 | 179 | 165 | 260 | 95 | 80 | 87 | 767ms | 4s 945ms | 722ms |
| BenchmarkDups2Nodes/Overlap3-UnixfsFetch | 34 | 1 | 31 | 173 | 105 | 267 | 90 | 40 | 79 | 178ms | 1s 312ms | 692ms |
| BenchmarkDups2Nodes/10Nodes-AllToAll-OneAtATime | 8 | 8 | 8 | 215 | 216 | 215 | 108 | 108 | 108 | 2s 194ms | 2s 216ms | 2s 185ms |
| BenchmarkDups2Nodes/10Nodes-AllToAll-BatchFetchBy10 | 80 | 32 | 32 | 89 | 74 | 146 | 18 | 22 | 19 | 223ms | 268ms | 254ms |
| BenchmarkDups2Nodes/10Nodes-AllToAll-BigBatch | 128 | 32 | 32 | 141 | 103 | 170 | 70 | 71 | 71 | 155ms | 554ms | 96ms |
| BenchmarkDups2Nodes/10Nodes-AllToAll-AllConcurrent | 128 | 32 | 32 | 182 | 163 | 188 | 98 | 99 | 83 | 154ms | 545ms | 99ms |
| BenchmarkDups2Nodes/10Nodes-AllToAll-UnixfsFetch | 8 | 8 | 8 | 82 | 114 | 137 | 55 | 57 | 59 | 176ms | 115ms | 116ms |
| BenchmarkDups2Nodes/10Nodes-OnePeerPerBlock-OneAtATime | 0 | 0 | 0 | 1604 | 1652 | 1695 | 100 | 100 | 100 | 6s 697ms | 9s 540ms | 6s 720ms |
| BenchmarkDups2Nodes/10Nodes-OnePeerPerBlock-BigBatch | 0 | 0 | 0 | 607 | 885 | 1071 | 80 | 89 | 62 | 1s 55ms | 12s 626ms | 1s 287ms |
| BenchmarkDups2Nodes/10Nodes-OnePeerPerBlock-UnixfsFetch | 0 | 0 | 0 | 649 | 670 | 915 | 71 | 41 | 64 | 1s 489ms | 2s 478ms | 1s 303ms |
| BenchmarkDups2Nodes/200Nodes-AllToAll-BigBatch | 3168 | 792 | 792 | 785 | 1026 | 1003 | 201 | 214 | 200 | 384ms | 322ms | 369ms |
| BenchmarkDupsManyNodesRealWorldNetwork/200Nodes-AllToAll-BigBatch-FastNetwork | 5250 | 1192 | 1192 | 3226 | 1871 | 985 | 863 | 493 | 430 | 1s 705ms | 4s 602ms | 1s 576ms |
| BenchmarkDupsManyNodesRealWorldNetwork/200Nodes-AllToAll-BigBatch-AverageVariableSpeedNetwork | 2560 | 1192 | 444 | 2995 | 1863 | 1694 | 305 | 493 | 237 | 1s 382ms | 3s 855ms | 1s 363ms |
| BenchmarkDupsManyNodesRealWorldNetwork/200Nodes-AllToAll-BigBatch-SlowVariableSpeedNetwork | 3240 | 1040 | 620 | 6127 | 1872 | 1602 | 943 | 454 | 283 | 2s 71ms | 5s 340ms | 1s 590ms |

@hannahhoward commented:
Some pretty exciting results from `ls`-ing Wikipedia:

Master `ipfs bitswap stat` results at end:

```
bitswap status
	provides buffer: 0 / 256
	blocks received: 255280
	blocks sent: 5213
	data received: 142597822
	data sent: 8654147
	dup blocks received: 181268
	dup data received: 99 MB
	wantlist [0 keys]
	partners [637]
```

Master `time` command results:

```
cmd/ipfs/ipfs ls --resolve-type=false   7.45s user 3.51s system 0% cpu 2:14:45.69 total
```

This PR's `ipfs bitswap stat` results at end:

```
bitswap status
	provides buffer: 0 / 256
	blocks received: 91229
	blocks sent: 1103
	data received: 53514594
	data sent: 3224259
	dup blocks received: 16839
	dup data received: 10 MB
	wantlist [0 keys]
```

This PR's `time` command results:

```
cmd/ipfs/ipfs ls --resolve-type=false   8.27s user 3.91s system 0% cpu 31:02.45 total
```

@hannahhoward commented:
Table updated:

  • Mostly tracking Jeremy's PR in terms of dupes (except BenchmarkDups2Nodes/Overlap3-BigBatch and BenchmarkDups2Nodes/Overlap3-UnixfsFetch)

  • Mostly tracking master in terms of time (except the real-world benchmarks, BenchmarkDups2Nodes/AllToAll-BigBatch, BenchmarkDups2Nodes/Overlap2-BatchBy10, BenchmarkDups2Nodes/10Nodes-AllToAll-BigBatch, and BenchmarkDups2Nodes/10Nodes-AllToAll-AllConcurrent)

@hannahhoward commented:
Table updated with cde89cb:

  • Now outperforming master in some tests and only minimally behind it in others; cumulative benchmark run time is comparable

  • Outperforming Jeremy's PR on dupes in some tests, still behind in others

Remaining:

  • I feel like the most recent changes are hacking to beat the benchmarks, and as a result I've also found some gaps in the benchmarks (for example, the test net does not model a maximum bandwidth per node -- that probably needs to be added)
  • I need to understand more about why the Wikipedia results are what they are, because they still seem to depend on various config settings

That said, this is approaching the point where, if nothing else, it does no further harm and seems to be an improvement in most cases. It needs a bit more work, but it's closing in.

@hannahhoward hannahhoward force-pushed the feat/refactor-sessions branch 3 times, most recently from 0666ee8 to 16f00de Compare December 13, 2018 19:10
@hannahhoward hannahhoward force-pushed the feat/speed-up-sessions branch 2 times, most recently from d955a72 to 1fcf6a7 Compare December 17, 2018 16:59
@hannahhoward hannahhoward changed the base branch from feat/refactor-sessions to master December 17, 2018 17:03
@ghost ghost removed the status/in-progress In progress label Dec 17, 2018
@hannahhoward hannahhoward reopened this Dec 17, 2018
@ghost ghost added the status/in-progress In progress label Dec 17, 2018
@ghost ghost removed the status/in-progress In progress label Dec 17, 2018
@hannahhoward hannahhoward reopened this Dec 17, 2018
@ghost ghost added the status/in-progress In progress label Dec 17, 2018
@hannahhoward hannahhoward changed the title WIP: Speed up sessions Speed up sessions Round #1 Dec 17, 2018
@hannahhoward commented:
I've removed the WIP label because I think this can be merged in its current state and is ready for review.

```go
	spm.tagPeer(p)
} else {
	if isOptimized {
		if spm.optimizedPeersArr[0] == p {
```
Review comment from @hannahhoward: put this in a function


```go
func (s *Session) updateReceiveCounters(ctx context.Context, blk blkRecv) {
```
Review comment from @hannahhoward: Move RecordUniqueBlock to receiveBlock

@eingenito left a comment:
Hannah and I walked through this PR together, and it's complicated, in large part because it builds additional smarts on top of the existing (somewhat complicated) sessions implementation. There's a good chance that simpler splitting implementations would be effective, but this implementation offers some solid benefits in some real-world testing.

Order optimized peers by most recent to receive a block
Reduce duplicates through splits of requests
As soon as peers appear, consume all of the want budget
Move the job of splitting requests to its own package
@hannahhoward commented:
FYI: Something is wrong with Jenkins for this branch -- I cannot get anything to build any more. If I check out a separate branch based on the latest commit from this branch and push it, it builds and passes. I don't know what's wrong, but I am just going to continue this review until people LGTM it, and then create a separate PR at the end to verify the build and merge. Ugh.

@hannahhoward commented:
@Stebalien + @magik6k would love a review from either of you before this gets merged. If it's complicated to read, I can pair with you. If it feels too complicated, and needs to be simpler in order to merge, that's feedback I can work off and improve. I'm going to do that anyway, but I'm about to go away for two weeks, so I'm trying to get something merged before I leave :)

@hannahhoward commented:
& @magik6k if you have any insight into what's up with Jenkins I'd super duper appreciate it!

@magik6k (Member) commented Dec 21, 2018:

```
stderr: error: merge is not possible because you have unmerged files.
```

Rebase should help

@hannahhoward hannahhoward force-pushed the feat/speed-up-sessions branch 3 times, most recently from a2c6bad to 02cabda Compare December 22, 2018 00:16
Encapsulate functions for readability, and move code for understanding
@hannahhoward commented:
merging, per @eingenito

@hannahhoward hannahhoward merged commit 43c65d4 into master Dec 22, 2018
@ghost ghost removed the status/in-progress In progress label Dec 22, 2018
@whyrusleeping (Member) commented:
Stated changes sound good to me! I haven't read the code in detail, but it looks fairly clean and well tested. I'd still like one of us to get a review in on it at some point, but I'm happy to get things pushed through if you're confident in them.

Also, you can refer to me as @whyrusleeping ;) I try to keep human names off GitHub.

Jorropo pushed a commit to Jorropo/go-libipfs that referenced this pull request Jan 26, 2023