Explore pacing-gain to understand early termination #315

Open
gfr10598 opened this issue Jul 29, 2020 · 10 comments

Comments

@gfr10598

We now have a lot of ndt7 data and should explore the pacing-gain behavior, both to understand how well early termination based on pacing gain could work and to check whether an earlier fixed termination time would be effective for 95+ percent of clients.

@gfr10598

gfr10598 commented Jul 29, 2020

https://console.cloud.google.com/bigquery?sq=581276032543:5a15dedd9d1a4133b498357ddc54929f

It looks like pacing gain usually drops below 1.25 in under 5 seconds. This needs more analysis.

@gfr10598

gfr10598 commented Jul 29, 2020

WRONG - bug in SQL

| ok1 | ok2 | ok5 | ok10 | total |
| -- | -- | -- | -- | -- |
| 3988220 | 4233782 | 4254709 | 4275755 | 4275755 |

92% converge within 1 second
99% converge within 2 seconds
99.5% converge within 5 seconds
100% converge within the 9-13 second test duration.

@gfr10598

| ok1 | ok2 | ok5 | ok10 | total |
| -- | -- | -- | -- | -- |
| 2226638 | 3189769 | 3990049 | 4184933 | 4249948 |

Of those that converge:
50% converge within 0.8 seconds
90% converge within 3.3 seconds
95% converge within 4.8 seconds
99% converge within 8.1 seconds
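
As a rough cross-check of the table above, the counts can be turned into cumulative fractions of all tests. This is only a sketch: it assumes `ok1`, `ok2`, `ok5` and `ok10` count tests that converged within 1, 2, 5 and 10 seconds, and that `total` counts every test, converged or not. The percentiles listed above use a different denominator (only the tests that converge), so the two sets of numbers are not directly comparable.

```python
# Counts copied from the table above. The meaning of the column names
# (okN = converged within N seconds) is an assumption, not taken from the query.
converged_within = {1: 2226638, 2: 3189769, 5: 3990049, 10: 4184933}
total = 4249948


def cumulative_fractions(counts, total_tests):
    """Return {threshold_seconds: fraction of all tests converged by then}."""
    return {t: n / total_tests for t, n in sorted(counts.items())}


fractions = cumulative_fractions(converged_within, total)
for threshold, frac in fractions.items():
    print(f"<= {threshold:>2} s: {frac:.1%}")
```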

@gfr10598

gfr10598 commented Jul 29, 2020

The average convergence time is about 1.4 seconds, and the average BytesAcked at convergence is about 69 MB.
Very fast tests tend to converge faster, and slower tests tend to converge more slowly. Tests that take more than 1 second to converge do so after, on average, about 50 MB of transfer.

Spreadsheet (google.com)

@gfr10598

Found another SQL bug. The spreadsheet has been updated.
The average BytesAcked trends up with convergence time, and averages about 5.5 MBytes (log mean around 1.7 MBytes).

The worst 5%, with convergence time > 4.8 seconds, average 28 MBytes (log mean 4 MBytes), and the latest-converging tests average around 40 to 60 MBytes.

The BBRInfo.MinRTT averages around 1 msec (log mean 0.88) for the fastest converging tests, and around 200 msec for tests that take 6 seconds or more to converge.

On average, it looks like it takes around 25 to 30 MinRTTs to converge, but sometimes as few as 5 or 10, and sometimes thousands of MinRTTs.
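
The "log mean" figures above sit well below the plain averages, which is what a heavy right tail would produce. The sketch below assumes "log mean" means the exponential of the mean of the logs (the geometric mean); the thread does not define it, and the sample values are invented purely for illustration.

```python
import math


def arithmetic_mean(values):
    """Plain average; dominated by the largest values in a skewed sample."""
    return sum(values) / len(values)


def log_mean(values):
    """exp(mean(log(x))), i.e. the geometric mean. Requires positive values.

    Assumption: this is what "log mean" refers to in the comment above.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical BytesAcked samples in MBytes, not taken from the ndt7 data.
sample_mbytes = [0.5, 1.0, 2.0, 40.0]
print(f"mean     = {arithmetic_mean(sample_mbytes):.2f} MBytes")
print(f"log mean = {log_mean(sample_mbytes):.2f} MBytes")
```

A single large transfer pulls the arithmetic mean up far more than the log mean, so the two are worth reporting together for distributions like these.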

@gfr10598

gfr10598 commented Jul 30, 2020

The median number of round trips to convergence ranges from around 25 for the fastest-converging tests up to 50 for the slowest.
Query
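
One way to estimate round trips to convergence is to divide each test's convergence time by its MinRTT and take the median. This is a sketch under that assumption, not the linked query; the function name and the sample values are hypothetical. Because the RTT during a test is at least MinRTT, this ratio is an upper bound on the actual number of round trips.

```python
from statistics import median


def round_trips_to_convergence(convergence_ms, min_rtt_ms):
    """Approximate round-trip count as convergence time over MinRTT."""
    return convergence_ms / min_rtt_ms


# Hypothetical (convergence time, MinRTT) pairs in milliseconds.
tests = [(25, 1), (1000, 40), (6000, 200), (5000, 100)]

round_trips = [round_trips_to_convergence(c, r) for c, r in tests]
print(round_trips)          # one estimate per test
print(median(round_trips))  # median across tests
```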

@gfr10598

gfr10598 commented Jul 30, 2020

I've refined the query to actually look for the first two crossings of PacingGain, from >=1 to <1, and then from <1 back to >=1. This changes the results slightly, but not dramatically.
NOTE: With this convergence metric, only about 3/4 of the ndt7 tests reach convergence. For another 10%, the PacingGain drops below 1.0 but does not cross 1.0 again.

Spreadsheet (google.com only)
Query
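
The crossing-based metric described in this comment can be sketched in a few lines: find the first drop of PacingGain below 1.0, then the first later return to 1.0 or above. The function name, the sample format, and the gain values are assumptions for illustration and are not taken from the linked query.

```python
def convergence_time(samples):
    """Return the elapsed time of the first return to gain >= 1.0 that
    follows the first drop below 1.0, or None if that never happens.

    samples: time-ordered list of (elapsed_seconds, pacing_gain) pairs.
    """
    dropped = False  # True once the gain has fallen from >= 1.0 to < 1.0
    prev_gain = None
    for elapsed, gain in samples:
        if prev_gain is not None:
            if not dropped and prev_gain >= 1.0 and gain < 1.0:
                dropped = True
            elif dropped and prev_gain < 1.0 and gain >= 1.0:
                return elapsed
        prev_gain = gain
    return None


# Hypothetical snapshots: the gain starts high, dips below 1.0, then recovers.
converging = [(0.1, 2.89), (0.5, 2.89), (0.8, 0.75), (1.0, 1.0), (1.2, 1.25)]
# Drops below 1.0 but never crosses back: no convergence under this metric.
stuck_low = [(0.1, 2.89), (0.5, 0.75), (1.0, 0.75)]

print(convergence_time(converging))
print(convergence_time(stuck_low))
```

The second sample corresponds to the roughly 10% of tests noted above where the gain drops below 1.0 and stays there.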

@gfr10598

gfr10598 commented Jul 31, 2020

BQ connected spreadsheet comparing throughput, BBR_BW at convergence, and BBR_BW at 10 sec.

Convergence vs speed

Screen Shot 2020-07-30 at 8 32 31 PM

@gfr10598

gfr10598 commented Aug 4, 2020

Convergence Time CDF

@laiyi-ohlsen

Needs design discussion with @pboothe
