
Benchmarks to compare autohisto with date histogram #40

Merged
3 commits merged into elastic:master on Sep 17, 2018

Conversation

pcsanwald
Contributor

This PR pairs with this autohisto PR and adds two aggregations for comparison, which @colings86 suggested.
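For readers without the diff at hand, here is a minimal sketch of what the two comparison operations look like conceptually as Rally search operations. The field name, bucket count, and interval below are illustrative assumptions, not necessarily the exact values in the PR:

```json
{
  "name": "autohisto_agg",
  "operation-type": "search",
  "body": {
    "size": 0,
    "aggs": {
      "dropoffs_over_time": {
        "auto_date_histogram": {
          "field": "dropoff_datetime",
          "buckets": 20
        }
      }
    }
  }
},
{
  "name": "date_histogram_agg",
  "operation-type": "search",
  "body": {
    "size": 0,
    "aggs": {
      "dropoffs_over_time": {
        "date_histogram": {
          "field": "dropoff_datetime",
          "interval": "month"
        }
      }
    }
  }
}
```

The point of the pairing is that `auto_date_histogram` picks its own interval to stay within the requested number of buckets, while `date_histogram` is given a fixed interval chosen to produce a comparable bucket count.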

To test, I ran Rally locally using esrally --track-path=/Users/paulsanwald/Code/rally-tracks/nyc_taxis. Running against the full NYC taxi data set on my local laptop produced a few errors that I was unable to reproduce with a subset of the dataset, so I benchmarked against the subset, adjusting the number of buckets to correspond to date_histogram to ensure an apples-to-apples comparison. Here are the results from a local run:

|   Lap |                        Metric |           Operation |      Value |   Unit |
|------:|------------------------------:|--------------------:|-----------:|-------:|
|   All |                Min Throughput |       autohisto_agg |       2.01 |  ops/s |
|   All |             Median Throughput |       autohisto_agg |       2.02 |  ops/s |
|   All |                Max Throughput |       autohisto_agg |       2.04 |  ops/s |
|   All |       50th percentile latency |       autohisto_agg |    7.04785 |     ms |
|   All |       90th percentile latency |       autohisto_agg |    9.29401 |     ms |
|   All |       99th percentile latency |       autohisto_agg |    9.47358 |     ms |
|   All |      100th percentile latency |       autohisto_agg |    9.50069 |     ms |
|   All |  50th percentile service time |       autohisto_agg |    4.03652 |     ms |
|   All |  90th percentile service time |       autohisto_agg |    4.29146 |     ms |
|   All |  99th percentile service time |       autohisto_agg |    5.12963 |     ms |
|   All | 100th percentile service time |       autohisto_agg |    6.79592 |     ms |
|   All |                    error rate |       autohisto_agg |          0 |      % |
|   All |                Min Throughput |  date_histogram_agg |       2.01 |  ops/s |
|   All |             Median Throughput |  date_histogram_agg |       2.02 |  ops/s |
|   All |                Max Throughput |  date_histogram_agg |       2.04 |  ops/s |
|   All |       50th percentile latency |  date_histogram_agg |    8.33197 |     ms |
|   All |       90th percentile latency |  date_histogram_agg |    8.89726 |     ms |
|   All |       99th percentile latency |  date_histogram_agg |    9.02826 |     ms |
|   All |      100th percentile latency |  date_histogram_agg |    9.03734 |     ms |
|   All |  50th percentile service time |  date_histogram_agg |    3.80297 |     ms |
|   All |  90th percentile service time |  date_histogram_agg |    3.93656 |     ms |
|   All |  99th percentile service time |  date_histogram_agg |    4.38513 |     ms |
|   All | 100th percentile service time |  date_histogram_agg |     4.5412 |     ms |
|   All |                    error rate |  date_histogram_agg |          0 |      % |

@danielmitterdorfer
Member

The changes look fine to me. I just have a few comments:

Running against the full NYC taxi data set on my local laptop produced a few errors that I was unable to reproduce with a subset of the dataset, so I benchmarked against the subset, adjusting the number of buckets to correspond to date_histogram to ensure an apples-to-apples comparison.

Does this mean that with the full data set there will be errors? I read the discussion in elastic/elasticsearch#28993 and as far as I understand it, there will be errors with the full data set. In that case I think it might make sense to combine the aggregation with e.g. a query to reduce the amount of data and avoid the errors?
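As an illustration of this suggestion (a hypothetical sketch, not necessarily the change that ended up in the PR), the aggregation could be restricted to a date range so the number of candidate buckets stays bounded; the field name and date range below are assumptions:

```json
{
  "size": 0,
  "query": {
    "range": {
      "dropoff_datetime": {
        "gte": "2015-01-01 00:00:00",
        "lt": "2015-02-01 00:00:00"
      }
    }
  },
  "aggs": {
    "dropoffs_over_time": {
      "auto_date_histogram": {
        "field": "dropoff_datetime",
        "buckets": 20
      }
    }
  }
}
```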

My understanding is also that we need to hold off on merging this until the Elasticsearch PR is available on master.

@colings86

@pcsanwald I'm interested in what the errors were with the full dataset. Were they the bucket limit errors we are discussing in elastic/elasticsearch#28993?

Otherwise, could we add a second pair of operations that run a sub-aggregation under the histogram? Something like a terms agg on a low-cardinality field with an avg fare under it? The auto_date_histogram runs in breadth-first mode, so the evaluation of the sub-aggregations is delayed until after collection is finished, and I am interested in what effect this has on performance.
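A rough sketch of the kind of operation being suggested here (the concrete fields are assumptions based on the nyc_taxis mapping: payment_type as a low-cardinality keyword field and fare_amount for the average):

```json
{
  "size": 0,
  "aggs": {
    "dropoffs_over_time": {
      "auto_date_histogram": {
        "field": "dropoff_datetime",
        "buckets": 20
      },
      "aggs": {
        "by_payment_type": {
          "terms": { "field": "payment_type" },
          "aggs": {
            "avg_fare": { "avg": { "field": "fare_amount" } }
          }
        }
      }
    }
  }
}
```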

@danielmitterdorfer you are right, this will need to be merged once elastic/elasticsearch#28993 is in master; otherwise your nightly benchmarks will start failing.

@pcsanwald
Contributor Author

pcsanwald commented Apr 3, 2018

@colings86 @danielmitterdorfer I was struggling to debug the error in Rally (I couldn't find the actual error messages), and I also wasn't able to load the full dataset on a vanilla ES server running locally (which I likewise struggled to debug). That led me to suspect that the problem was with my local setup.

I like the idea of limiting the query further and will update the PR and re-test.

@danielmitterdorfer
Member

@colings86 @danielmitterdorfer I was struggling to debug the error in Rally (I couldn't find the actual error messages), and I also wasn't able to load the full dataset on a vanilla ES server running locally (which I likewise struggled to debug). That led me to suspect that the problem was with my local setup.

One tip @pcsanwald: by default, Rally will just continue on errors and you will only see that something bad has happened if you look at the "error rate". However, you can enforce a stricter mode by specifying --on-error=abort on the command line. In that case you should get a more detailed error message as soon as the first error occurs, and the benchmark is stopped. Should the error message not be helpful enough, please just tell us here and we'll see what we can do to improve it.

@pcsanwald
Contributor Author

Thanks @danielmitterdorfer. The part I was struggling with was how to get a more specific stack trace for runs where the error rate for a query was 100%; I was looking in the logs and couldn't find the errors. I'll use --on-error=abort, which will hopefully also speed up the dev cycle for this :)

@pcsanwald
Contributor Author

Pushed changes to limit the aggregations by query as @danielmitterdorfer and @colings86 suggested, and re-ran successfully with the full NYC taxi dataset. Results here:

|   Lap |                          Metric |           Operation |     Value |   Unit |
|------:|--------------------------------:|--------------------:|----------:|-------:|
|   All |                  Min Throughput |       autohisto_agg |      1.96 |  ops/s |
|   All |               Median Throughput |       autohisto_agg |         2 |  ops/s |
|   All |                  Max Throughput |       autohisto_agg |      2.01 |  ops/s |
|   All |         50th percentile latency |       autohisto_agg |   531.618 |     ms |
|   All |         90th percentile latency |       autohisto_agg |   964.313 |     ms |
|   All |         99th percentile latency |       autohisto_agg |   1020.03 |     ms |
|   All |        100th percentile latency |       autohisto_agg |   1077.56 |     ms |
|   All |    50th percentile service time |       autohisto_agg |   465.907 |     ms |
|   All |    90th percentile service time |       autohisto_agg |   522.152 |     ms |
|   All |    99th percentile service time |       autohisto_agg |   618.826 |     ms |
|   All |   100th percentile service time |       autohisto_agg |   788.446 |     ms |
|   All |                      error rate |       autohisto_agg |         0 |      % |
|   All |                  Min Throughput |  date_histogram_agg |      1.88 |  ops/s |
|   All |               Median Throughput |  date_histogram_agg |       1.9 |  ops/s |
|   All |                  Max Throughput |  date_histogram_agg |      1.99 |  ops/s |
|   All |         50th percentile latency |  date_histogram_agg |   3079.44 |     ms |
|   All |         90th percentile latency |  date_histogram_agg |   4391.78 |     ms |
|   All |         99th percentile latency |  date_histogram_agg |   5058.75 |     ms |
|   All |        100th percentile latency |  date_histogram_agg |   5073.23 |     ms |
|   All |    50th percentile service time |  date_histogram_agg |   539.676 |     ms |
|   All |    90th percentile service time |  date_histogram_agg |   589.096 |     ms |
|   All |    99th percentile service time |  date_histogram_agg |   669.835 |     ms |
|   All |   100th percentile service time |  date_histogram_agg |   692.903 |     ms |
|   All |                      error rate |  date_histogram_agg |         0 |      % |

@danielmitterdorfer
Member

Changes look fine to me. One remark: your target throughput seems a bit too high for your hardware, but we can adjust this for our nightly hardware once you have merged (see the background info on target throughput and latency, and also my ElasticON talk, starting at around 11:30 into the video / slide 15).
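For reference, the target throughput is a per-task setting in the track's challenge schedule; a minimal sketch, where the client count and iteration numbers are placeholders:

```json
{
  "operation": "autohisto_agg",
  "clients": 1,
  "warmup-iterations": 100,
  "iterations": 100,
  "target-throughput": 2
}
```

If the cluster cannot sustain the target throughput, requests queue up and the reported latency grows well beyond the service time, which is the kind of divergence visible in the tables above.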

@danielmitterdorfer
Member

@pcsanwald is there still interest in merging this PR?

@pcsanwald
Contributor Author

@danielmitterdorfer - yes, especially now that the auto-interval histogram work is merged. Let me "dust this off" and re-test, etc.

@danielmitterdorfer
Member

Great! Thank you Paul.

@pcsanwald
Contributor Author

@danielmitterdorfer I re-ran this locally and am happy with the results for this dataset; I might also add a query over geonames in order to benchmark cases where we have sparse fields (and thus lots of 0 buckets in the aggregation result), but I will open a separate PR for that.

@danielmitterdorfer
Member

@pcsanwald you're free to merge this at any time. It will automatically be run in our nightly benchmarks as soon as it is merged but we need to create a new chart for it (I'll take care of that).

pcsanwald merged commit 1b05cc8 into elastic:master on Sep 17, 2018
@danielmitterdorfer
Member

I've added charts now for our nightly and release benchmarks. Thanks for your PR @pcsanwald!
