@nik9000 nik9000 commented May 7, 2020

This wires `auto_date_histogram` into the rounding optimization that I
built in #55559. This should significantly speed up any
`auto_date_histogram` with a `time_zone` on it.
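
The thread does not show the optimization itself, so the following is only a rough illustration of why rounding with a time zone can cost more than rounding in UTC, and what a cheaper path might look like. It is not the code from #55559. The class and method names (`TzRoundingSketch`, `roundToDaySlow`, `roundToDayFixed`, `hasTransitionBetween`) are hypothetical, and the idea of falling back to a fixed offset when the data range contains no offset transition is an assumption made for the sketch.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;
import java.time.zone.ZoneOffsetTransition;

// Hypothetical sketch, not Elasticsearch code.
class TzRoundingSketch {
    static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    // Slow path: consult the zone rules for every single value.
    static long roundToDaySlow(long utcMillis, ZoneId zone) {
        return Instant.ofEpochMilli(utcMillis)
            .atZone(zone)
            .truncatedTo(ChronoUnit.DAYS)
            .toInstant()
            .toEpochMilli();
    }

    // Fast path: plain arithmetic, valid only while the offset is constant.
    static long roundToDayFixed(long utcMillis, long offsetMillis) {
        long local = utcMillis + offsetMillis;
        return Math.floorDiv(local, DAY_MILLIS) * DAY_MILLIS - offsetMillis;
    }

    // True if the zone changes offset anywhere in [minMillis - 2 days, maxMillis].
    // The two-day margin is a conservative choice so that the rounded-down
    // start of the first day is also covered by the check.
    static boolean hasTransitionBetween(ZoneId zone, long minMillis, long maxMillis) {
        Instant from = Instant.ofEpochMilli(minMillis - 2 * DAY_MILLIS);
        ZoneOffsetTransition next = zone.getRules().nextTransition(from);
        return next != null && next.getInstant().toEpochMilli() <= maxMillis;
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("America/New_York");
        long min = Instant.parse("2020-07-01T00:00:00Z").toEpochMilli();
        long max = Instant.parse("2020-07-31T00:00:00Z").toEpochMilli();
        long value = Instant.parse("2020-07-15T12:34:56Z").toEpochMilli();

        long rounded;
        if (hasTransitionBetween(zone, min, max)) {
            rounded = roundToDaySlow(value, zone);
        } else {
            // Look the offset up once, then reuse it for every value in range.
            long offsetMillis =
                zone.getRules().getOffset(Instant.ofEpochMilli(min)).getTotalSeconds() * 1000L;
            rounded = roundToDayFixed(value, offsetMillis);
        }
        System.out.println(Instant.ofEpochMilli(rounded));
    }
}
```

The point of the sketch is only that the per-value work shrinks once the zone's behavior over the data range is known up front; the real change may handle ranges that do contain transitions quite differently.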

@elasticmachine

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine added the Team:Analytics label May 7, 2020

nik9000 commented May 7, 2020

I'm not actually sure how to make a proper unit test that this is "plugged in". I will add some benchmark results with it eventually though. My desktop is currently busy benchmarking #56371.


@not-napoleon not-napoleon left a comment

LGTM!


nik9000 commented May 9, 2020

I've finally finished the benchmarks. When there is a time zone and the index contains a daylight saving time transition, this cuts the runtime of `auto_date_histogram` by roughly 57% (about 34 seconds down to about 14.5 seconds in the tables below). My benchmarks don't cover the case where there isn't a transition, but it's probably another ~30% faster again if this behaves anything like `date_histogram`.

Before:

|                                                 Min Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                              Median Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                                 Max Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                        50th percentile latency |                      auto_date_histogram |     11464.8 |     ms |
|                                        90th percentile latency |                      auto_date_histogram |       13392 |     ms |
|                                       100th percentile latency |                      auto_date_histogram |     13507.1 |     ms |
|                                   50th percentile service time |                      auto_date_histogram |     10032.3 |     ms |
|                                   90th percentile service time |                      auto_date_histogram |     10167.2 |     ms |
|                                  100th percentile service time |                      auto_date_histogram |     10735.9 |     ms |
|                                                     error rate |                      auto_date_histogram |           0 |      % |
|                                                 Min Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                              Median Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                                 Max Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                        50th percentile latency |              auto_date_histogram_with_tz |     34068.5 |     ms |
|                                        90th percentile latency |              auto_date_histogram_with_tz |     34415.6 |     ms |
|                                       100th percentile latency |              auto_date_histogram_with_tz |     34798.8 |     ms |
|                                   50th percentile service time |              auto_date_histogram_with_tz |     34067.6 |     ms |
|                                   90th percentile service time |              auto_date_histogram_with_tz |     34414.5 |     ms |
|                                  100th percentile service time |              auto_date_histogram_with_tz |     34798.2 |     ms |
|                                                     error rate |              auto_date_histogram_with_tz |           0 |      % |

After:

|                                                 Min Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                              Median Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                                 Max Throughput |                      auto_date_histogram |         0.1 |  ops/s |
|                                        50th percentile latency |                      auto_date_histogram |     24352.4 |     ms |
|                                        90th percentile latency |                      auto_date_histogram |     33612.9 |     ms |
|                                       100th percentile latency |                      auto_date_histogram |     35688.8 |     ms |
|                                   50th percentile service time |                      auto_date_histogram |     10414.5 |     ms |
|                                   90th percentile service time |                      auto_date_histogram |     10539.9 |     ms |
|                                  100th percentile service time |                      auto_date_histogram |     10809.3 |     ms |
|                                                     error rate |                      auto_date_histogram |           0 |      % |
|                                                 Min Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                              Median Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                                 Max Throughput |              auto_date_histogram_with_tz |        0.03 |  ops/s |
|                                        50th percentile latency |              auto_date_histogram_with_tz |     14550.4 |     ms |
|                                        90th percentile latency |              auto_date_histogram_with_tz |     14969.7 |     ms |
|                                       100th percentile latency |              auto_date_histogram_with_tz |       15375 |     ms |
|                                   50th percentile service time |              auto_date_histogram_with_tz |     14531.7 |     ms |
|                                   90th percentile service time |              auto_date_histogram_with_tz |     14949.2 |     ms |
|                                  100th percentile service time |              auto_date_histogram_with_tz |     15354.9 |     ms |
|                                                     error rate |              auto_date_histogram_with_tz |           0 |      % |

Note: The "after" numbers for auto_date_histogram without a time zone are gnarly because I'm still working on dialing in the throughput to target.


nik9000 commented May 9, 2020

> Note: The "after" numbers for auto_date_histogram without a time zone are gnarly because I'm still working on dialing in the throughput to target.

Actually, neither set of numbers is quite right. But they can give you a sense that this particular `auto_date_histogram` takes about 10 seconds without a time zone. With a time zone it takes about 35 seconds before this PR and about 15 seconds after it. It would probably take about 10 seconds if there were no daylight saving time transition across the index.
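
As a rough check on those figures, the 50th percentile service times for `auto_date_histogram_with_tz` in the two tables can be compared directly. The sketch below does only that arithmetic; the class and method names are hypothetical, and the two input values are copied from the "Before" and "After" tables.

```java
// Hypothetical helper for comparing two benchmark timings.
class SpeedupCheck {
    // Percentage by which the runtime shrank, relative to the "before" value.
    static double reductionPercent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    // How many times faster the "after" run is.
    static double speedupFactor(double beforeMs, double afterMs) {
        return beforeMs / afterMs;
    }

    public static void main(String[] args) {
        double before = 34067.6; // 50th percentile service time, before (ms)
        double after = 14531.7;  // 50th percentile service time, after (ms)
        System.out.printf("reduction: %.1f%%%n", reductionPercent(before, after));
        System.out.printf("speedup:   %.2fx%n", speedupFactor(before, after));
    }
}
```

With these two inputs the reduction works out to a little over 57%, or a speedup of roughly 2.3x, which matches the "about 35 seconds before and 15 after" summary.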

@nik9000 nik9000 merged commit 12e9218 into elastic:master May 9, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 9, 2020
nik9000 added a commit that referenced this pull request May 9, 2020
Labels

:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.9.0 v8.0.0-alpha1
