Percentiles aggregation #5323

@jpountz

A percentiles aggregation would make it possible to compute (approximate) values of arbitrary percentiles using the t-digest algorithm. Computing exact percentiles is not reasonably feasible, as it would require shards to stream all values to the node that coordinates search execution, which could amount to gigabytes on a high-cardinality field. t-digest, on the other hand, trades accuracy for memory by summarizing the set of accumulated values, and it has some interesting properties:

  • compression is configurable, meaning that you can configure it for better accuracy at the cost of higher memory usage,
  • accuracy is excellent for extreme percentiles,
  • percentiles are going to be accurate if only a few values were accumulated.
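
To make the trade-off above concrete, here is a rough Python sketch. It is not the real t-digest and not what Elasticsearch would ship: `TinyDigest`, `exact_percentile`, the size bound `4 * n * q * (1 - q) / compression`, and the synthetic `load_time`-like data are all assumptions chosen for illustration. It only shows the general idea: each shard keeps a small list of weighted centroids instead of every value, centroids are kept small near the extremes, and the coordinating node merges the per-shard summaries.

```python
import math
import random


def exact_percentile(values, q):
    """Nearest-rank percentile. Needs every value in one place."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


class TinyDigest:
    """Toy centroid summary in the spirit of t-digest (illustration only)."""

    def __init__(self, compression=100.0):
        self.compression = compression
        self.centroids = []  # (mean, weight) pairs, sorted by mean

    def add_all(self, values):
        self._rebuild(self.centroids + [(float(v), 1.0) for v in values])

    def merge(self, other):
        # What a coordinating node would do with per-shard summaries.
        self._rebuild(self.centroids + other.centroids)

    def _rebuild(self, points):
        points = sorted(points)
        n = sum(w for _, w in points)
        out, seen = [], 0.0  # seen = weight of all closed centroids
        for mean, w in points:
            if out:
                cm, cw = out[-1]
                # Quantile at the middle of the would-be merged centroid.
                q = (seen + (cw + w) / 2.0) / n
                # Assumed size bound: small near q=0 and q=1, larger in
                # the middle, scaled down by a higher compression value.
                limit = 4.0 * n * q * (1.0 - q) / self.compression
                if cw + w <= limit:
                    out[-1] = ((cm * cw + mean * w) / (cw + w), cw + w)
                    continue
                seen += cw
            out.append((mean, w))
        self.centroids = out

    def quantile(self, q):
        # Step-function estimate: mean of the centroid holding rank q*n.
        # (A real t-digest interpolates between centroids.)
        target = q * sum(w for _, w in self.centroids)
        cum = 0.0
        for mean, w in self.centroids:
            cum += w
            if cum >= target:
                return mean
        return self.centroids[-1][0]


# Four hypothetical shards holding synthetic load times.
rng = random.Random(42)
shards = [[rng.uniform(0, 1000) for _ in range(5000)] for _ in range(4)]

merged = TinyDigest(compression=100.0)
for shard_values in shards:
    local = TinyDigest(compression=100.0)
    local.add_all(shard_values)   # built on the shard
    merged.merge(local)           # only centroids travel to the coordinator

all_values = [v for shard in shards for v in shard]
print(len(all_values), "values summarized by", len(merged.centroids), "centroids")
print("p99 exact: ", exact_percentile(all_values, 0.99))
print("p99 approx:", merged.quantile(0.99))
```

With this bound, no centroids are merged while the number of values stays below twice the compression, so the summary is exact for small inputs. Raising `compression` keeps more centroids, which costs memory but tightens the estimates.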

Example request:

{
    "aggs" : {
        "load_time_outlier" : {
            "percentiles" : {
                "field" : "load_time",
                "percents" : [95, 99, 99.9] 
            }
        }
    }
}