Skip to content

Add t-digest functions to Presto#13940

Merged
rongrong merged 2 commits intoprestodb:masterfrom
tdcmeehan:add_tdigest_tmeehanrebase
Jan 22, 2020
Merged

Add t-digest functions to Presto#13940
rongrong merged 2 commits intoprestodb:masterfrom
tdcmeehan:add_tdigest_tmeehanrebase

Conversation

@tdcmeehan
Copy link
Contributor

@tdcmeehan tdcmeehan commented Jan 8, 2020

== RELEASE NOTES ==

General Changes
* Add tdigest_agg, merge(tdigest), value_at_quantile(tdigest, quantile), values_at_quantiles(tdigest, quantiles), quantile_at_value(tdigest, quantile), quantiles_at_values(tdigest, quantile) for creating, merging, and querying t-digests

SPI Changes
* isHidden attribute on AggregationFunction and ScalarFunction removed and replaced with visibility

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jan 8, 2020

CLA Check
The committers are authorized under a signed CLA.

  • ✅ Timothy Meehan (e69e879b264a4d345dcefadb9b7cf8f9c9df255b, b41ab1c6e11ffcad85dc5a604ed6c81a85aed03a)

@tdcmeehan tdcmeehan force-pushed the add_tdigest_tmeehanrebase branch 2 times, most recently from 706cb75 to 7e2e032 Compare January 9, 2020 22:11
@tdcmeehan tdcmeehan changed the title [WIP] Add t-digest functions to Presto <<<FIXUPS AFTER REBASE>>> Add t-digest functions to Presto Jan 9, 2020
@tdcmeehan tdcmeehan requested a review from rongrong January 9, 2020 23:31
Copy link
Contributor

@rongrong rongrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good. Didn't check the first commit. Let me know if you think I should take a look at that again.

@wenleix
Copy link
Contributor

wenleix commented Jan 13, 2020

@tdcmeehan tdcmeehan force-pushed the add_tdigest_tmeehanrebase branch 5 times, most recently from a6e2903 to 6e77a33 Compare January 15, 2020 22:33
@tdcmeehan tdcmeehan requested a review from rongrong January 17, 2020 05:25
@wenleix
Copy link
Contributor

wenleix commented Jan 21, 2020

Assume @rongrong has no further comment, @tdcmeehan can you rebase it so @rongrong or I can merge it?

@tdcmeehan
Copy link
Contributor Author

@rongrong do you also approve Migrate function isHidden to new visibility enum?

@tdcmeehan tdcmeehan force-pushed the add_tdigest_tmeehanrebase branch from 6e77a33 to 731953a Compare January 21, 2020 16:47
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering, are we going to have "alpha" functions? Is it better to name the value "experimental" as explained in the comment?

@rongrong
Copy link
Contributor

@rongrong do you also approve Migrate function isHidden to new visibility enum?

Yes the commit looks good in general. Please think about the naming. Whether it makes more sense to call it EXPERIMENTAL. Thanks for working on this! ^_^

@tdcmeehan tdcmeehan force-pushed the add_tdigest_tmeehanrebase branch 3 times, most recently from ef122b9 to 43451a0 Compare January 22, 2020 03:40
tdcmeehan and others added 2 commits January 21, 2020 19:41
SqlFunctionVisibility can take three states:
- public
- hidden
- beta

Public visibility is equivalent to isHidden=false and hidden visibility
is equivalent to isHidden=true.  Beta visibility by default is
equivalent to hidden, however can be toggled by included session and
system properties to make it public.
Add scalar and aggregation functions to Presto repo. Scalar functions
allow retrieving values at a given quantile and quantiles at a given
value. Aggregation functions allow creating a t-digest from a block
of doubles and merging a block of t-digests.

Benchmark testing results for aggregation functions displayed below:

BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkQuantileDigestAggregation	1	avgt	20	413.63	± 16.67	ms/op
BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkTDigestAggregation	1	avgt	20	118.57	±  2.25	ms/op
BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkQuantileDigestAggregation	2	avgt	20	801.19	± 10.97	ms/op
BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkTDigestAggregation	2	avgt	20	242.89	±  9.71	ms/op
BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkQuantileDigestAggregation	3	avgt	20     1156.49	± 14.87	ms/op
BenchmarkDoubleStatisticalDigestAggregationFunctions.benchmarkTDigestAggregation	3	avgt	20	359.56	± 11.72	ms/op

Co-authored-by: Gaston Arguindegui <gastonar@fb.com>
@tdcmeehan tdcmeehan force-pushed the add_tdigest_tmeehanrebase branch from 43451a0 to 222936d Compare January 22, 2020 03:41
@rongrong rongrong merged commit fb37c6e into prestodb:master Jan 22, 2020
@caithagoras caithagoras mentioned this pull request Feb 20, 2020
8 tasks
amellnik added a commit to amellnik/presto that referenced this pull request Nov 30, 2021
This adds documentation for the t-digest type (prestodb#15674), similar to
what already exists for the QDigest type. The t-digest library was
added in prestodb#12961 and functions to use it were added in prestodb#13940.
highker pushed a commit that referenced this pull request Dec 16, 2021
This adds documentation for the t-digest type (#15674), similar to
what already exists for the QDigest type. The t-digest library was
added in #12961 and functions to use it were added in #13940.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants