Toward release 0.5.0 - fast TDigest and Spark-3 Aggregator API #20

Merged: erikerlandson merged 69 commits into isarn:develop from erikerlandson:dev-0.5.0 on Sep 15, 2020

Conversation

@erikerlandson
Member

The goal for 0.5.0 is to adopt the new, faster TDigest from isarn-sketches and, at the same time, take advantage of the improvements to user-defined aggregation in Spark 3.0. I expect the combination of these two improvements to approach a 150x performance increase.

@erikerlandson mentioned this pull request Jun 26, 2020
@erikerlandson
Member Author

for posterity: apache/spark#28983

@erikerlandson
Member Author

This is ready to merge and publish, but I'm going to wait until Spark 3.0.1 for proper array type support in aggregations (SPARK-32159).

@erikerlandson merged commit e7d3136 into isarn:develop Sep 15, 2020