Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impl convert_to_state for GroupsAccumulatorAdapter (faster median for high cardinality aggregates) #11827

Merged
merged 13 commits into from
Sep 15, 2024

Conversation

Rachelint
Copy link
Contributor

@Rachelint Rachelint commented Aug 5, 2024

Which issue does this PR close?

Closes #11819

Rationale for this change

See #11819

What changes are included in this PR?

Impl convert_to_state for GroupsAccumulatorAdapter.

Are these changes tested?

Test by exists.

Are there any user-facing changes?

No.

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Aug 5, 2024
@@ -342,6 +374,50 @@ impl GroupsAccumulator for GroupsAccumulatorAdapter {
fn size(&self) -> usize {
self.allocation_bytes
}

fn convert_to_state(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Rachelint

This might be interesting to you: #11825

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Rachelint

This might be interesting to you: #11825

Seems interesting, learning this pr (still not so familiar with arrow ops).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it takes some getting used to thinking in terms of Arrays and masks, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for delay, fixed.

Copy link
Contributor Author

@Rachelint Rachelint Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the quick way using filtered_null_mask + set_nulls will just set the filtered row to be null, but not change the row num.

//  `filtered_null_mask` + `set_nulls`
  left: PrimitiveArray<Int32>
[
  null,
  null,
  6,
  null,
  null,
  null,
]

 // `compute::filter`
 right: PrimitiveArray<Int32>
[
  null,
  6,
]

Maybe this will make difference to the correctness for some accumulators? For example, an udf count which thinks a null row as 1?

@Rachelint Rachelint force-pushed the support-convert-to-state-for-adapter branch from f29d0bb to 76f5aba Compare August 13, 2024 15:06
@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions and removed logical-expr Logical plan and expressions labels Aug 13, 2024
@Rachelint Rachelint force-pushed the support-convert-to-state-for-adapter branch from 76f5aba to be5316f Compare September 4, 2024 15:38
@github-actions github-actions bot removed the physical-expr Physical Expressions label Sep 4, 2024
@Rachelint Rachelint force-pushed the support-convert-to-state-for-adapter branch from be5316f to 0396fc4 Compare September 4, 2024 15:39
@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) and removed sqllogictest SQL Logic Tests (.slt) labels Sep 4, 2024
@github-actions github-actions bot removed the functions label Sep 5, 2024
@Rachelint Rachelint marked this pull request as ready for review September 5, 2024 04:26
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Rachelint -- this looks very cool . I am sorry for the delay in the review

It is my understanding that this will allow aggregates that do not yet implement GroupsAccumulator to benefit from the intermediate aggregate state.

Thus the primary benefit of this code is to make aggregates on such queries faster.

Unfortunately I don't think we have any examples of such aggregates in the benchmarks (e.g. calculating median or approx_median). I will make a PR to add some to see if we can measure improvement of this PR

cc @korowa

@Rachelint
Copy link
Contributor Author

Thank you @Rachelint -- this looks very cool . I am sorry for the delay in the review

It is my understanding that this will allow aggregates that do not yet implement GroupsAccumulator to benefit from the intermediate aggregate state.

Thus the primary benefit of this code is to make aggregates on such queries faster.

Unfortunately I don't think we have any examples of such aggregates in the benchmarks (e.g. calculating median or approx_median). I will make a PR to add some to see if we can measure improvement of this PR

cc @korowa

Sounds great! And we can continue to improve the performance after having such benchmarks.

@alamb
Copy link
Contributor

alamb commented Sep 11, 2024

I created a proposal in #12438

I tested a little locally on the PR here and it seems like this PR does not improve the performance much.

@Rachelint
Copy link
Contributor Author

I created a proposal in #12438

I tested a little locally on the PR here and it seems like this PR does not improve the performance much.

Ok, I will check it in my local soon.

@alamb
Copy link
Contributor

alamb commented Sep 11, 2024

I created a proposal in #12438
I tested a little locally on the PR here and it seems like this PR does not improve the performance much.

Ok, I will check it in my local soon.

It may be that the query doesn't show the correct pattern of high cardinality intermediate aggregates, btw. I am not sure

@Rachelint
Copy link
Contributor Author

Rachelint commented Sep 12, 2024

I found I can' t run it successfully in my local... #12438
The memory usage get higher and higher, and swap seems to be triggered, and make it even more slower... Finally, it almost can't finish forever...
I am trying to find the reason.

@alamb
Copy link
Contributor

alamb commented Sep 12, 2024

I am trying to find the reason.

It is probably because there are 17M groups which each is holding multiple aggregates each with non trivial state 🤔

@Rachelint
Copy link
Contributor Author

Rachelint commented Sep 12, 2024

I am trying to find the reason.

It is probably because there are 17M groups which each is holding multiple aggregates each with non trivial state 🤔

Yes... I found the state is much larger than the simple accumulators(e.g. count, sum, average).
For example, the digest for approx_percentile_cont:

pub struct TDigest {
    centroids: Vec<Centroid>,
    max_size: usize,
    sum: f64,
    count: u64,
    max: f64,
    min: f64,
}

@Rachelint
Copy link
Contributor Author

I create a subset of hits_partitioned, and using this smaller dataset, it can success to run the q32 like high cardinality query now.

SELECT "WatchID", "ClientIP", COUNT(*) AS c, approx_median("ResponseStartTiming") tmed FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;

@Rachelint
Copy link
Contributor Author

This is the number from my local test with a subset(15%) of hits_partitioned:

// Main(3ece7a736193)
Q5: SELECT "WatchID", "ClientIP", COUNT(*) AS c, approx_median("ResponseStartTiming") tmed FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
Query 5 iteration 0 took 17567.8 ms and returned 10 rows
Query 5 iteration 1 took 17462.8 ms and returned 10 rows
Query 5 iteration 2 took 17442.4 ms and returned 10 rows
Query 5 iteration 3 took 17569.0 ms and returned 10 rows
Query 5 iteration 4 took 17527.8 ms and returned 10 rows

// This pr's branch(have rebased 3ece7a736193)
Q5: SELECT "WatchID", "ClientIP", COUNT(*) AS c, approx_median("ResponseStartTiming") tmed FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
Query 5 iteration 0 took 12093.5 ms and returned 10 rows
Query 5 iteration 1 took 12234.2 ms and returned 10 rows
Query 5 iteration 2 took 11951.3 ms and returned 10 rows
Query 5 iteration 3 took 12066.6 ms and returned 10 rows
Query 5 iteration 4 took 11963.1 ms and returned 10 rows

@alamb
Copy link
Contributor

alamb commented Sep 13, 2024

I just ran this with some new queries on #12438 and this branch goes about 2x faster

  • Elapsed 9.777 seconds. --> Elapsed 5.361 seconds.
  • Elapsed 6.919 seconds. --> Elapsed 3.952 seconds.
./datafusion-cli-intermediate-state   -f tq1.sql
DataFusion CLI v41.0.0
0 row(s) fetched.
Elapsed 0.019 seconds.

+-------------+---------------------+---+------+------+------+
| ClientIP    | WatchID             | c | tmin | tp95 | tmax |
+-------------+---------------------+---+------+------+------+
| 1611957945  | 6655575552203051303 | 2 | 0    | 0    | 0    |
| -1402644643 | 8566928176839891583 | 2 | 0    | 0    | 0    |
+-------------+---------------------+---+------+------+------+
2 row(s) fetched.
Elapsed 5.361 seconds.

+-------------+---------------------+---+------+------+------+
| ClientIP    | WatchID             | c | tmin | tmed | tmax |
+-------------+---------------------+---+------+------+------+
| 1611957945  | 6655575552203051303 | 2 | 0    | 0    | 0    |
| -1402644643 | 8566928176839891583 | 2 | 0    | 0    | 0    |
+-------------+---------------------+---+------+------+------+
2 row(s) fetched.
Elapsed 3.952 seconds.

andrewlamb@Andrews-MacBook-Pro-2 Downloads % datafusion-cli    -f tq1.sql
datafusion-cli    -f tq1.sql
DataFusion CLI v41.0.0
0 row(s) fetched.
Elapsed 0.020 seconds.

+-------------+---------------------+---+------+------+------+
| ClientIP    | WatchID             | c | tmin | tp95 | tmax |
+-------------+---------------------+---+------+------+------+
| 1611957945  | 6655575552203051303 | 2 | 0    | 0    | 0    |
| -1402644643 | 8566928176839891583 | 2 | 0    | 0    | 0    |
+-------------+---------------------+---+------+------+------+
2 row(s) fetched.
Elapsed 9.777 seconds.

+-------------+---------------------+---+------+------+------+
| ClientIP    | WatchID             | c | tmin | tmed | tmax |
+-------------+---------------------+---+------+------+------+
| 1611957945  | 6655575552203051303 | 2 | 0    | 0    | 0    |
| -1402644643 | 8566928176839891583 | 2 | 0    | 0    | 0    |
+-------------+---------------------+---+------+------+------+
2 row(s) fetched.
Elapsed 6.919 seconds.

@alamb alamb changed the title Impl convert_to_state for GroupsAccumulatorAdapter. Impl convert_to_state for GroupsAccumulatorAdapter (faster median for high cardinality aggregates) Sep 13, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Rachelint and @korowa

While I think the real way to make MEDIAN and APPROX_PERCENTILE_CONT etc faster is to implement GroupsAccumulator, this PR makes them faster for certain cases.

Nice work. Thanks again and sorry for the delay in reviewing while we sorted out benchmarking

let mut results = vec![];
for row_idx in 0..num_rows {
// Create the empty accumulator for converting
let mut converted_accumulator = (self.factory)()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a follow on PR I wonder if we could potentially to improve performance by adding a clear() or reset() type function to each accumulator to avoid having to create a new accumulator for each group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I want to reuse the converted_accumulator at the beginning, but it is not ensure that the state will be reset after calling state.
It is clever to add such function to do the reset work explicitly.

@alamb
Copy link
Contributor

alamb commented Sep 15, 2024

The 42.0.0 release has been cut -- let's start the code flowing!

@alamb alamb merged commit f48e0b2 into apache:main Sep 15, 2024
26 checks passed
@alamb
Copy link
Contributor

alamb commented Sep 15, 2024

Thanks again @Rachelint

@Rachelint Rachelint deleted the support-convert-to-state-for-adapter branch September 16, 2024 08:00
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve performance of other non GroupsAdapter aggregates: implement convert_to_state
2 participants