Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert stddev and stddev_pop to UDAF #10834

Merged
merged 11 commits into from
Jun 9, 2024

Conversation

goldmedal
Copy link
Contributor

Which issue does this PR close?

Closes #10827 .
I converted stddev_pop to UDAF, too.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

no

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions core Core DataFusion crate labels Jun 8, 2024
Comment on lines 29 to 30
/// An accumulator to compute the average
#[derive(Debug)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StddevAccumulator is used by correlation. That's why I keep it.
We should remove it after converting correlation to UDAF.

@@ -939,32 +847,6 @@ mod tests {
assert!(observed.is_err());
}

#[test]
fn test_stddev_return_type() -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those tests are covered by sqllogictests testing. I didn't add similar tests for UDAF.

@@ -747,82 +731,6 @@ mod tests {
Ok(())
}

#[test]
fn test_stddev_expr() -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. It's already covered by sqllogictests.

}

#[test]
fn test_stddev_pop_expr() -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. It's already covered by sqllogictests.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 8, 2024

fn accumulator(&self, acc_args: AccumulatorArgs) -> Result<Box<dyn Accumulator>> {
if acc_args.is_distinct {
return internal_err!("STDDEV_POP(DISTINCT) aggregations are not available");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be not_impl_err?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @comphead. Indeed. It's better.

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Ok(Box::new(StddevAccumulator::try_new(StatsType::Sample)?))
}

fn create_sliding_accumulator(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could rely on the default impl, since they are equivalent

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fn create_sliding_accumulator(
&self,
args: AccumulatorArgs,
) -> Result<Box<dyn Accumulator>> {
self.accumulator(args)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jayzhan211, It's better.

&self,
args: AccumulatorArgs,
) -> Result<Box<dyn Accumulator>> {
self.accumulator(args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@jayzhan211
Copy link
Contributor

Thanks @goldmedal and @comphead

@jayzhan211 jayzhan211 merged commit 8b1f06b into apache:main Jun 9, 2024
23 checks passed
@goldmedal
Copy link
Contributor Author

Thanks again @jayzhan211 @comphead

@goldmedal goldmedal deleted the feature/10827-stddev-udaf branch June 9, 2024 03:43
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* add stddev and stddev_pot udaf

* remove aggregation function stddev and stddev_pop

* register func and modified return type

* cargo fmt

* regen proto

* cargo clippy

* fix window function support

* cargo fmt

* throw not_impl_err instead

* use default sliding accumulator
vgapeyev added a commit to sdf-labs/sql-functions that referenced this pull request Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert stddev to udaf
3 participants