Skip to content

Conversation

@peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Oct 18, 2018

What changes were proposed in this pull request?

Without this PR some UDAFs like GenericUDAFPercentileApprox can throw an exception because expecting a constant parameter (object inspector) as a particular argument.

The exception is thrown because toPrettySQL call in ResolveAliases analyzer rule transforms a Literal parameter to a PrettyAttribute which is then transformed to an ObjectInspector instead of a ConstantObjectInspector.
The exception comes from getEvaluator method of GenericUDAFPercentileApprox that actually shouldn't be called during toPrettySQL transformation. The reason why it is called are the non lazy fields in HiveUDAFFunction.

This PR makes all fields of HiveUDAFFunction lazy.

How was this patch tested?

added new UT

Copy link
Contributor

@mgaido91 mgaido91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peter-toth may you please elaborate a bit more the change in the PR description? What happened before the PR? Why? Thanks.

}
}

test("constant argument expecting Hive UDF") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may you please reference the JIRA?

test("constant argument expecting Hive UDF") {
val testData = spark.range(10).toDF()
withTempView("inputTable") {
testData.createOrReplaceTempView("inputTable")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can move here spark.range(10)

Change-Id: If5a53df72c616d8b54662619426f5c8f34a9c0c0
resolver.getEvaluator(parameterInfo)
}

private case class Mode(evaluator: GenericUDAFEvaluator, objectInspector: ObjectInspector)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluator and object inspector need to be initialized together

Copy link
Contributor

@mgaido91 mgaido91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change makes sense to me. I have just one minor comment about naming.

cc @cloud-fan @dongjoon-hyun do you think we can trigger a build?

resolver.getEvaluator(parameterInfo)
}

private case class Mode(evaluator: GenericUDAFEvaluator, objectInspector: ObjectInspector)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can find a better name for this class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just PartialEvaluator?

@peter-toth
Copy link
Contributor Author

cc @cloud-fan I believe this is a regression because https://issues.apache.org/jira/browse/SPARK-18186

@cloud-fan
Copy link
Contributor

OK to test

@dongjoon-hyun
Copy link
Member

ok to test

@SparkQA
Copy link

SparkQA commented Oct 18, 2018

Test build #97547 has finished for PR 22766 at commit 00115ff.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Change-Id: I7379b21c00336e389af9b5186fac71c566699c14
@SparkQA
Copy link

SparkQA commented Oct 18, 2018

Test build #97549 has finished for PR 22766 at commit 844d0ba.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Oct 18, 2018

Test build #97558 has finished for PR 22766 at commit 844d0ba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

GenericUDAFEvaluator.Mode.PARTIAL1,
inputInspectors
)
private lazy val partial1Mode = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial1ModeEvaluator

GenericUDAFEvaluator.Mode.FINAL,
Array(partialResultInspector)
)
private lazy val finalMode = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah it's also used in final mode, then maybe HiveEvaluator is a better name than PartialEvaluator

@SparkQA
Copy link

SparkQA commented Oct 19, 2018

Test build #97592 has finished for PR 22766 at commit 6e6eca4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor

retest this please

@mgaido91
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Oct 19, 2018

Test build #97603 has finished for PR 22766 at commit 6e6eca4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Oct 19, 2018
## What changes were proposed in this pull request?

Without this PR some UDAFs like `GenericUDAFPercentileApprox` can throw an exception because expecting a constant parameter (object inspector) as a particular argument.

The exception is thrown because `toPrettySQL` call in `ResolveAliases` analyzer rule transforms a `Literal` parameter to a `PrettyAttribute` which is then transformed to an `ObjectInspector` instead of a `ConstantObjectInspector`.
The exception comes from `getEvaluator` method of `GenericUDAFPercentileApprox` that actually shouldn't be called during `toPrettySQL` transformation. The reason why it is called are the non lazy fields in `HiveUDAFFunction`.

This PR makes all fields of `HiveUDAFFunction` lazy.

## How was this patch tested?

added new UT

Closes #22766 from peter-toth/SPARK-25768.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit f38594f)
Signed-off-by: Wenchen Fan <[email protected]>
asfgit pushed a commit that referenced this pull request Oct 19, 2018
## What changes were proposed in this pull request?

Without this PR some UDAFs like `GenericUDAFPercentileApprox` can throw an exception because expecting a constant parameter (object inspector) as a particular argument.

The exception is thrown because `toPrettySQL` call in `ResolveAliases` analyzer rule transforms a `Literal` parameter to a `PrettyAttribute` which is then transformed to an `ObjectInspector` instead of a `ConstantObjectInspector`.
The exception comes from `getEvaluator` method of `GenericUDAFPercentileApprox` that actually shouldn't be called during `toPrettySQL` transformation. The reason why it is called are the non lazy fields in `HiveUDAFFunction`.

This PR makes all fields of `HiveUDAFFunction` lazy.

## How was this patch tested?

added new UT

Closes #22766 from peter-toth/SPARK-25768.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit f38594f)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Copy link
Contributor

thanks, merging to master/2.4/2.3!

@asfgit asfgit closed this in f38594f Oct 19, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?

Without this PR some UDAFs like `GenericUDAFPercentileApprox` can throw an exception because expecting a constant parameter (object inspector) as a particular argument.

The exception is thrown because `toPrettySQL` call in `ResolveAliases` analyzer rule transforms a `Literal` parameter to a `PrettyAttribute` which is then transformed to an `ObjectInspector` instead of a `ConstantObjectInspector`.
The exception comes from `getEvaluator` method of `GenericUDAFPercentileApprox` that actually shouldn't be called during `toPrettySQL` transformation. The reason why it is called are the non lazy fields in `HiveUDAFFunction`.

This PR makes all fields of `HiveUDAFFunction` lazy.

## How was this patch tested?

added new UT

Closes apache#22766 from peter-toth/SPARK-25768.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants