-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally #28795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Also cc @bersprockets I remember you took a look for this one as well. |
|
Also cc @dongjoon-hyun FYI |
|
Thank you for pinging me, @HyukjinKwon . |
| df.createOrReplaceTempView("df") | ||
| row = self.spark.sql( | ||
| "SELECT name, javaUDAF(id) as avg from df group by name order by name desc").first() | ||
| self.assertEqual(row.asDict(), Row(name='b', avg=102.0).asDict()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, we don't compare [Row(name=u'b', avg=102.0), Row(name=u'a', avg=102.0)]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could compare them as are. It's just to prevent an issue such as SPARK-29748 or similar issues in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it~ No problem~
|
Test build #123811 has finished for PR 28795 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you for making this, @HyukjinKwon .
I also verified manually. Merged to master/3.0.
…egistration to test conditionally
### What changes were proposed in this pull request?
This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to the proper unittests that run conditionally when the test classes are present.
Both tests are dependent on the test classes in JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`. So if you run the tests against the plain `sbt package`, it fails as below:
```
**********************************************************************
File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction
Failed example:
spark.udf.registerJavaFunction(
"javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
Exception raised:
Traceback (most recent call last):
...
test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath;
...
6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction
2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF
***Test Failed*** 8 failures.
```
### Why are the changes needed?
In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html
### Does this PR introduce _any_ user-facing change?
No, it's test-only.
### How was this patch tested?
Manually tested as below:
```bash
./build/sbt -DskipTests -Phive-thriftserver clean package
cd python
./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
```
```bash
./build/sbt -DskipTests -Phive-thriftserver clean test:package
cd python
./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
```
Closes #28795 from HyukjinKwon/SPARK-31965.
Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 56264fb)
Signed-off-by: Dongjoon Hyun <[email protected]>
|
Thanks! |
BryanCutler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, I always hit this problem ;) Thanks @HyukjinKwon !
…egistration to test conditionally
### What changes were proposed in this pull request?
This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to the proper unittests that run conditionally when the test classes are present.
Both tests are dependent on the test classes in JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`. So if you run the tests against the plain `sbt package`, it fails as below:
```
**********************************************************************
File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction
Failed example:
spark.udf.registerJavaFunction(
"javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
Exception raised:
Traceback (most recent call last):
...
test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath;
...
6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction
2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF
***Test Failed*** 8 failures.
```
### Why are the changes needed?
In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html
### Does this PR introduce _any_ user-facing change?
No, it's test-only.
### How was this patch tested?
Manually tested as below:
```bash
./build/sbt -DskipTests -Phive-thriftserver clean package
cd python
./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
```
```bash
./build/sbt -DskipTests -Phive-thriftserver clean test:package
cd python
./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
```
Closes apache#28795 from HyukjinKwon/SPARK-31965.
Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 56264fb)
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR proposes to move the doctests in
registerJavaUDAFandregisterJavaFunctionto the proper unittests that run conditionally when the test classes are present.Both tests are dependent on the test classes in JVM side,
test.org.apache.spark.sql.JavaStringLengthandtest.org.apache.spark.sql.MyDoubleAvg. So if you run the tests against the plainsbt package, it fails as below:Why are the changes needed?
In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html
Does this PR introduce any user-facing change?
No, it's test-only.
How was this patch tested?
Manually tested as below: