[SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible #20830
HyukjinKwon wants to merge 2 commits into apache:master
Conversation
|
@ueshin and @BryanCutler, could you take a look when you are available? |
|
BTW, I double checked it produces the stack trace fine by manually changing some tests locally. |
|
Test build #88254 has finished for PR 20830 at commit
|
ueshin
left a comment
LGTM except for one comment, but I'd leave it to @HyukjinKwon whether we omit the default value block or not.
python/pyspark/sql/tests.py
Outdated
| self.spark.conf.set("spark.sql.execution.arrow.enabled", "true") | ||
| pdf_arrow = df.toPandas() | ||
|
|
||
| with self.sql_conf({"spark.sql.execution.arrow.enabled": True}): |
We can omit this when we use the default value or set the value in setup method, but I'm okay if we want to show the value explicitly.
Ah, OK. I am fine. will omit this.
BryanCutler
left a comment
Thanks @HyukjinKwon , LGTM!
| self.assertRaises(AnalysisException, lambda: df1.join(df2, how="inner").collect()) | ||
|
|
||
| self.spark.conf.set("spark.sql.crossJoin.enabled", "true") | ||
| with self.sql_conf({"spark.sql.crossJoin.enabled": True}): |
So the sql_conf context will change this back to be unset right?
Yup, it originally unset spark.sql.crossJoin.enabled, but now it sets it back to the original value.
If spark.sql.crossJoin.enabled was unset before this test, it will be unset again afterwards.
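To make the restore behaviour discussed here concrete, below is a minimal sketch of how such a context manager could work. It is not the actual code from d6632d1: `FakeConf` is a hypothetical in-memory stand-in for `spark.conf`, and this `sql_conf` takes the conf object as an explicit argument so the example runs without a Spark session.

```python
from contextlib import contextmanager


class FakeConf:
    """Hypothetical in-memory stand-in for spark.conf (not a PySpark class)."""

    def __init__(self):
        self._values = {}

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value

    def unset(self, key):
        self._values.pop(key, None)


@contextmanager
def sql_conf(conf, pairs):
    """Set every key in `pairs` for the duration of the block, then restore."""
    keys = list(pairs.keys())
    # Remember the prior values; None marks a key that was not set at all.
    old_values = [conf.get(key, None) for key in keys]
    for key, value in pairs.items():
        conf.set(key, value)
    try:
        yield
    finally:
        # Runs even if the block raises, so a failing test cannot leak state.
        for key, old_value in zip(keys, old_values):
            if old_value is None:
                conf.unset(key)  # was unset before, so unset it again
            else:
                conf.set(key, old_value)  # put the original value back


conf = FakeConf()
with sql_conf(conf, {"spark.sql.crossJoin.enabled": "true"}):
    inside = conf.get("spark.sql.crossJoin.enabled")
after = conf.get("spark.sql.crossJoin.enabled")
```

In this sketch a key that had no value before the block ends up unset again, while a key that already had a value gets that value back, which matches the behaviour described in the reply above.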
|
Test build #88295 has finished for PR 20830 at commit
|
|
retest this please |
|
Test build #88349 has finished for PR 20830 at commit
|
|
retest this please |
|
Test build #88350 has finished for PR 20830 at commit
|
|
@BryanCutler, could I have the very first PR merged by you as a new fresh committer :-)? I personally think it might be good to merge to branch-2.3 if it doesn't have conflicts. If it has, I think we are fine to get this into master only for now. |
|
Sure thing! I probably won't be able to until later tonight, but I'll give
it a shot as soon as I can, hopefully it will work :-D
On Mar 18, 2018 10:24 PM, "Hyukjin Kwon" <notifications@github.com> wrote:
@BryanCutler <https://github.com/bryancutler>, could I have the very first
PR merged by you as a new fresh committer :-)?
I personally think it might be good to merge to branch-2.3 if it doesn't
have conflicts. If it has, I think we are fine to get this into master only
for now.
|
|
Merged to master! (I think it went ok..) Thanks @HyukjinKwon !! |
|
Thanks for reviewing and merging this @ueshin, @felixcheung, @BryanCutler and @dongjoon-hyun. (Just FYI, I usually manually resolve JIRAs when I accidentally failed to take an action with the merge script. I think that's fine.) |
|
The cherry-pick to branch-2.3 did have some conflicts. Just to check the reason for backporting: even though this isn't a bug fix, it's pretty safe and will help keep the branches in line, so there are fewer conflicts for future backports? |
|
Yup, that was exactly what I thought. I think it's also fine not to bother backporting, since it has conflicts. |
|
But I am willing to do this if you think it's better to do this. No objection. |
|
Hmm, it looks like the conflict is just one block with group agg tests, probably not a big deal - you want to take a look? |
|
Sure, will open a PR soon. |
apache@d6632d1 added a useful util

```python
@contextmanager
def sql_conf(self, pairs):
    ...
```

to allow configuration set/unset within a block:

```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    ...  # test codes
```

This PR proposes to use this util where possible in PySpark tests.

Note that there already appear to be a few places that affect tests without restoring the original value in the unittest classes.

Manually tested via:

```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#20830 from HyukjinKwon/cleanup-sql-conf.
|
Ahhhh .. d6632d1 added the util into master only ... |
|
Let me open a PR and cc you guys to show the diff. |
…where possible

## What changes were proposed in this pull request?

This PR backports #20830 to reduce the diff against master and to restore the default values in PySpark tests.

d6632d1 added a useful util. This backport extracts and brings this util:

```python
@contextmanager
def sql_conf(self, pairs):
    ...
```

to allow configuration set/unset within a block:

```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    ...  # test codes
```

This PR proposes to use this util where possible in PySpark tests.

Note that there already appear to be a few places that affect tests without restoring the original value in the unittest classes.

## How was this patch tested?

Likewise, manually tested via:

```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #20863 from HyukjinKwon/backport-20830.
What changes were proposed in this pull request?
d6632d1 added a useful util, the sql_conf context manager,
to allow configuration set/unset within a block.
This PR proposes to use this util where possible in PySpark tests.
Note that there already appear to be a few places that affect tests without restoring the original value in the unittest classes.
How was this patch tested?
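Manually tested via:
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3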