[SPARK-23691][PYTHON][BRANCH-2.3] Use sql_conf util in PySpark tests where possible#20863

Closed
HyukjinKwon wants to merge 2 commits into apache:branch-2.3 from HyukjinKwon:backport-20830

Conversation

@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR backports #20830 to reduce the diff against master and restore the default value back in PySpark tests.

d6632d1 added a useful util. This backport extracts and brings this util:

@contextmanager
def sql_conf(self, pairs):
    ...

to allow configuration set/unset within a block:

with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    # test code

This PR proposes to use this util where possible in PySpark tests.

Note that there already appear to be a few places where tests change configurations without restoring the original values in their unittest classes.

How was this patch tested?

Likewise, manually tested via:

./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3

apache@d6632d1 added a useful util

```python
@contextmanager
def sql_conf(self, pairs):
    ...
```

to allow configuration set/unset within a block:

```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    # test code
```

This PR proposes to use this util where possible in PySpark tests.

Note that there already appear to be a few places where tests change configurations without restoring the original values in their unittest classes.

Manually tested via:

```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#20830 from HyukjinKwon/cleanup-sql-conf.
"\n\nResult:\n%s\n%s" % (result, result.dtypes))
self.assertTrue(expected.equals(result), msg=msg)

@contextmanager
Member Author


This was extracted on its own from d6632d1.

df1 = self.spark.range(1).toDF("a")
df2 = self.spark.range(1).toDF("b")

try:
Member Author


The other diffs are basically the same.
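The shape of those diffs is a manual save/set/restore being collapsed into the with-block. As a standalone sketch of the pattern being replaced (a plain dict stands in for `spark.conf`, and the conf key and values are made up for illustration):

```python
# Stand-in key/value store for spark.conf, so the sketch runs on its own.
conf = {"spark.sql.shuffle.partitions": "200"}

# Before the util: each test saved and restored the value by hand; a
# missed finally (or an early return) would leak the setting into the
# tests that run afterwards.
old = conf.get("spark.sql.shuffle.partitions")
try:
    conf["spark.sql.shuffle.partitions"] = "5"
    assert conf["spark.sql.shuffle.partitions"] == "5"  # test body
finally:
    if old is None:
        conf.pop("spark.sql.shuffle.partitions", None)
    else:
        conf["spark.sql.shuffle.partitions"] = old

# The original value is back once the try/finally completes.
assert conf["spark.sql.shuffle.partitions"] == "200"
```

With the `sql_conf` util, this whole `try`/`finally` shrinks to a single `with self.sql_conf({...}):` line per test.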

@HyukjinKwon
Member Author

cc @ueshin and @BryanCutler

Member

@ueshin ueshin left a comment


LGTM pending Jenkins.

@SparkQA

SparkQA commented Mar 20, 2018

Test build #88404 has finished for PR 20863 at commit 3056e3c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Merged to branch-2.3.

Thank you for reviewing this @ueshin.

asfgit pushed a commit that referenced this pull request Mar 20, 2018
…where possible

## What changes were proposed in this pull request?

This PR backports #20830 to reduce the diff against master and restore the default value back in PySpark tests.

d6632d1 added a useful util. This backport extracts and brings this util:

```python
@contextmanager
def sql_conf(self, pairs):
    ...
```

to allow configuration set/unset within a block:

```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    # test code
```

This PR proposes to use this util where possible in PySpark tests.

Note that there already appear to be a few places where tests change configurations without restoring the original values in their unittest classes.

## How was this patch tested?

Likewise, manually tested via:

```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #20863 from HyukjinKwon/backport-20830.
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…where possible

@HyukjinKwon HyukjinKwon deleted the backport-20830 branch October 16, 2018 12:45

3 participants