[SPARK-25238][PYTHON] lint-python: Fix W605 warnings for pycodestyle 2.4 by srowen · Pull Request #22400 · apache/spark

srowen · 2018-09-12T03:21:45Z

(This change is a subset of the changes needed for the JIRA; see #22231)

What changes were proposed in this pull request?

Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings. Also, fix a few long lines.
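As a rough illustration of the behavior being relied on (a sketch, not code from this patch): an unrecognized escape such as `\[` in a normal string keeps its backslash, so the value matches the raw-string form, but current CPython versions emit a warning when compiling the non-raw literal. The literal sources below are wrapped in raw strings and evaluated with `eval` only so that the example does not trigger the warning itself.

```python
import warnings

# Source text of two string literals, held in raw strings so this example
# does not itself contain an invalid escape sequence.
plain_literal_src = r'"\[release\]"'
raw_literal_src = r'r"\[release\]"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Unrecognized escapes are left in the string unchanged.
    plain_value = eval(plain_literal_src)
    raw_value = eval(raw_literal_src)

# Both literals yield the same characters at runtime.
same_value = plain_value == raw_value
# Names of any warning categories raised while compiling the literals.
warning_categories = [w.category.__name__ for w in caught]
print(same_value, warning_categories)
```

The warning category has changed between Python releases, so the sketch prints whatever was recorded rather than assuming a specific one.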

How was this patch tested?

Existing tests, and some manual double-checking of the behavior of regexes in Python 2/3 to be sure.

srowen · 2018-09-12T03:21:53Z

CC @cclauss @holdenk

srowen · 2018-09-12T03:22:20Z

dev/run-tests-jenkins.py


    failure_note_by_errcode = {
-        1: 'executing the `dev/run-tests` script',  # error to denote run-tests script failures
+        1: 'executing the dev/run-tests script',  # error to denote run-tests script failures


Back-ticks invoke repr or something, I think? That's not the intent here, so I removed them to quiet the warning.

Eh, I think this is a part of the Jenkins test result message. How about # noqa if it complains?

I thought it wouldn't matter as it's just a message printed in Github/Jenkins messages. But yeah noqa is easy.

srowen · 2018-09-12T03:22:43Z

python/pyspark/ml/feature.py


 class BucketedRandomProjectionLSHModel(LSHModel, JavaMLReadable, JavaMLWritable):
-    """
+    r"""


A few docstrings have backslash or backticks in them. This should make sure they don't have surprising effects some day.
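A small sketch of the kind of surprise a raw docstring guards against. The two functions below are hypothetical and not taken from PySpark; they differ only in the `r` prefix. In the non-raw docstring, `\t` at the start of `\theta` is a valid escape and silently becomes a tab character.

```python
def plain_doc():
    """Approximates :math:`\theta` within a tolerance."""


def raw_doc():
    r"""Approximates :math:`\theta` within a tolerance."""


# The non-raw docstring now contains a tab followed by "heta".
has_tab = "\t" in plain_doc.__doc__
# The raw docstring keeps the backslash, so the LaTeX survives intact.
kept_backslash = "\\theta" in raw_doc.__doc__
print(has_tab, kept_backslash)
```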

srowen · 2018-09-12T03:22:56Z

python/pyspark/sql/functions.py

 @since(2.1)
 def approx_count_distinct(col, rsd=None):
-    """Aggregate function: returns a new :class:`Column` for approximate distinct count of column `col`.
+    """Aggregate function: returns a new :class:`Column` for approximate distinct count of


Line too long

SparkQA · 2018-09-12T03:25:31Z

Test build #95973 has finished for PR 22400 at commit fc4b49e.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-09-12T07:05:01Z

Test build #95975 has finished for PR 22400 at commit 9c9178b.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-09-12T07:17:52Z

python/pyspark/sql/streaming.py

            columnNameOfCorruptRecord=None, multiLine=None, charToEscapeQuoteEscaping=None,
            enforceSchema=None, emptyValue=None):
-        """Loads a CSV file stream and returns the result as a  :class:`DataFrame`.
+        r"""Loads a CSV file stream and returns the result as a  :class:`DataFrame`.


tiny nit: there are two spaces between "as a" and :class:`DataFrame` in this docstring.

HyukjinKwon · 2018-09-12T07:22:48Z

retest this please

HyukjinKwon

LGTM otherwise

cclauss · 2018-09-12T07:42:38Z

dev/create-release/generate-contributors.py


 def is_release(commit_title):
-    return re.findall("\[release\]", commit_title.lower()) or \
+    return re.findall(r"\[release\]", commit_title.lower()) or \


Could we use parens to remove line terminating backslashes as recommended in PEP8?

Could we get rid of the use of re in this instance with

    return ("[release]" in commit_title.lower() or
            "preparing spark release" in commit_title.lower() or
            "preparing development version" in commit_title.lower() or
            "CHANGES.txt" in commit_title)

Heh yeah not sure why it ended up as a regex, actually
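To make the suggestion concrete, here is a minimal sketch comparing the regex check for `[release]` with a plain substring test. The function names and sample titles are made up for illustration; the substring version follows the expression proposed in the review comment, using parentheses instead of line-continuation backslashes.

```python
import re


def is_release_regex(commit_title):
    # Only the "[release]" clause, written as a raw-string regex.
    return bool(re.findall(r"\[release\]", commit_title.lower()))


def is_release_substring(commit_title):
    # Plain substring tests, parenthesized so no trailing backslashes are needed.
    lowered = commit_title.lower()
    return ("[release]" in lowered or
            "preparing spark release" in lowered or
            "preparing development version" in lowered or
            "CHANGES.txt" in commit_title)


# Hypothetical commit titles, used only to exercise both functions.
sample_titles = [
    "[RELEASE] Tag a release candidate",
    "Preparing Spark release",
    "Update CHANGES.txt",
    "Fix a typo in the docs",
]
regex_hits = [is_release_regex(t) for t in sample_titles]
substring_hits = [is_release_substring(t) for t in sample_titles]
print(regex_hits, substring_hits)
```

For the bracketed marker, the regex and the `in` test agree on every input, since the pattern contains no metacharacters once the brackets are escaped.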

cclauss

Nice work!

SparkQA · 2018-09-12T11:28:23Z

Test build #95979 has finished for PR 22400 at commit 9c9178b.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-09-12T11:52:18Z

Test build #95978 has finished for PR 22400 at commit 9c9178b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2018-09-12T16:19:28Z

dev/run-tests-jenkins.py


    failure_note_by_errcode = {
-        1: 'executing the `dev/run-tests` script',  # error to denote run-tests script failures
+        1: 'executing the dev/run-tests script',  # error to denote run-tests script failures


I thought it wouldn't matter as it's just a message printed in Github/Jenkins messages. But yeah noqa is easy.

srowen · 2018-09-12T16:23:24Z

dev/create-release/generate-contributors.py


 def is_release(commit_title):
-    return re.findall("\[release\]", commit_title.lower()) or \
+    return re.findall(r"\[release\]", commit_title.lower()) or \


Heh yeah not sure why it ended up as a regex, actually

srowen · 2018-09-12T17:06:10Z

dev/create-release/generate-contributors.py

 print("Previous release tag: %s" % PREVIOUS_RELEASE_TAG)
 print("Number of commits in this range: %s" % len(new_commits))
-print
+print("")


Works in Python 2 and 3
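For context, a quick sketch of why the bare `print` needed changing. It is meant to be run under Python 3, where a bare `print` is just an expression naming the function and writes nothing, while `print("")` writes an empty line.

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    print  # Python 3: evaluates the name only, produces no output
    after_bare = buffer.getvalue()
    print("")  # writes a single newline
    after_call = buffer.getvalue()

# Show the captured output after each statement.
print(repr(after_bare), repr(after_call))
```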

srowen · 2018-09-12T17:06:58Z

dev/merge_spark_pr.py

    # Extract spark component(s):
    # Look for alphanumeric chars, spaces, dashes, periods, and/or commas
-    pattern = re.compile(r'(\[[\w\s,-\.]+\])', re.IGNORECASE)
+    pattern = re.compile(r'(\[[\w\s,.-]+\])', re.IGNORECASE)


Two issues: \. is unnecessary inside a character class, and ,-\. is probably misleading because it is a range, not three separate characters in the class. As it happens, the range from comma to period covers exactly those three chars in ASCII!
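A short check of that claim, as a sketch: the two patterns are copied from the diff above, and the sample titles (other than this PR's own title) are invented for illustration.

```python
import re

old_pattern = re.compile(r'(\[[\w\s,-\.]+\])', re.IGNORECASE)
new_pattern = re.compile(r'(\[[\w\s,.-]+\])', re.IGNORECASE)

# Characters covered by the range from "," to "." (code points 0x2C to 0x2E).
chars_in_range = [chr(c) for c in range(ord(","), ord(".") + 1)]

titles = [
    "[SPARK-25238][PYTHON] lint-python: Fix W605 warnings",
    "[SQL, ML] hypothetical title",
    "[core.util] another hypothetical title",
    "[A+B] plus sign is outside the class",
]
# Pair up what each pattern extracts from every title.
results = [(old_pattern.findall(t), new_pattern.findall(t)) for t in titles]
print(chars_in_range)
for old_found, new_found in results:
    print(old_found, new_found)
```

Putting the hyphen last in the class makes it an unambiguous literal, which is why the rewritten pattern is easier to read while matching the same inputs here.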

SparkQA · 2018-09-12T20:39:50Z

Test build #95996 has finished for PR 22400 at commit f750e08.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-09-12T21:46:31Z

retest this please

SparkQA · 2018-09-12T21:48:22Z

Test build #95998 has finished for PR 22400 at commit 3112b33.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-09-13T02:16:40Z

Test build #96006 has finished for PR 22400 at commit 3112b33.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-09-13T03:20:36Z

Merged to master and branch-2.4.

(This change is a subset of the changes needed for the JIRA; see #22231)

## What changes were proposed in this pull request?

Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings. Also, fix a few long lines.

## How was this patch tested?

Existing tests, and some manual double-checking of the behavior of regexes in Python 2/3 to be sure.

Closes #22400 from srowen/SPARK-25238.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
(cherry picked from commit 08c76b5)
Signed-off-by: hyukjinkwon <gurwls223@apache.org>

Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings

fc4b49e

Further fix directives: don't need backslash and need to indent. Fix some latex and other minor docs

9c9178b

HyukjinKwon reviewed Sep 12, 2018

HyukjinKwon approved these changes Sep 12, 2018

cclauss reviewed Sep 12, 2018

cclauss approved these changes Sep 12, 2018

cclauss mentioned this pull request Sep 12, 2018

[SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle to v2.4.0 #22231 (Closed)

srowen added 2 commits September 12, 2018 11:28

Review updates

f750e08

A few more fixes

3112b33

asfgit closed this in 08c76b5 Sep 13, 2018

srowen deleted the SPARK-25238.2 branch September 20, 2018 10:52
