[SPARK-25238][PYTHON] lint-python: Fix W605 warnings for pycodestyle 2.4 #22400
srowen wants to merge 4 commits into apache:master
…h also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings
dev/run-tests-jenkins.py
Outdated
  failure_note_by_errcode = {
-     1: 'executing the `dev/run-tests` script',  # error to denote run-tests script failures
+     1: 'executing the dev/run-tests script',  # error to denote run-tests script failures
Backticks invoke repr or something, I think? That's not the intent here, so I removed them to quiet the warning.
Eh, I think this is a part of the Jenkins test result message. How about # noqa if it complains?
I thought it wouldn't matter as it's just a message printed in GitHub/Jenkins messages. But yeah, noqa is easy.
  class BucketedRandomProjectionLSHModel(LSHModel, JavaMLReadable, JavaMLWritable):
-     """
+     r"""
A few docstrings have backslash or backticks in them. This should make sure they don't have surprising effects some day.
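To illustrate the concern about backslashes in docstrings, here is a minimal sketch. The function names and docstring text are hypothetical and not taken from the PR. It deliberately uses `\t`, which is a valid escape and so does not trigger W605 itself, to show how a LaTeX-style `\theta` is silently altered in a normal string but preserved in a raw one:

```python
def plain_doc():
    """Rotate by the angle \theta (hypothetical docstring, normal string)."""


def raw_doc():
    r"""Rotate by the angle \theta (hypothetical docstring, raw string)."""


# In the normal string, "\t" is interpreted as a TAB, so "\theta" is corrupted.
print(repr(plain_doc.__doc__))
# In the raw string, the backslash is kept literally.
print(repr(raw_doc.__doc__))
```

Prefixing the docstring with `r` keeps the rendered documentation stable regardless of which letters happen to follow a backslash.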
  @since(2.1)
  def approx_count_distinct(col, rsd=None):
-     """Aggregate function: returns a new :class:`Column` for approximate distinct count of column `col`.
+     """Aggregate function: returns a new :class:`Column` for approximate distinct count of
Test build #95973 has finished for PR 22400 at commit
…some latex and other minor docs
Test build #95975 has finished for PR 22400 at commit
python/pyspark/sql/streaming.py
Outdated
          columnNameOfCorruptRecord=None, multiLine=None, charToEscapeQuoteEscaping=None,
          enforceSchema=None, emptyValue=None):
-     """Loads a CSV file stream and returns the result as a :class:`DataFrame`.
+     r"""Loads a CSV file stream and returns the result as a :class:`DataFrame`.
tiny nit: there are two spaces within result as a :class:Da`
retest this please
  def is_release(commit_title):
-     return re.findall("\[release\]", commit_title.lower()) or \
+     return re.findall(r"\[release\]", commit_title.lower()) or \
- Could we use parens to remove line terminating backslashes as recommended in PEP8?
- Could we get rid of the use of re in this instance with

    return ("[release]" in commit_title.lower() or
            "preparing spark release" in commit_title.lower() or
            "preparing development version" in commit_title.lower() or
            "CHANGES.txt" in commit_title)
Heh yeah not sure why it ended up as a regex, actually
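The suggestion in this thread can be sketched as a self-contained function. The substring list is copied from the review comment. The sample commit titles and the helper `has_release_tag_regex` are made up for illustration, and the regex comparison covers only the `[release]` pattern visible in the diff:

```python
import re


def is_release(commit_title):
    """Return True if the commit title looks like a release commit."""
    lowered = commit_title.lower()
    # Parentheses let the expression span lines without trailing backslashes.
    return ("[release]" in lowered or
            "preparing spark release" in lowered or
            "preparing development version" in lowered or
            "CHANGES.txt" in commit_title)


def has_release_tag_regex(commit_title):
    """Regex form from the diff, restricted to the one pattern shown there."""
    return bool(re.findall(r"\[release\]", commit_title.lower()))


# Hypothetical commit titles, for illustration only.
titles = [
    "[RELEASE] Some release commit",
    "Preparing Spark release v2.4.0-rc1",
    "[SPARK-25238][PYTHON] lint-python: Fix W605 warnings",
]
for title in titles:
    print(title, "->", is_release(title))
```

Since none of these checks needs pattern matching, plain substring tests avoid the escaping question entirely.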
Test build #95979 has finished for PR 22400 at commit
Test build #95978 has finished for PR 22400 at commit
  print("Previous release tag: %s" % PREVIOUS_RELEASE_TAG)
  print("Number of commits in this range: %s" % len(new_commits))
  print("")
  # Extract spark component(s):
  # Look for alphanumeric chars, spaces, dashes, periods, and/or commas
- pattern = re.compile(r'(\[[\w\s,-\.]+\])', re.IGNORECASE)
+ pattern = re.compile(r'(\[[\w\s,.-]+\])', re.IGNORECASE)
Two issues: \. is unnecessary, and ,-. is probably misleading as it's a range, not three characters in the class. As it happens the range from comma to period is exactly those chars in ASCII!
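The range observation can be checked with a short sketch. It isolates only the `,-\.` and `,.-` parts of the two character classes (the `\w\s` parts are left out for clarity) and compares which ASCII characters each one matches:

```python
import re

# Excerpt of the old class: ",-\." is parsed as a range from "," to ".".
old_class = re.compile(r"[,-\.]")
# Excerpt of the new class: a trailing "-" is a literal hyphen.
new_class = re.compile(r"[,.-]")

ascii_chars = [chr(code) for code in range(128)]
old_matches = {c for c in ascii_chars if old_class.fullmatch(c)}
new_matches = {c for c in ascii_chars if new_class.fullmatch(c)}

# Comma, hyphen, and period have consecutive ASCII code points.
print([ord(c) for c in ",-."])  # [44, 45, 46]
print(sorted(old_matches), sorted(new_matches))
```

Both classes match the same three characters, so the rewrite changes readability rather than behavior.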
Test build #95996 has finished for PR 22400 at commit
retest this please
Test build #95998 has finished for PR 22400 at commit
Test build #96006 has finished for PR 22400 at commit
Merged to master and branch-2.4.
(This change is a subset of the changes needed for the JIRA; see #22231)

## What changes were proposed in this pull request?

Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings. Also, fix a few long lines.

## How was this patch tested?

Existing tests, and some manual double-checking of the behavior of regexes in Python 2/3 to be sure.

Closes #22400 from srowen/SPARK-25238.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
(cherry picked from commit 08c76b5)
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
(This change is a subset of the changes needed for the JIRA; see #22231)
What changes were proposed in this pull request?
Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying on Python's non-escaping of non-reserved chars in normal strings. Also, fix a few long lines.
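As a rough illustration of the "non-escaping" behavior described here, the sketch below compiles a normal string literal containing `\[` at runtime and records the resulting warning. It assumes Python 3.6 or later, where unrecognized escapes produce a `DeprecationWarning` (a `SyntaxWarning` from 3.12); Python 2 accepts them silently, and a future Python version may reject them outright. The `eval` call is used only so that the questionable literal does not appear directly in the example file:

```python
import warnings

# Source text of a normal (non-raw) string literal containing "\[" and "\]",
# which are not recognized escape sequences.
literal_source = r'"\[release\]"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Compiling the literal is what triggers the warning.
    value = eval(literal_source)

# Python keeps the backslashes, so the value equals the raw-string form.
print(value == r"\[release\]")
for warning in caught:
    print(warning.category.__name__, warning.message)
```

Because the resulting string is identical, switching to a raw string removes the warning without changing the regex.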
How was this patch tested?
Existing tests, and some manual double-checking of the behavior of regexes in Python 2/3 to be sure.