Module shlex removing newline after comment
- Affected Components: shlex
- Operating System: Linux
- Python Versions: 2.6.x, 2.7.x, 3.1.x, 3.2.x
- Reproducible: Yes
from shlex import shlex
print("#################")
lexera = shlex("a \n b")
A = ",".join(lexera)
texta = repr('CASE1 ==> shlex("a \n b")')
print('%s' % (texta))
print(repr(A))
if A == 'a,b':
    print("CASE1 --> PASS")
else:
    print("CASE1 --> FAIL")
print("#################")
lexerb = shlex("a # comment \n b")
B = ",".join(lexerb)
textb = repr('CASE2 ==> shlex("a # comment \n b")')
print('%s' % (textb))
print(repr(B))
if B == 'a,\n,b':
    print("CASE2 --> PASS")
else:
    print("CASE2 --> FAIL")
print("#################")
lexerc = shlex("a \n b")
lexerc.whitespace = " "  # only the space is whitespace, so "\n" becomes a token
C = ",".join(lexerc)
textc = repr('CASE3 ==> shlex("a \n b") + whitespace')
print('%s' % (textc))
print(repr(C))
if C == 'a,\n,b':
    print("CASE3 --> PASS")
else:
    print("CASE3 --> FAIL")
print("#################")
lexerd = shlex("a # comment \n b")
lexerd.whitespace = " "  # same setting as CASE3, but with a comment before "\n"
D = ",".join(lexerd)
textd = repr('CASE4 ==> shlex("a # comment \n b") + whitespace')
print('%s' % (textd))
print(repr(D))
if D == 'a,\n,b':
    print("CASE4 --> PASS")
else:
    print("CASE4 --> FAIL")
To reproduce the problem, copy the source code into a file and run the script with the following command:
$ python -OOBRtt test.py
Alternatively, start Python in interactive mode:
$ python -OOBRtt
Then paste the lines of code into the interpreter.
The test results are the same in every Python version tested.
Test | Test String | Expected | Obtained | Status |
---|---|---|---|---|
1 | "a \n b" | 'a,b' | 'a,b' | PASS |
2 | "a # comment \n b" | 'a,\n,b' | 'a,b' | FAIL |
3 | "a \n b" with whitespace=" " | 'a,\n,b' | 'a,\n,b' | PASS |
4 | "a # comment \n b" with whitespace=" " | 'a,\n,b' | 'a,b' | FAIL |
The shlex module is meant to tokenize like the shell, but the Python implementation does not appear to follow the POSIX 2008 standard.
Under POSIX 2008, the newline character that ends a comment is not considered part of the comment, so it should still be tokenized.
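The following sketch may help isolate the cause. It tokenizes the CASE4 string twice, once with the default comment character and once with comment handling disabled through the `commenters` attribute. The helper name `tokens_for` is hypothetical and exists only for this illustration. With comments disabled the newline comes back as a token, which suggests that the comment handling itself consumes the newline together with the comment text.

```python
from shlex import shlex

def tokens_for(text, commenters="#"):
    """Tokenize text with the space as the only whitespace character."""
    lexer = shlex(text)
    lexer.whitespace = " "         # keep "\n" out of the whitespace set
    lexer.commenters = commenters  # "" disables comment handling entirely
    return list(lexer)

with_comments = tokens_for("a # comment \n b")
without_comments = tokens_for("a # comment \n b", commenters="")

print(with_comments)     # ['a', 'b']
print(without_comments)  # ['a', '#', 'comment', '\n', 'b']
```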
We are not aware of any easy solution other than avoiding shlex for cases like the ones examined here.
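For simple inputs, one possible workaround is to split the text into lines first and emit the newline token manually. This is only a sketch under stated assumptions, not a fix for shlex: the function name `tokenize_keep_newlines` is hypothetical, and the approach assumes that no quoted string spans more than one line.

```python
from shlex import shlex

def tokenize_keep_newlines(text):
    """Tokenize each line separately and emit "\n" between lines.

    Assumption: quoted strings never contain a newline. A quote that
    spans lines would be cut in two and shlex would raise ValueError.
    """
    tokens = []
    lines = text.split("\n")
    for index, line in enumerate(lines):
        # shlex still strips the comment, but only within this one line.
        tokens.extend(shlex(line))
        if index < len(lines) - 1:
            tokens.append("\n")  # restore the newline between lines
    return tokens

print(repr(",".join(tokenize_keep_newlines("a # comment \n b"))))  # 'a,\n,b'
```

Because the newline is added outside shlex, this yields the same 'a,\n,b' result whether or not the line contains a comment.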
- [POSIX 2008 - Shell Command Language][01]
- [POSIX 2008 - Shell Command Language - Section 2.10][02]
- [POSIX 2008 - Shell Command Language - Section 2.3][03]
- [POSIX 2008 - Shell Command Language rationale][07]
- [Python bug 7089][04]
- [Python bug 7611][05]
- [Python bug 1521950][06]

[01]: http://pubs.opengroup.org/onlinepubs/9699919799/idx/shell.html
[02]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10
[03]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03
[04]: http://bugs.python.org/issue7089
[05]: http://bugs.python.org/issue7611
[06]: http://bugs.python.org/issue1521950
[07]: http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html
Main site: pythonsecurity.org
OWASP Page: owasp.org/index.php/OWASP_Python_Security_Project