gh-50002: xml.dom.minidom now preserves whitespaces in attributes #107947
Changes from 2 commits
dd6c84c
6dd15a4
b926f36
dc72044
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -300,12 +300,28 @@ def _in_document(node): | |
| node = node.parentNode | ||
| return False | ||
|
|
||
| def _write_data(writer, data): | ||
| def _write_data(writer, text, attr): | ||
|
scoder marked this conversation as resolved.
|
||
| "Writes datachars to writer." | ||
| if data: | ||
| data = data.replace("&", "&amp;").replace("<", "&lt;"). \ | ||
| replace("\"", "&quot;").replace(">", "&gt;") | ||
| writer.write(data) | ||
| if not text: | ||
| return | ||
| # See the comments in ElementTree.py for behavior and | ||
| # implementation details. | ||
| if "&" in text: | ||
| text = text.replace("&", "&amp;") | ||
| if "<" in text: | ||
| text = text.replace("<", "&lt;") | ||
| if ">" in text: | ||
| text = text.replace(">", "&gt;") | ||
| if attr: | ||
| if '"' in text: | ||
| text = text.replace('"', "&quot;") | ||
| if "\r" in text: | ||
| text = text.replace("\r", "&#13;") | ||
| if "\n" in text: | ||
| text = text.replace("\n", "&#10;") | ||
| if "\t" in text: | ||
| text = text.replace("\t", "&#9;") | ||
|
Member
Author
I wonder, why
Contributor
No technical reason. Probably based on Python's hex character spelling or due to consistency with the other two-digit codes above. The XML character spec does not need (or mention) leading zeros. I'm happy to keep the leading zero. If you need compact data, use compression. That's way more effective than stripping some zeros from rare tab characters.
Member
Author
Contributor
Which makes me wonder why we need a new implementation here, rather than importing the existing one.
Contributor
I now looked up the implementation in
Member
Author
It is sad, but there is a copy of the escaping function in almost every module that outputs XML or HTML. On the one hand, it is a trivial function, and we want to avoid unneeded dependencies. On the other hand, an efficient and complete implementation is not so trivial. But it was worse in the past. Now much code just uses
serhiy-storchaka marked this conversation as resolved.
Outdated
|
||
| writer.write(text) | ||
|
|
||
| def _get_elements_by_tagName_helper(parent, name, rc): | ||
| for node in parent.childNodes: | ||
|
|
@@ -883,7 +899,7 @@ def writexml(self, writer, indent="", addindent="", newl=""): | |
|
|
||
| for a_name in attrs.keys(): | ||
| writer.write(" %s=\"" % a_name) | ||
| _write_data(writer, attrs[a_name].value) | ||
| _write_data(writer, attrs[a_name].value, True) | ||
| writer.write("\"") | ||
| if self.childNodes: | ||
| writer.write(">") | ||
|
|
@@ -1112,7 +1128,7 @@ def splitText(self, offset): | |
| return newText | ||
|
|
||
| def writexml(self, writer, indent="", addindent="", newl=""): | ||
| _write_data(writer, "%s%s%s" % (indent, self.data, newl)) | ||
| _write_data(writer, "%s%s%s" % (indent, self.data, newl), False) | ||
|
|
||
| # DOM Level 3 (WD 9 April 2002) | ||
|
|
||
|
|
||
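To make the behavior change in the `minidom.py` hunks easier to follow, here is a minimal standalone sketch of the escaping rules the patched `_write_data` applies. The names `write_escaped` and `escaped` are hypothetical helpers for illustration, not part of the PR. The tab is written here as `&#9;`; the review thread discusses whether to use a leading zero, and an XML parser treats both spellings the same.

```python
import io


def write_escaped(writer, text, attr):
    """Hypothetical stand-in for the patched _write_data shown in the diff."""
    if not text:
        return
    # These three characters are escaped in text nodes and attributes alike.
    # "&" must be handled first so later references are not double-escaped.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if attr:
        # Only attribute values need the quote and the whitespace references.
        text = (text.replace('"', "&quot;")
                    .replace("\r", "&#13;")
                    .replace("\n", "&#10;")
                    .replace("\t", "&#9;"))
    writer.write(text)


def escaped(text, attr):
    """Convenience wrapper (also hypothetical) that returns the result."""
    buf = io.StringIO()
    write_escaped(buf, text, attr)
    return buf.getvalue()


print(escaped('a "b"\n<c>', attr=False))
print(escaped('a "b"\n<c>', attr=True))
```

With `attr=False` the double quote and the newline pass through unchanged, which matches the second NEWS entry in this PR; with `attr=True` both are replaced by references.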
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| :mod:`xml.dom.minidom` now preserves whitespaces in attributes. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| :mod:`xml.dom.minidom` now only quotes ``"`` in attributes. |
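The reason whitespace in attributes has to be written as character references can be seen from the parser side. The sketch below assumes only the standard attribute-value normalization that XML parsers perform; `attr_after_parse` is a hypothetical helper, not code from the PR.

```python
from xml.dom import minidom


def attr_after_parse(raw_attr):
    """Parse a tiny document and return the value of attribute 'a'.

    raw_attr is inserted verbatim, so it must already be valid
    attribute content (hypothetical helper for illustration).
    """
    doc = minidom.parseString('<doc a="%s"/>' % raw_attr)
    return doc.documentElement.getAttribute("a")


# Literal newline and tab inside the attribute value.
literal = attr_after_parse("x\n\ty")
# The same characters written as numeric character references.
referenced = attr_after_parse("x&#10;&#9;y")

print(repr(literal))
print(repr(referenced))
```

The literal newline and tab should come back as two spaces, while the referenced form should come back as the original characters. An attribute value serialized without the references therefore does not survive a write and re-parse cycle, which is the problem this PR addresses.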