Closed
Labels
experimental: Relates to experimental microformat parsing
Description
Describe the bug
When implying the value property for a nested microformat (e.g., h-adr inside h-entry) from the HTML textContent, multiple successive whitespace characters should be collapsed to a single space character.
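The collapsing rule described above could be sketched as follows. This is only an illustration, not the parser's actual code: the helper name collapse_whitespace is hypothetical, and both the choice of ASCII whitespace (space, tab, LF, FF, CR) and the trimming of leading and trailing spaces are assumptions.

```python
import re

# Assumption: "whitespace" means ASCII whitespace (space, tab, LF, FF, CR).
_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with one space and trim the ends."""
    return _WHITESPACE_RUN.sub(" ", text).strip(" ")


# The value currently produced for the nested h-adr in this report.
raw_value = "Berlin,\n Berlin,\n DE"
print(collapse_whitespace(raw_value))  # Berlin, Berlin, DE
```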
To Reproduce
HTML input:
<div class="h-entry">
  <span class="p-location h-adr">
    <span class="p-locality">Berlin</span>,
    <span class="p-region">Berlin</span>,
    <span class="p-country-name">DE</span>
    <data class="p-latitude" value="52.518606"></data>
    <data class="p-longitude" value="13.376127"></data>
  </span>
</div>
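To see where the stray newlines come from, the text content of the nested h-adr element can be approximated with Python's standard html.parser module. This is a rough sketch, not how any of the parsers mentioned in this issue work: TextCollector is a hypothetical helper, and the indentation inside SAMPLE is assumed, since the original whitespace of the input is not preserved here.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Accumulates character data, roughly like DOM textContent."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# The nested h-adr element from the input (indentation is an assumption).
SAMPLE = """<span class="p-location h-adr">
  <span class="p-locality">Berlin</span>,
  <span class="p-region">Berlin</span>,
  <span class="p-country-name">DE</span>
  <data class="p-latitude" value="52.518606"></data>
  <data class="p-longitude" value="13.376127"></data>
</span>"""

collector = TextCollector()
collector.feed(SAMPLE)
collector.close()

# The raw text keeps every newline and indent between the child elements.
raw_text = "".join(collector.chunks)

# Collapse ASCII whitespace runs to one space and trim the ends.
implied_value = re.sub(r"[ \t\n\f\r]+", " ", raw_text).strip(" ")
print(repr(implied_value))  # 'Berlin, Berlin, DE'
```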
Expected behavior
Correct JSON output:
{
  "items": [
    {
      "type": [
        "h-entry"
      ],
      "properties": {
        "location": [
          {
            "type": [
              "h-adr"
            ],
            "properties": {
              "locality": [
                "Berlin"
              ],
              "region": [
                "Berlin"
              ],
              "country-name": [
                "DE"
              ],
              "latitude": [
                "52.518606"
              ],
              "longitude": [
                "13.376127"
              ]
            },
            "value": "Berlin, Berlin, DE"
          }
        ]
      }
    }
  ],
  "rels": {},
  "rel-urls": {}
}
Actual JSON output:
{
  "rels": {},
  "rel-urls": {},
  "items": [
    {
      "type": [
        "h-entry"
      ],
      "properties": {
        "location": [
          {
            "type": [
              "h-adr"
            ],
            "properties": {
              "locality": [
                "Berlin"
              ],
              "region": [
                "Berlin"
              ],
              "country-name": [
                "DE"
              ],
              "latitude": [
                "52.518606"
              ],
              "longitude": [
                "13.376127"
              ]
            },
            "value": "Berlin,\n Berlin,\n DE"
          }
        ]
      }
    }
  ]
}
Note the difference: "Berlin, Berlin, DE" vs. "Berlin,\n Berlin,\n DE".
Additional context
From what I can tell, this is not actually part of the specification. It seems to be commonly accepted, though, as both the PHP parser and the Python parser do this.