HTML entities in element content confuses xpath #28
Comments
Hello, first of all, SweetXml is just a wrapper around xmerl from the Erlang standard library. I will try to investigate when I find some time this week.
The error I found when executing your command on the first XML is because your xpath
Maybe you mean:
So here are the remarks I found with your error:
(I tested it with Erlang 18.1.)
Hi there! Sorry, I should have wrapped the xpath in backticks the first time. GH applied markdown. I've corrected it so the xpath shows as intended in the OP.
The behavior I see with the first example is truncation of the
The second example, yeah, is really unhappy that it thinks it sees an entity but it is an invalid one. I'm not sure what (if anything) we can do, but I added that example as it felt non-graceful. It is also problematic if you have random characters that happen to look like that.
Hi :) Still the string modifier of SweetXml (
iex> xml |> xpath( ~x"//soapenv:Body/*[1]/*", message: ~x"name(.)", part: ~x"./text()"s)
%{message: 'loginReturn', part: "vSFFDDDzA34/SNu384NhbT93cGEEE+msH4hk<separator>LfhRIM7U9B0=+_+Blahblah"}
The behavior you observe is that if the list specifier (
iex> xml |> xpath( ~x"//soapenv:Body/*[1]/*", message: ~x"name(.)",
part: ~x"./text()"l |> transform_by(&Enum.join/1))
%{message: 'loginReturn',
part: "vSFFDDDzA34/SNu384NhbT93cGEEE+msH4hk<separator>LfhRIM7U9B0=+_+Blahblah"}
XML text node with a
Still both behaviors can be cumbersome, but as they are standard erlang
Still, I think bypassing them with the "sigil with modifiers" approach is sufficient.
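For anyone who wants to see the same idea without SweetXml, here is a minimal sketch that calls xmerl directly and joins every text node returned by the XPath query, which is roughly what the list modifier plus `Enum.join/1` does above. The module name `XmerlText`, the function `joined_text/2`, and the tiny sample document are hypothetical and only for illustration; this is not the XML from the original report, and it assumes xmerl is available on the code path.

```elixir
defmodule XmerlText do
  require Record

  # Pull the xmlText record definition from xmerl's own header file,
  # so the :value field can be read by name instead of by tuple position.
  Record.defrecord(:xmlText, Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  # Hypothetical helper: run an XPath query that selects text nodes and
  # concatenate the values of ALL matches, not just the first one.
  def joined_text(xml, path) when is_binary(xml) and is_binary(path) do
    {doc, _rest} = xml |> String.to_charlist() |> :xmerl_scan.string()

    path
    |> String.to_charlist()
    |> :xmerl_xpath.string(doc)
    |> Enum.map_join(fn node -> List.to_string(xmlText(node, :value)) end)
  end
end

# Illustrative input containing entities inside element content.
xml = "<root><part>abc&lt;separator&gt;def</part></root>"
IO.puts(XmerlText.joined_text(xml, "//part/text()"))
```

Joining all matches avoids depending on how many text nodes the parser produces for content that contains entities.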
Hi, is it relevant to keep this open?
HTML entities in the element content appear to confuse xpath. It either seems to truncate the string on certain valid entities (e.g., &lt;) or blows up entirely.
Example failures:
_the_following_data_ |> SweetXml.xpath( ~x"//soapenv:Body/*[1]/*", message: ~x"name(.)", part: ~x"./text()")
Remove the ampersands in the loginReturn bodies and the query works.