EmojiFilter doesn't work on strings that don't contain HTML #133
Comments
I wonder why
irb(main):018:0> Nokogiri::HTML('hi').search("text()")
=> [#<Nokogiri::XML::Text:0x3fd03a089798 "hi">]
Your implementation works, but I'd be worried about the performance of
doc.children.each do |node|
  next unless node.text?
  # snip...
end
Thanks for digging in on this bug. Could you open a PR with a test and we can continue the discussion from there?
@jch I will dig deeper. When this wasn't working, I created a test pipeline with only EmojiFilter in it, so I know it wasn't any of the custom filters I built, but it's quite possible I did do something wrong. If it's not an ID10T error on my part, I'll certainly work up a PR!
@wideopenspaces any luck?
Work got in the way the last two weeks. I'll see if I can set aside a few hours this week to tackle this. Thanks for reminding me!
@wickedshimmy I'm hitting the same issue. Have you had any success?
@jch Your approach works. I could provide a PR if that would be acceptable.
@Razer6 👍 a PR would be awesome. I'd be happy to review and test it for compatibility.
Fixed by #146
I ran into this problem too. I think some versions of libxml2 don't return top-level text nodes inside a DocumentFragment when using HTML::Pipeline normally avoids this by wrapping everything inside a @wideopenspaces Were you using PlainTextInputFilter? If you weren't, then you're probably opening yourself to XSS attacks or at least bad parsing/rendering (e.g., if your input string happens to contain HTML).
So I guess what I'm saying is that #146 seems unnecessary. It seems like the correct fix is "Use PlainTextInputFilter". (Unless you were using it, in which case you and I weren't seeing the same bug.)
Eeeenteresting. I had forgotten about @aroben is there a downside to inlining
Well if you are in fact starting with HTML, not plaintext, you should not use PlainTextInputFilter.
Not just overhead. All your HTML would get escaped and thus rendered as plain text. I.e. you'd see HTML tags in your output.
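To make the escaping point concrete, here is a minimal sketch using only Ruby's standard library. The helper name `wrap_plain_text` and the choice of a `<div>` as the wrapper element are assumptions made for illustration; this is not a copy of PlainTextInputFilter.

```ruby
require "cgi"

# Hypothetical helper, not part of html-pipeline: HTML-escape the input
# and wrap it in a single root element. The <div> wrapper is an assumption.
def wrap_plain_text(text)
  "<div>#{CGI.escapeHTML(text)}</div>"
end

plain  = "ship it :shipit:"
markup = "<b>bold</b> :smile:"

# Plain text passes through unchanged apart from the wrapper.
puts wrap_plain_text(plain)

# Existing markup is escaped, so the tags would be displayed literally
# instead of being rendered as HTML.
puts wrap_plain_text(markup)
```

With plain text the emoji code ends up inside an element, so it is no longer a top-level text node. With input that is already HTML, the tags come out as `&lt;b&gt;`, which is the downside described above.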
Ooo ya. Good point.
@aroben No, we are sanitizing separately. I will look into whether or not PlainTextInputFilter will work for us.
@wideopenspaces If you already have HTML-escaped text on your hands then you could manually wrap it in a
Yep, that may be the best solution. Thanks for chiming in! And @Razer6 thanks for picking up my slack!
I also proposed this solution in #144. I didn't see that there is a dedicated filter doing that. Thanks for pointing it out 👍
This is right.
When I pass this string...
...to a pipeline containing EmojiFilter, it does not replace the emoji-cheat-sheet code with the Emoji as expected.
I tracked the problem down to here:
What does happen is that the DocumentFragment in doc contains one child Nokogiri::XML::Text node, and doc.text contains the same text that html contains. So, armed with that knowledge, I made the following changes:
... and that fixed it for me.
Anyone see any problems with that fix? If not, I'll work up a PR as soon as I can.