Nokogiri::XML::XPath::SyntaxError: Invalid expression: .//child::text() | self::child::text() #1233

Closed
JuanitoFatas opened this issue Jan 25, 2015 · 7 comments

Comments

@JuanitoFatas
Contributor

Hello,

I am running a search on a document fragment:

# https://github.com/jch/html-pipeline/blob/master/lib/html/pipeline/emoji_filter.rb#L18
doc.search('text()').each do |node|
  ...
end

When I upgrade nokogiri to 1.6.6.1 or later, this is one of the error messages I get:

Nokogiri::XML::XPath::SyntaxError: Invalid expression: .//child::text() | self::child::text()
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/searchable.rb:165:in `evaluate'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/searchable.rb:165:in `block in xpath'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/searchable.rb:156:in `map'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/searchable.rb:156:in `xpath'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/searchable.rb:193:in `css_internal'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:76:in `block in css'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:187:in `block in each'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:186:in `upto'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:186:in `each'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:75:in `inject'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/node_set.rb:75:in `css'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/document_fragment.rb:108:in `block in search'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/document_fragment.rb:104:in `each'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/document_fragment.rb:104:in `inject'
    /home/travis/.rvm/gems/ruby-2.2.0/gems/nokogiri-1.6.6.2/lib/nokogiri/xml/document_fragment.rb:104:in `search'
    /home/travis/build/jch/html-pipeline/lib/html/pipeline/emoji_filter.rb:22:in `call'

I have checked that this works on 1.6.5 but fails on 1.6.6.1 and 1.6.6.2.

The relevant code I found is:

# https://github.com/sparklemotion/nokogiri/blob/master/lib/nokogiri/xml/searchable.rb#L165
ctx.evaluate(path, handler)

The path that causes the syntax error is: ".//child::text() | self::child::text()".

In 1.6.5 the path is ".//child::text()".

Should 1.6.6.x maintain this backward compatibility?

Ref. gjtorikian/html-pipeline#170

Thanks!!!

@flavorjones
Member

Looking, this is definitely unintentional behavior.

@flavorjones
Member

An immediate workaround is to avoid #search and use real XPath, in this case:

doc.xpath(".//text()")

@flavorjones
Member

OK, this bug is deep, and is a result of trying to fix xpath querying on DocumentFragments, so I'm not going to be able to fix it this morning.

Please, as a workaround, use #xpath with a real XPath expression (as above) to avoid having to convert CSS into a hacky XPath expression to work around unrelated DocumentFragment issues.

@JuanitoFatas
Contributor Author

> OK, this bug is deep, and is a result of trying to fix xpath querying on DocumentFragments, so I'm not going to be able to fix it this morning.

Thank you for looking into it! I'll wait for your fix.

> Please, as a workaround, use #xpath with a real XPath expression (as above) to avoid having to convert CSS into a hacky XPath expression to work around unrelated DocumentFragment issues.

👌 Will use the workaround for now.

Thanks for your time! 😄

simeonwillbanks pushed a commit to gjtorikian/html-pipeline that referenced this issue Feb 9, 2015
- As identified in #170,
  nokogiri 1.6.x is buggy
  - Team nokogiri is working on the fix
    sparklemotion/nokogiri#1233
- Until the bug is fixed, define a range of working nokogiri gems
JuanitoFatas added a commit to jollygoodcode/twemoji that referenced this issue Feb 11, 2015
nokogiri 1.6.x is buggy => sparklemotion/nokogiri#1233

Until the bug is fixed, define a range of working nokogiri gems.
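
The version range these commits chose is not shown in this thread. As a rough sketch of how such a range behaves, the following uses RubyGems' own requirement classes with a hypothetical constraint of "<= 1.6.5", picked only because the report above says 1.6.5 works and 1.6.6.1 and 1.6.6.2 fail. The constant and helper names are illustrative.

```ruby
require "rubygems"

# Hypothetical constraint, not the one used by the referenced commits.
WORKING_RANGE = Gem::Requirement.new("<= 1.6.5")

# Returns true when the given version string satisfies the constraint.
def allowed?(version, requirement = WORKING_RANGE)
  requirement.satisfied_by?(Gem::Version.new(version))
end

%w[1.6.5 1.6.6.1 1.6.6.2].each do |version|
  puts "#{version}: #{allowed?(version) ? 'allowed' : 'excluded'}"
end
```

The same constraint string could be given to a `gem "nokogiri", ...` line in a Gemfile or to `add_dependency` in a gemspec.
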
@kbrock

kbrock commented May 24, 2015

Thanks @JuanitoFatas for posting. (I was just looking at some of your html-inline code)
Thanks @flavorjones for the workaround.

Looks like most people are pegging to a specific nokogiri version :(

@JuanitoFatas
Contributor Author

> Thanks @JuanitoFatas for posting. (I was just looking at some of your html-inline code)

Thanks for your kind words Keenan.

> Looks like most people are pegging to a specific nokogiri version :(

I am trying to use the workaround @flavorjones provided to loosen the nokogiri dependency for html-pipeline. But I have a question.

Does .//text() behave the same across different versions of Nokogiri? You could also reply here. Thanks!

@flavorjones
Member

Closing this, since it's a symptom of the deeper issue described at #572
