*:first-of-type and friends are not implemented yet #4
Comments
Actually, the current implementation is broken. The selector |
Hi Simon, Do you know what the status of this is? Would these be easy to implement? I wanted to use |
I don’t know how easy it is. What’s needed is to find what the correct XPath translation is, if there is one in the general case. |
I see. I found this Wikibook last night, I'm not sure how reliable it is though. |
Example: <div>
<p id="a"/><p id="b"/><p id="c"/><p id="d"/><p id="e"/>
</div> In Selectors, I’m not convinced there even is a correct XPath translation of some Selectors. This kind of thing has led me to believe that the entire premise of translating Selectors to XPath (or at least to XPath 1.0, which is what libxml implements) is flawed. I’ve started work on cssselect2, which implements Selectors “for real” without XPath being involved, but it’s blocked on some design decisions that need to be made: Kozea/cssselect2#1 |
Thanks for the comprehensive answer @SimonSapin ! It's truly a shame if XPath 1.0 isn't flexible enough. I'm keen to see how |
For the record, here's a tentative implementation using an XPath extension function with lxml: scrapy/parsel#73 |
cssselect contributors, do you have any advice on this? There seem to be two solutions, cssselect2 and scrapy/parsel. Is either of them mature enough? Do I still need the cssselect package? |
'Type' refers to the element name [1], so e.g. `p.class:first-of-type` should select every first `p` child that also has the class `class`, not every first `p.class` child. Support in combination with the universal selector (`*:first-of-type` or `:first-of-type` instead of e.g. `p:first-of-type`) is not implemented. A proper translation to XPath 1.0 might not be possible; it is not implemented in the Python cssselect library either [2]. [1] https://www.w3.org/TR/selectors-3/#nth-of-type-pseudo [2] scrapy/cssselect#4
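To make the distinction above concrete, here is a minimal pure-Python sketch of the `:first-of-type` semantics as I read them from Selectors Level 3. It is not how cssselect, cssselect2 or parsel implement anything. It uses only the standard library's `xml.etree.ElementTree` for parsing, and the helpers `first_of_type`, `has_class` and `first_matching_child` are hypothetical names made up for this illustration. The root element is ignored for simplicity.

```python
import xml.etree.ElementTree as ET


def first_of_type(root):
    """Yield each element that is the first child with its tag name
    inside its parent (illustrative :first-of-type semantics)."""
    for parent in root.iter():
        seen = set()  # tag names already encountered under this parent
        for child in parent:
            if child.tag not in seen:
                seen.add(child.tag)
                yield child


def has_class(elem, name):
    """True if `name` is one of the whitespace-separated class tokens."""
    return name in elem.get("class", "").split()


def first_matching_child(root, tag, cls):
    """The mistaken reading: first child per parent matching tag AND class."""
    for parent in root.iter():
        for child in parent:
            if child.tag == tag and has_class(child, cls):
                yield child
                break


doc = ET.fromstring(
    '<div>'
    '<p id="a"/>'
    '<p id="b" class="x"/>'
    '<span id="c" class="x"/>'
    '</div>'
)

# *:first-of-type -> the first child of each element name
universal = [e.get("id") for e in first_of_type(doc)]

# p.x:first-of-type -> must be the first p child AND carry class "x"
p_x_first = [
    e.get("id") for e in first_of_type(doc)
    if e.tag == "p" and has_class(e, "x")
]

# "every first p.x child" -> not what the selector means
mistaken = [e.get("id") for e in first_matching_child(doc, "p", "x")]

print(universal, p_x_first, mistaken)
```

In this document `p.x:first-of-type` matches nothing, because the first `p` child (`a`) has no class, while the mistaken reading would pick `b`. The universal case needs the element's own name to decide which siblings count, which is the part that is hard to express in a single XPath 1.0 expression.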
From the docs: