Support Zorba as an alternative XML/HTML processing engine #29

Closed
gerosalesc opened this issue Feb 18, 2016 · 3 comments

Comments

@gerosalesc

This has been on my mind for some time: I would like this project to support a more powerful XML/HTML processing engine as an alternative to lxml. The only contender to lxml that I know of in Python is Zorba. Why?

  • Zorba supports XQuery technology as well as JSONiq.
  • Zorba has Python bindings. I know they are not the best bindings ever, but at least they exist.
  • I think XPath 1.0 is very limited for more complex structures.
  • lxml's extensions are OK, but they fall short of what XQuery offers out of the box.
  • Zorba can be hosted as a service.

Ideally, we should be able to use selectors with Zorba in this way:

Selector(response=response).xquery('...').extract()
or
response.selector.xquery('...').extract()
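
To make the proposed call shape concrete, here is a minimal sketch of a selector that hands the query to a pluggable engine. Everything in it is hypothetical: `XQuerySelector`, `XQueryResult`, and the engine signature are illustration names, not existing APIs of this project, Zorba, or lxml, and the stand-in engine does not evaluate XQuery at all.

```python
from typing import Callable, List

# Hypothetical engine signature: (xquery_text, document_text) -> serialized results.
XQueryEngine = Callable[[str, str], List[str]]


class XQueryResult:
    """Mimics the `.extract()` step of a selector result (hypothetical)."""

    def __init__(self, items: List[str]) -> None:
        self._items = list(items)

    def extract(self) -> List[str]:
        return list(self._items)


class XQuerySelector:
    """Hypothetical selector that delegates XQuery evaluation to an engine."""

    def __init__(self, text: str, engine: XQueryEngine) -> None:
        self.text = text
        self._engine = engine

    def xquery(self, query: str) -> XQueryResult:
        # The engine does the real work; the selector only wraps the result.
        return XQueryResult(self._engine(query, self.text))


def echo_engine(query: str, document: str) -> List[str]:
    """Stand-in engine for demonstration only; it does not evaluate XQuery."""
    return [f"{query} over {len(document)} chars"]


selector = XQuerySelector("<html><body/></html>", engine=echo_engine)
results = selector.xquery("//body").extract()
```

Keeping the engine behind a plain callable would, under these assumptions, let a Zorba-backed or any other XQuery backend be swapped in without touching the selector interface.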

@eliasdorneles
Member

Hello @gerosalesc !

So, to be fair I don't see lxml going away anytime soon, but this looks like a nice optional addition.

I'm not really familiar with Zorba or its bindings, but this seems worth a proof of concept.
Could you please point me to some use cases where supporting XQuery would give the biggest benefits?

Thank you!

@gerosalesc
Author

gerosalesc commented May 6, 2016

@eliasdorneles Hi there, buddy. I have found myself needing some XQuery features when trying to do serious work extracting values from highly complex HTML pages.

Take the FLWOR syntax, for example: that alone would allow us to sort the values of a list of elements. On top of that, you can get more complex structures returned and perform some interesting data comparisons and transformations with the XPath 2.0 functions, which XQuery supports by default.
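
As a rough illustration of the sorting case, the sketch below shows a FLWOR query as a string (not executed here) next to the post-processing that is needed today when the query language can only select nodes. The sample document, the `data-price` attribute, and the variable names are made up for the example, and the standard library's `xml.etree.ElementTree` stands in for the real selector so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, for illustration only.
DOC = """
<ul>
  <li data-price="19.90">Keyboard</li>
  <li data-price="4.50">Cable</li>
  <li data-price="129.00">Monitor</li>
</ul>
"""

# What the query could look like in XQuery (shown for comparison, not run):
FLWOR_QUERY = """
for $item in //li
order by xs:decimal($item/@data-price)
return $item/text()
"""

root = ET.fromstring(DOC)

# A path expression alone can only select the nodes...
items = root.findall(".//li")

# ...so the ordering has to happen in Python afterwards.
names_by_price = [
    li.text for li in sorted(items, key=lambda li: float(li.get("data-price")))
]
print(names_by_price)
```

With an XQuery engine, the selection and the ordering would live in a single query instead of being split between the query and the calling code.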

I understand that we are tightly coupled to lxml, but I think this change would open up a whole world of new possibilities for this library.

For a PoC I see myself using the XQilla bindings because that seems to be easier. BTW, you should consider XQilla as well as Zorba.

Gallaecio changed the title from "Alternatives to Lxml as XML processing engine" to "Support Zorba as an alternative XML/HTML processing engine" on May 9, 2019
@Gallaecio
Member

https://github.com/28msec/zorba seems dead, should we close this?


3 participants