Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QueryBuilder #724

Merged
merged 10 commits into from
Nov 30, 2014
Merged

QueryBuilder #724

merged 10 commits into from
Nov 30, 2014

Conversation

webdevsHub
Copy link
Contributor

Hi, after the ScanAndScoll-Iterator I wrote a QueryBuilder for Elastica!

Goals

  • full DSL support
  • version check (is that command available in my elasticsearch installation?)
  • convenience (autocomplete, links to doc, good exception messages)
  • extendable (add/overwrite parts of the DSL)

Implementation

I took the idea from predis, because they have a similar situation:

  1. Every redis command is represented by a class in predis (like in Elastica: every Query/Filter/... has a class).
  2. They have Profile-Classes (I called them Version) which contains all supported commands of a Version.
  3. To execute a command via Client-Class predis uses __call() magic function to check whether or not that command is supported in configured version.

My implementation is different in some ways:

DSL part classes

The existent command classes in Elastica can be used for QueryBuilder because they have a fluent interface. So I only had to create wrapper classes to easily access/construct these classes.

I created one class for each elasticsearch DSL part (Query, Filter, Aggregation, Suggesters) which constructs every existent Elastica-Object with a convenient method definition. You can find this classes in Elastica\QueryBuilder\DSL.

This is the heart of QueryBuilder and all other stuff is optional. If you don't want the Version-check you could just use this classes:

$queryDSL = new \Elastica\QueryBuilder\DSL\Query();
$queryDSL->match('field', 'value'); // returns Elastica\Query\Match

I added every available method according to elasticsearch documentation to the DSL classes. In case a method is not implemented in Elastica (e.g. query "custom_filters_score") you get a Elastica\Exception\NotImplementedException.

Most of the DSL methods just pipe the required Constructor arguments to the class, but I changed some method definitions for convenience reasons (e.g. Elastica\QueryBuilder\DSL\Query::match())

There is only one single Filter, that causes problems: Elastica\QueryBuilder\DSL\Filter::geo_shape() 2 implementations exists: GeoShapeProvided, GeoShapePreIndexed. Maybe you have an idea to solve this?!

QueryBuilder

Elastica\QueryBuilder takes a Elastica\QueryBuilder\Version object and registers the default DSL objects. You can access this DSL object via __call() (names defined by Elastica\QueryBuilder\DSL::getType()) or via $qb->query() $qb->filter() $qb->agg() $qb->suggest() convenience methods.

Add or overwrite a DSL object with Elastica\QueryBuilder::addDSL()

Facade

Version check is done by Elastica\QueryBuilder\Facade. It wraps a DSL with a Version object and acts like a whitelist firewall. Note that the QueryBuilder convenience methods have a "wrong" @return annotation to get correct autocomplete.

The exception messages in Facade are very important, because they provide exact information about what is not working. For this reason I wrote tests for the exception messages.

Tests

I'm not sure if we should test DSL and Version classes, because they are basicly configurations. The actual work is done by the returned objects and they are tested pretty well. I would like to discuss this!

I added a showcase for QueryBuider usage in Elastica/Test/QueryBuilderTest::testQueryBuilder().

Conclusion

I'm open to critics, even if you want major changes. Lets discuss the QueryBuilder topic without any restrictions, because I think a QueryBuilder is a very important functionallity for a Client library. We can just merge my current implementation or see it as a first proposal and continue working on it.

*
* @return Aggregation
*/
public function agg() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not use shortcuts please, they're shortcut to hell :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

@damienalexandre
Copy link
Contributor

Wow, looks good! I like it 👍

@ruflin
Copy link
Owner

ruflin commented Nov 19, 2014

Wow, that is very nice. I really need to take some time to check this in detail but I don't see anything major that could hold us back. I especially like that is easy extendable to new features.

Can you also update the changes.txt file? And we should get your documentation above to Elastica.io as soon as it is merged.

@fprochazka
Copy link
Contributor

FYI the build is failing, could you please fix that? https://travis-ci.org/ruflin/Elastica/builds/41164326

@webdevsHub
Copy link
Contributor Author

Thank you, appreciate it 😄

There is no rush. I am not going to work on it before weekend anyway.

Regarding geo_shape filter: it could be merged in a single class Elastica\Filter\GeoShape with two methods (something like setProvided and preIndexed)

Regarding tests: best solution would be to write tests for every method in Elastica\QueryBuilder\DSL\* with different arguments than the corresponding class constructor to ensure a consistent API

I would like hearing your opinion on these two open issues

@webdevsHub
Copy link
Contributor Author

@fprochazka build is failing because of OutOfMemoryError exceptions. QueryBuilder tests have been successfully completed

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.28%) when pulling 346ef5b on webdevsHub:QueryBuilder into 5fe627f on ruflin:master.

/**
* @var Version
*/
private $version;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please name all private and protected methods $_version with the underline.

@ruflin
Copy link
Owner

ruflin commented Nov 29, 2014

@webdevsHub I finnaly found the time to review the pull request. I made a comment concerning the variable naming. Outside that, I would like to proceed with merging as soon as possible as I think this will bring enormous value to the project. As soon as this is in the project we should deprecate Elastica\Query\Builder.

About geo_shape: I'm not 100% sure I understand the problem. Why don't you go just with geo_shape_provided as the function name?

We should increase the test set step by step, but have a basic test set ready.

What I also would like to have as soon as we merged it a basic documentation with 2-3 examples on the usage of the query builder on Elastica.io.

@webdevsHub
Copy link
Contributor Author

  • Added geo_shape_provided, geo_shape_preIndexed
  • Added underscore in private/protected variables

I wrote 2 "integration" tests:

  1. Elastica\Test\QueryBuilder\VersionTest to ensure that every method name used in Version classes is actually available in DSL classes (to prevent typos, ... in further contributions)
  2. Elastica\Test\QueryBuilder\DSL\* to avoid/detect breaking changes in further contributions. All method arguments are configured and the test will fail, if something changed there (it could break queries created with QueryBuilder on update). In general: The only allowed return values are a subclass of corresponding abstract class (like Elastica\Query\AbstractQuery) or a Elastica\Exception\NotImplementedException

@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

Why did you choose geo_shape_preIndex and not goe_shape_pre_indexed?

@webdevsHub
Copy link
Contributor Author

@ruflin can change it if you want, but I'm not really happy with that solution at all. DSL classes should reflect elasticsearch methods, not implementation specific stuff in Elastica.

There is only one geo_shape filter in elasticsearch, so the QueryBuilder should also have only one geo_shape filter. This is what I would expect.

@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

I see your point as the QueryBuilder maps to the DSL. The main reason Elastica has two different classes is because the parameters are so different. The question is how you would handle this?

The main problem I see currently is that if someone uses geo_shape it throws an not implemented exception, which is not really true.

Following suggestion:

  • We must find a solution to make geo_shape working
  • There is an "Elastica" extension to the DSL which includes for example geo_shape_provided

@webdevsHub
Copy link
Contributor Author

I think for now it's okay to support "only" geo_shape_provided and geo_shape_pre_indexed. I like that "Elastica" extension point of view, because now it looks like a feature, not a bug 😄

When you are writing a Query and want to use geo_shape filter, you are going to get these 2 options in IDEs autocomplety anyway, so I don't think that will cause any confusion

@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

Ok, then I would suggest to rename it geo_shape_pre_indexed to have the naming consistent.

@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

BTW: Can you add the pull request id to the changes.txt?

@webdevsHub
Copy link
Contributor Author

okay, done!

I would really like to add examples to elastica.io, but I am not going to do this before next weekend. If you want to merge it now, you could use the example I provided in Elastica\Test\QueryBuilderTest (in first commit) until next weekend

@coveralls
Copy link

Coverage Status

Coverage increased (+1.91%) when pulling 8a14240 on webdevsHub:QueryBuilder into 208c928 on ruflin:master.

@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

No worries about the documentation. Do you want to keep preIndexed or should we go with pre_indexed?

@webdevsHub
Copy link
Contributor Author

already changed it to geo_shape_pre_indexed

ruflin added a commit that referenced this pull request Nov 30, 2014
@ruflin ruflin merged commit 171e86a into ruflin:master Nov 30, 2014
@ruflin
Copy link
Owner

ruflin commented Nov 30, 2014

Merged. Let me know when you have time for some documentation on this new feature.

@dbu @merk This could also be interesting for the FOSElasticaBundle.

@dbu
Copy link
Contributor

dbu commented Nov 30, 2014

hey, this looks great! some documentation would be nice of course...

what would FOSElasticaBundle do with the builder? provide it as a service? or something more fancy?

@merk
Copy link
Collaborator

merk commented Nov 30, 2014

We wont need to do anything specific to support the QB, though we'll have to update our documentation to point to using this as the recommended way of handling queries.

@webdevsHub
Copy link
Contributor Author

Hi! You could create a new configuration for elasticsearch Version and construct QueryBuilder with that Version in a service

@jdeniau
Copy link
Contributor

jdeniau commented Jan 21, 2015

Hi @webdevsHub,

This seems really nice, but can you document an example of a "complex" query ?

Thank you

@webdevsHub
Copy link
Contributor Author

@jdeniau please take a look at the new documentation: http://elastica.io/getting-started/search-documents.html#section-query

@jdeniau
Copy link
Contributor

jdeniau commented Jan 21, 2015

Oops sorry I did not see that !

Thank you

On Wed, Jan 21, 2015 at 11:45 AM, Manuel Andreo Garcia <
[email protected]> wrote:

@jdeniau https://github.com/jdeniau please take a look at the new
documentation:
http://elastica.io/getting-started/search-documents.html#section-query


Reply to this email directly or view it on GitHub
#724 (comment).

@ewgRa
Copy link
Contributor

ewgRa commented Jan 13, 2016

On my mind goal "version check" must be achieved not by implementing "Version", but by code assistant from IDE.

Solution like Version allows understand that you use not supported feature only on runtime level. So, there is no way to prevent misuse of Elastica on development time.

Lets imagine we in future and Elasticsearch during a time deprecate 584 queries. When I type $qb->query()-> - I will see all 584 queries and must understand which one I can use or not?

@ewgRa
Copy link
Contributor

ewgRa commented Jan 13, 2016

For example sooner functionality will be dropped from Elasticsearch and from Elastica. From QueryBuilder it will be also dropped anyway.
And we return to situation when Version kind of strange thing - ability to use this or another functionality defined by code base.

Good case now - deprecation. We need to show user that query is deprecated now, but still supported by ES.

If developer will use Version140 - he will be wondering - why he have deprecation message, because it is deprecated in 1.7.0. It give unexpected behavior.

@ewgRa
Copy link
Contributor

ewgRa commented Jan 13, 2016

Since each Elastica release declare with which one Elasticsearch version it compatible - can you show real case of usage Version?

@webdevsHub
Copy link
Contributor Author

Let me first try do understand your question and issues before getting into discussion:

I added that version check because of the very fast changing Elasticsearch API (mostly in between 0.9 and 1.*) and because there are so many different versions used by developers out there.

The old way to handle that was: just use the corresponding Elastica version released for a specific Elasticsearch version. But those old versions are no longer maintained so you do not get bug fixes or new features.

Version check is intended to find misuse because you construct QueryBuilder with a specific ES version and it throws exceptions if you use something not supported in that version. Now you can use latest release of Elastica with old Elasticsearch DSL.

You are right: that exception is thrown on runtime level + @deprecated might be confusing if you do not read annotation message and DSL classes stacks up "deprecated waste".

That approach will no longer work when Elasticsearch change all query names (like your worst case with 584 deprecated queries) or when a new version is completly incompatible with old ones (like a query JSON structure change).

@ewgRa Have I understood that properly?

@ewgRa
Copy link
Contributor

ewgRa commented Jan 13, 2016

@webdevsHub Thank for detailed explanations and yes, you focus on my concerns here.

My point is:

  1. Version not prevent misuse. You can understand misuse only at runtime, it is not helpful. QueryBuilder must help, but now it is suggest wrong directions.
  2. Problems with deprecated and removed functionality.

@webdevsHub
Copy link
Contributor Author

So you want to remove all version stuff and change DSL classes to reflect current Elasticsearch API?

@ewgRa
Copy link
Contributor

ewgRa commented Jan 13, 2016

I can suggest to do this, if another developers will agree with me.

@ruflin
Copy link
Owner

ruflin commented Jan 13, 2016

Thanks for the details @webdevsHub @ewgRa . I would pose the question a little bit differently: Does it with Elastica 3.0.0 still make sense to keep the versions or is it just an overhead we have to maintain. My opinion is that we should have probably removed all the old versions with the release of 3.0.0 but unfortunately I missed it. As Elastica is decoupled from the elasticsearch versioning, major changes are minor or major version numbers. So when we break the QueryBuilder, there is probably also a new release out. And functions are mainly only added, rarely removed.

@webdevsHub What is your opinion on the versions? Are they still needed?

@webdevsHub
Copy link
Contributor Author

ewgRa pointed out that all this version stuff does not work + create problems. So lets just delete that. The only solution to ewgRa 2. problem would be to create DSL classes for each version in different repositories and let users replace DSL classes via composer; or create a IDE plugin :)

Removing version system + change DSL classes to current ES API (which includes useful deprecation annotations) is the best option in my opinion

@webdevsHub
Copy link
Contributor Author

I can create a pull request to remove version stuff. About DSL classes change: I am not up to date with ES API 2.* (I am still working only on 1.*), is there any overview of what is new, deprecated, ... without having to compare different ES doc HTML?

@ruflin
Copy link
Owner

ruflin commented Jan 14, 2016

@webdevsHub A PR would be great. I haven't seen a very simple comparison yet. Just go with what we have now. More functions can always be added.

@ruflin ruflin mentioned this pull request Jan 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants