Move away from orConditionGroup to UNION-ized queries
#1068
Some other small optimizations.
Was looking at other callers of the original method; however, I missed that the caller was a DGI-specific module. No need to roll such a method into the core code... not that it would really hurt per se, just not wanting to add a method for it at the moment.
Also, handle other return-types.
I tested this PR locally and 100 calls to
for SELECT base_table.revision_id AS revision_id, base_table.tid AS tid
FROM
taxonomy_term_data base_table
INNER JOIN taxonomy_term_field_data taxonomy_term_field_data ON taxonomy_term_field_data.tid = base_table.tid
WHERE taxonomy_term_field_data.tid IN (SELECT f.entity_id AS entity_id
FROM
taxonomy_term__field_authority_link f
WHERE f.field_authority_link_uri = 'http://purl.org/coar/resource_type/c_18cc' UNION SELECT f.entity_id AS entity_id
FROM
taxonomy_term__field_external_uri f
WHERE f.field_external_uri_uri = 'http://purl.org/coar/resource_type/c_18cc')
GROUP BY base_table.revision_id, base_table.tid
LIMIT 1 OFFSET 0;
This will have the same result as running N queries, where N is the number of external URI fields in the site config, which was the original issue with #1067 (comment)
Unsure what you're trying to say. The same result is expected; however, it's the actual execution of the query with which I'm concerned. Like, sure there will be N parts under the That said, I am unable to see the same query performance you're seeing between any of:
Oh, I see. Yeah, that's reasonable. I thought you were concerned about running N queries vs. just one in a general sense. Yeah, makes sense, since Drupal can tack on a lot of extras on a per-query basis.
With this script
<?php
$i = 0;
while (++$i < 100) {
  $term = \Drupal::service('islandora.utils')->getTermForUri("http://purl.org/coar/resource_type/c_18cc");
}
on a fresh install I get
That's because there's pretty much nothing in On a fresh isle-site-template install I imported
So I think we still have a performance issue with this PR as-is.
I mean, sure, but between either here or the
On the other hand, maybe the method in question should be avoided, since it isn't really confined to a given taxonomy and/or field, which is the root cause of the complex query. For example, if you're looking for a model term, you could query more directly against the single field of the model taxonomy, instead of calling this method, which queries against all authority link/external_uri fields for all taxonomies. (this already being a part of the implementation, just highlighting that for the signature of the method, either the present
Somewhat along the same line, the idea of using the same URI being possible, but varying access control on the terms meaning different users can get different terms (from different taxonomies!) when they perform a lookup seems like a loaded foot-gun.
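The narrower lookup suggested above (one field, one taxonomy) can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not Islandora or Drupal code. The table layout is an assumption modeled on the field table named in the query quoted earlier, and the vocabulary names, term IDs, and the `get_term_in_vocabulary` helper are hypothetical.

```python
import sqlite3

URI = "http://purl.org/coar/resource_type/c_18cc"

# Assumed layout: one field table with a "bundle" column holding the vocabulary.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxonomy_term__field_external_uri ("
    "bundle TEXT, entity_id INTEGER, field_external_uri_uri TEXT)"
)
# Hypothetical data: the same URI is attached to terms in two vocabularies.
conn.executemany(
    "INSERT INTO taxonomy_term__field_external_uri VALUES (?, ?, ?)",
    [("islandora_models", 10, URI), ("resource_types", 20, URI)],
)


def get_term_in_vocabulary(conn, vocabulary, uri):
    """Return the term ID for `uri` within one vocabulary, or None."""
    row = conn.execute(
        "SELECT entity_id FROM taxonomy_term__field_external_uri "
        "WHERE bundle = ? AND field_external_uri_uri = ? "
        "ORDER BY entity_id LIMIT 1",
        (vocabulary, uri),
    ).fetchone()
    return row[0] if row else None
```

Because the caller names the vocabulary, the query touches a single table and the result no longer depends on which of several terms sharing the URI happens to be matched first.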
I mostly agree with your concerns, and the solution(s) probably need to go into a 3+ release (and we should carve out an issue for those concerns).
As for today, for those running on 2.x: given that the getTermForUri function is on a hot path, I think we should go with "faster is better" and use #1067, and consider adding a fix similar to the media/file fix you added in this PR.
Somewhat along the same line, the idea of using the same URI being possible, but varying access control on the terms meaning different users can get different terms (from different taxonomies!) when they perform a lookup seems like a loaded foot-gun.
This would be a great thing for a status report warning @ /admin/reports/status and/or in documentation
Is non-sensical on a draft PR accompanying the recommendation to adopt another PR?
GitHub Issue: (link)
What does this Pull Request do?
There have been issues reporting slowness in queries, which were being patched by effectively avoiding the orConditionGroup, so it seems we want to avoid them. At the same time, submitting the queries separately can be avoided by constructing the set of results by UNION-ing all the queries together, so let's make it so.
What's new?
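As a rough illustration of the idea described above (not the code in this PR), the sketch below builds one UNION subquery from a list of field tables and checks that it returns the same term IDs as querying each table separately. It uses Python with an in-memory SQLite database purely to stay self-contained. The table and column names follow the query quoted in the conversation, while the sample rows, URIs, and helper names are hypothetical.

```python
import sqlite3

# Field tables and their URI columns, named after the quoted query.
FIELD_TABLES = {
    "taxonomy_term__field_authority_link": "field_authority_link_uri",
    "taxonomy_term__field_external_uri": "field_external_uri_uri",
}


def make_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE taxonomy_term_field_data (tid INTEGER PRIMARY KEY)")
    for table, column in FIELD_TABLES.items():
        conn.execute(f"CREATE TABLE {table} (entity_id INTEGER, {column} TEXT)")
    conn.executemany(
        "INSERT INTO taxonomy_term_field_data (tid) VALUES (?)", [(1,), (2,), (3,)]
    )
    conn.execute(
        "INSERT INTO taxonomy_term__field_authority_link VALUES (1, 'http://example.org/a')"
    )
    conn.executemany(
        "INSERT INTO taxonomy_term__field_external_uri VALUES (?, ?)",
        [(2, "http://example.org/a"), (3, "http://example.org/b")],
    )
    return conn


def lookup_separately(conn, uri):
    """N queries: one per field table, results merged in application code."""
    tids = set()
    for table, column in FIELD_TABLES.items():
        rows = conn.execute(f"SELECT entity_id FROM {table} WHERE {column} = ?", (uri,))
        tids.update(row[0] for row in rows)
    return sorted(tids)


def lookup_union(conn, uri):
    """One query: the per-table SELECTs are combined with UNION in a subquery."""
    parts = [
        f"SELECT f.entity_id FROM {table} f WHERE f.{column} = ?"
        for table, column in FIELD_TABLES.items()
    ]
    sql = (
        "SELECT t.tid FROM taxonomy_term_field_data t WHERE t.tid IN ("
        + " UNION ".join(parts)
        + ") ORDER BY t.tid"
    )
    # The same URI is bound once per UNION branch.
    return [row[0] for row in conn.execute(sql, [uri] * len(parts))]
```

Both helpers return the same IDs for a given URI. The difference is that the UNION form makes a single round trip, which matters if per-query overhead dominates.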
An in-depth description of the changes made by this PR. Technical details and
possible side effects.
(i.e. Regeneration activity, etc.)?
How should this be tested?
A description of what steps someone could take to:
Documentation Status
Additional Notes:
Any additional information that you think would be helpful when reviewing this
PR.
Interested parties
Tag (@ mention) interested parties or, if unsure, @Islandora/committers