-
-
Notifications
You must be signed in to change notification settings - Fork 825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dedupe finder performance issue on looking up table size #25527
Merged
eileenmcnaughton
merged 1 commit into
civicrm:master
from
eileenmcnaughton:finder_build
Mar 10, 2023
Merged
Fix dedupe finder performance issue on looking up table size #25527
eileenmcnaughton
merged 1 commit into
civicrm:master
from
eileenmcnaughton:finder_build
Mar 10, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(Standard links)
|
Hmm - well if that is the test that hits it it is not where I expected .... CRM_Contact_Import_Parser_ContactTest.testIgnoreLocationTypeId |
eileenmcnaughton
force-pushed
the
finder_build
branch
2 times, most recently
from
February 8, 2023 22:11
4e4f886
to
bd83298
Compare
eileenmcnaughton
force-pushed
the
finder_build
branch
3 times, most recently
from
February 9, 2023 06:15
3c46200
to
35f33db
Compare
wmfgerrit
referenced
this pull request
in wikimedia/wikimedia-fundraising-crm
Feb 16, 2023
Note we have some recent patches that are included. These are mostly ones merged to civicrm-core since forking for 5.58. One is in 5.58 now so is listed for completeness & 2 minor ones are not yet merged upstream |pr|status|notes| |[25565](https://github.com/civicrm/civicrm-core/pull/25565)|merged to master(5.60)|Supports T327963 - activity types for thank you| |[25516](https://github.com/civicrm/civicrm-core/pull/25516)|merged to master(5.60)|Helps import - filter imported searches| |[25527](https://github.com/civicrm/civicrm-core/pull/25527)|not yet merged| but performance fix for dedupe searches (the name & address ones Sandra is doing) |[25433](https://github.com/civicrm/civicrm-core/pull/25433)|merged to rc (5.59)|fix for search kit display| |[25482](https://github.com/civicrm/civicrm-core/pull/25482)|Already in 5.58.1|Fix for search kit display| |[25418](https://github.com/civicrm/civicrm-core/pull/25418)|Merged to rc (5.59)|Permanent imap timout fix| |[25226](https://github.com/civicrm/civicrm-core/pull/25226)|Merged to rc (5.59)|ReportInstance api - in our install| |[25568](https://github.com/civicrm/civicrm-core/pull/25568)|not yet merged|Helps with new import code, reduces chance of crash| |[24515](https://github.com/civicrm/civicrm-core/pull/25415)|Merged to rc (5.59)| Fix to stop CI from breaking, replaces our hack| List of cherry picked commits from the above 18d82af27b9 (HEAD -> master) Filter expired searches from search kit results a75b98cb47a Add test to is_current on SavedSearch, fix e4d69230ad1 Increase timeout on imap 89ab855619f Do not crash the whole SearchKit UI if one entity fails 0142e5dd18c Fix performance on deciding which tables to query cd710af9733 dev/core#4117 Add is_current to UserJob, Search 2fa17fb88ae Make activity_type_id available to links, test 2d2bd93cda8 (dev/core#4088) ClassScanner - Move unit-test registration e3e669d4556 Add Report Instance apiv4 Bug: T329681 Change-Id: I839d1bbfabfe3558094f7445a4281f237b609fed
eileenmcnaughton
force-pushed
the
finder_build
branch
from
March 8, 2023 00:17
35f33db
to
bafcc24
Compare
we've been running this in production for several weeks now |
I'm happy with this between the |
test this please |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
Fix dedupe finder performance issue on looking up table size
Before
Our dedupe queries seem to be particularly slow, rule dependent, and I found that the the issue was slow
SELECT COUNT(*) FROM civicrm_address etc
queries. We recently fixed some of these in other place (SnapShot, an upgrade script). At the time I was under the impression the technique used here also had performance issues - ieHowever, I tested that query and it is not slow
414 rows in set, 1 warning (0.103 sec)
And is quicker for less tables.
I also traced back the origin of the code comment to that effect & nothing in there seems to back it up
#8636
https://issues.civicrm.org/jira/browse/CRM-19027
After
Use of a faster function, which we can also use from those other places
Technical Details
This is very hard to access under r-run as it deals with the ordering of queries for a perceived performance improvement - however I could step through it with
CRM_Contact_Import_Parser_ContactTest.testIgnoreLocationTypeId
and per the screenshots you can see that it re-ordered the array to put thecivicrm_address
query first based on the table size being smallerOriginal sort order
See in the variable
$tableQueries
the query involving the contact table is firstIt decides the the table lengths
See in the
$cachedResults
there are more rows in the contact table**And it re-orders based on that **
See in the variable
$tableQueries
the query involving the (shorter) address table is now firstComments