-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements and various fixes #104
Conversation
Track queries, time. Fire actions to allow for environment-specific adjustments like sending DB reads to masters, autocommit=0, etc.
Non-strings fail the test in update_metadata() where get_metadata() is used to see if an update is needed.
Simplify and centralize term creation to avoid repeated term_exists(). Watch for changes to terms and meta before deciding whether to update the post. (Updating the post bumps post_modified, which we do want, but don't want on every import.) Also fixes package/subpackage assignments which were overwriting each other.
Nope on skipping anything. As previously discussed we might not use ignored stuff in reference (which I am against of personally), but Parser should provide as complete data as possible. |
Sorry, but we can't do that at present. See #92. We need to change the way we handle duplicates (#94) Most of the rest is probably great though.
Maybe this could be an option, but I agree with @Rarst, I think that in general the parser should provide as much data as possible (within reason) and let the user decide what to ignore further down the line. But the option to speed things up by skipping |
Ahh, I see this is handled in nacin/WP-Parser@05b013d by just skipping the duplicates. I think we want to know the locations of every place that the hook is used though. So instead of just skipping them we should save the file and line numbers. |
@Rarst: We have @JDGrimes: I avoided the issues switching to post_name by not importing secondary instances of a hook for now. That fixes the first half of #94. There are still some problematic situations (about 15 total) but all of them are issues of documentation or the parser (it doesn't detect when we break out of PHP and back again between the doc block and the hook), and are not to be imported. The second half of #94 would be to actually make note of these "duplicate" locations, which is low priority for actually launching this thing. As a general note: This PR is designed to make the importer both faster and also fairly idempotent. Once the database has been populated initially, running it again should result in no INSERT or UPDATE queries. Previously it was churning absolutely everything. |
Another aspect: by preventing churn, that means post_modified would actually be accurate and can be used to indicate and track recent updates. Previously all posts were getting updated even if no data (including meta or tax) were changing. This PR handles that. |
👍 I see now that we are on the same page (I should've finished looking at the whole PR before commenting. 😊). |
Ah, right. As per #16 I am for opt-out, rather than current opt-in about internal.
Makes sense in core context, however parser is not just for core. Can't know how other people are possibly using the tag. Also the ignored tag doesn't seem to be part of PSR-5 spec at all. As for me that's argument against special handling for it, more so against "lossy" one. |
Performance improvements and various fixes
Merging, but I'll look more at handling ignore part later. |
These commits do a number of things: