Free Listings + Paid Ads: Extend product statistics API to return the number of syncable products #1667
Thanks for the PR. It's working and I'm getting the results as expected. However, I have 500 products on my site and I'm already seeing slow responses, so I added some suggestions for faster product queries.
I was expecting some more unit tests to break with these changes, but it seems we don't have unit tests for ProductStatisticsController or ProductSyncStats. Since we want to keep this moving I've logged a separate issue to get that done. See #1678
src/Jobs/ProductSyncStats.php
 * @return int
 */
public function get_syncable_products_count(): int {
	$products = $this->product_repository->find_sync_ready_products()->get();
If all we are interested in is the count, then maybe we need to make a new helper function find_sync_ready_product_ids. Otherwise, by default it will return objects, which means one DB query for the set of products and many more per product to get the full product data.
An even better way would be to create a helper function specifically for returning the total count. See the pagination section here: https://github.com/woocommerce/woocommerce/wiki/wc_get_products-and-WC_Product_Query#parameters
This is accomplished by sending the following args to the product query:
$args = [
'return' => 'ids',
'paginate' => true,
'limit' => 1,
];
And then getting the total from $results->total. That prevents it from having to assemble a large collection of IDs and then count them all.
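A minimal sketch of such a count helper, assuming WooCommerce is loaded so that wc_get_products() is available. The function name example_count_products() is hypothetical and not part of the plugin. Note that this counts the rows matched by the query only, before any per-product filtering.

```php
<?php
// Sketch only: assumes WooCommerce is loaded so that wc_get_products() exists.
// The function name example_count_products() is hypothetical.
function example_count_products( array $extra_args = [] ): int {
	// Later keys win in array_merge(), so callers cannot override the pagination settings.
	$args = array_merge(
		$extra_args,
		[
			'return'   => 'ids', // Avoid building full WC_Product objects.
			'paginate' => true,  // Ask for the totals alongside the page of results.
			'limit'    => 1,     // Fetch a single row; only the total is used.
		]
	);

	$results = wc_get_products( $args );

	return (int) $results->total;
}
```

Any extra query arguments, such as a status restriction, would be passed through $extra_args.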
On further inspection, this is more of an issue than I expected. The initial DB query only gives a rough estimate; we then apply filters to each product, which lets extensions and special conditions exclude products as well.
So we'll have to find a balance between returning a rough estimate (without filtering) and retrieving the products in batches (so we can apply filtering per batch). We might need some caching of the total so we don't do this too often.
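The batch option could look roughly like the sketch below. It is written against plain callables rather than the plugin's classes: $fetch_batch and $is_included are hypothetical stand-ins for the product query and the sync-ready filters, and the default batch size is an arbitrary choice.

```php
<?php
/**
 * Count items that pass a filter, loading them in fixed-size batches.
 *
 * @param callable $fetch_batch fn( int $offset, int $limit ): array of items (hypothetical).
 * @param callable $is_included fn( $item ): bool, standing in for the sync-ready filters.
 * @param int      $batch_size  Number of items to request per batch.
 */
function example_count_in_batches( callable $fetch_batch, callable $is_included, int $batch_size = 200 ): int {
	$batch_size = max( 1, $batch_size ); // Guard against a zero or negative batch size.
	$count      = 0;
	$offset     = 0;

	do {
		$batch = $fetch_batch( $offset, $batch_size );
		foreach ( $batch as $item ) {
			if ( $is_included( $item ) ) {
				$count++;
			}
		}
		$offset += $batch_size;
		// A short (or empty) batch means the source is exhausted.
	} while ( count( $batch ) === $batch_size );

	return $count;
}
```

Only one batch is referenced at a time, although that alone does not bound memory if a lower layer caches every loaded product.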
Thanks for catching this performance problem and providing solutions for it.
❓ I get these results:
How come syncable products are 22 and disapproved + not synced are 21?
So I was finally able to test with 1000 products. When I ran the call in step 4, I got a fatal error after a 10-15 second wait.
Good question. I think @ianlin also identified some discrepancies in the number of counted products. It could be helpful if you check which product is not being included, to see if we can track this down further. It's probably best to move it to a separate issue.
Based on that error message, your allowed memory size is set at 128MB, which is below the minimum recommendations. I initially didn't want to make the batch size too small, but can you try changing it to something like 200 here? We might need to keep it at a more conservative number in case there are servers with limited resources.
Yeah, this problem was listed as a sub-task in the backend task 1 of the epic issue here.
Can you just confirm whether changing the batch size worked for you to clear up the out of memory errors?
I was doing some additional testing and was able to reproduce the out of memory error by reducing the memory limit to 128MB and increasing the number of products I have. Using a smaller batch size does not fix the issue. Apparently, somewhere along the line it's not clearing up the previous set of products, which I think has to do with populating the caches, but it would need some further digging. I also noticed that the
I did some tests and here's my result:
The only case where I encountered the OOM problem was right after I created 100 products at once using wc-smooth-generator and called the API without the cache. I think it's because the GMC product sync jobs were scheduled when the products were created, so calling the product statistics API at that point would cause OOM. If I manually added the cache to my local DB, the OOM issue didn't occur and the API responded correctly. After all the GMC product sync jobs were completed, the OOM issue didn't occur even when the cache didn't exist. As a result I think it'd be acceptable for setting
I just found a weird issue that I got |
Can you clarify what you mean by adding the cache to your local DB? I traced the memory usage and it seems to be a combination of the WP meta cache as well as the Here is an overview of memory usage (logging offset, count, peak memory usage, current memory usage)
I can reduce the memory usage by calling
The WC cache does it per product, so I can't find one function to clear all caching. We could try clearing it in a loop but that would slow down the call a bit more.
The query retrieves the product types |
Sorry for the confusion, what I really meant is the transient ( Based on discussion during the dev call, we decided to close this PR and:
For the discrepancies in the number of counted products, per Mik's finding here (#1667 (comment)), we need to group the product with type
Since we want to run the same filter, the grouping needs to be done after the filter. So my thought would be for the Job storing the count to store (unique) included IDs, instead of storing a count as it runs through the batches. This is because batch one could contain 5 variations of the same product and batch two could contain another 5. If we store the parent ID as counted, it won't be counted twice.
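To illustrate the idea of storing unique included IDs across batches, here is a small self-contained sketch. The function name and the $parent_of map (variation ID to parent ID) are hypothetical; in the plugin the parent would come from the product itself.

```php
<?php
/**
 * Merge one batch of included product IDs into the running set,
 * replacing each variation with its parent so a parent is counted once.
 *
 * @param array $seen      Running set, keyed by product ID.
 * @param array $batch_ids IDs in this batch that passed the filter.
 * @param array $parent_of Hypothetical map of variation ID => parent ID.
 */
function example_add_batch( array $seen, array $batch_ids, array $parent_of ): array {
	foreach ( $batch_ids as $id ) {
		// Simple products are not in the map, so they count as themselves.
		$counted_id          = $parent_of[ $id ] ?? $id;
		$seen[ $counted_id ] = true;
	}

	return $seen;
}

// Variations 11, 12 and 13 belong to product 1; variation 21 belongs to product 2.
$parent_of = [ 11 => 1, 12 => 1, 13 => 1, 21 => 2 ];

$seen = [];
$seen = example_add_batch( $seen, [ 11, 12, 5 ], $parent_of ); // Batch one.
$seen = example_add_batch( $seen, [ 13, 21, 6 ], $parent_of ); // Batch two.

// Products 1, 5, 2 and 6 are each counted once.
$syncable_count = count( $seen );
```

Keying the set by ID keeps the duplicate check cheap, at the cost of holding every counted ID until the last batch has run.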
Changes proposed in this Pull Request:
This PR implements backend task 1 (📌 Extends mc/product-statistics API to return # of syncable products) from the epic issue #1610. Here is what has changed:
Extend mc/product-statistics to return a new field, syncable_products, in the response.
To get syncable_products, we use the existing method find_sync_ready_products in src/Product/ProductRepository.php.
Google account connected and Merchant Centre setup complete
Previously, when calling the mc/product-statistics API:
It responded 401 Google account is not connected if the option gla_google_connected was not set.
It responded 400 Merchant Center account is not set up if the option gla_mc_setup_completed_at was not set; that option is set by calling mc/settings/sync.
In step 4 of the new onboarding flow (fqR0EHi63lWahRcVTKCcba-fi-449%3A80562), we haven't called the mc/settings/sync API yet, so the option gla_mc_setup_completed_at is not set. If we call the mc/product-statistics API at this stage, it responds 400 Merchant Center account is not set up.
This PR also modifies the above behaviour: if the Merchant Center account is not set up (i.e. the option gla_mc_setup_completed_at is not set), the API will still respond 200, with data containing syncable_products and an empty statistics.
Detailed test instructions:
wp-admin/admin.php?page=wc-admin&path=/google/settings.
Send a GET request to mc/product-statistics; it should respond 401 Google account is not connected.
Complete your campaign with paid ads.
Send a GET request to mc/product-statistics, the response should be something like this:
Send a GET request to mc/product-statistics again, the response should be something like this:
Bonus test
Repeat the same tests as above on a site with a large number of products (500+). You can use Smooth Generator to generate a set of test products. Note that this might generate some unsyncable products, so the total syncable product count might vary.
The expected result is for the request to take longer, however it should still end up with an accurate count for the syncable products. It's stored in a transient, so the second time round should be faster.
The UI is going to be adjusted to show a message like "calculating product count" so it's not an issue if the request takes a little longer to complete.
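As a rough illustration of why the second request is faster, the sketch below caches a computed count in a transient. It assumes WordPress is loaded so that get_transient() and set_transient() exist; the function name, the transient key and the one hour expiry are hypothetical and not taken from the plugin.

```php
<?php
// Sketch only: assumes WordPress is loaded so get_transient() and set_transient() exist.
// The function name, transient key and default expiry are hypothetical.
function example_get_cached_count( callable $compute, int $ttl = 3600 ): int {
	$key    = 'example_syncable_products_count';
	$cached = get_transient( $key );

	// get_transient() returns false when the value is missing or expired.
	if ( false !== $cached ) {
		return (int) $cached;
	}

	// Slow path: run the expensive count once, then store it for later requests.
	$count = (int) $compute();
	set_transient( $key, $count, $ttl );

	return $count;
}
```

The first call pays for the full count; later calls within the expiry window read the stored value instead.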