Performance 2020 queries #1091
Conversation
…hive.org into performance-sql-2020
Thanks @max-ostapenko! Could you edit the PR description to include a checklist of the metrics needed by the chapter and check off the ones implemented in this PR so far? This will help us see at a glance how much work is still left to do.
(See #1091 (comment))
Thanks @max-ostapenko and @dooman87. I've moved the checklist up to the top of the PR.
@max-ostapenko @dooman87 the list looks good to me with the exception of changing
@thefoxis thanks for reviewing the checklist. I've updated it on the ticket. I'm currently trying to add a query for the "Average/median changes in performance score between versions" metric and I'm struggling to understand what it should look like. Could you please explain what the result of the query would look like? We already have two metrics that measure LH5 vs LH6 performance scores:
* Using last year's dataset to show the distribution of LH performance scores
Not sure if this was my original comment, but to see the distribution of LH audit scores, I think this approach is straightforward and a bit more versatile; it counts the # and % of pages having each score. So we would query the raw distribution data and sum up those pages in the results sheet so they're grouped in fast/avg/slow buckets as needed, and/or we can chart the full distribution if that'd be interesting.
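To make the proposed aggregation concrete, here is a small illustrative sketch (not from the PR) of the same count-and-percentage logic in plain JavaScript, run on a hypothetical array of per-page Lighthouse performance scores:

```javascript
// Illustrative sketch: given an array of per-page Lighthouse performance
// scores (0-100), count the number and share of pages at each score,
// mirroring the distribution query described above. The input data here
// is hypothetical.
function scoreDistribution(scores) {
  const total = scores.length;
  const counts = new Map();
  for (const s of scores) {
    counts.set(s, (counts.get(s) || 0) + 1);
  }
  // Return rows sorted by score: { score, pages, pct }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([score, pages]) => ({ score, pages, pct: pages / total }));
}
```

The resulting rows can then be summed into fast/average/slow buckets in the results sheet, or charted as the full distribution.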
For this query I do think it makes sense to represent the distribution in terms of percentiles. For min and max you can use the 0 and 100th percentiles. The rest can be summarized by the 10, 25, 50 (median), 75, and 90th percentiles. Kind of like this query except we'd be distributing the difference of
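As a sketch of the suggested percentile summary, the following illustrative JavaScript (assuming a hypothetical array of per-page LH6-minus-LH5 score deltas) computes the 0/10/25/50/75/90/100th percentiles using a simple nearest-rank method, roughly what `APPROX_QUANTILES` would produce in BigQuery:

```javascript
// Illustrative sketch: summarize an array of score deltas by the
// percentiles suggested in the review (0 = min, 100 = max).
// Uses nearest-rank percentiles on the sorted data; input is hypothetical.
function percentileSummary(deltas, pcts = [0, 10, 25, 50, 75, 90, 100]) {
  const sorted = [...deltas].sort((a, b) => a - b);
  const out = {};
  for (const p of pcts) {
    const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length));
    out[`p${p}`] = sorted[idx];
  }
  return out;
}
```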
SGTM. Let me know if you'd like help with any of these queries.
Also note that the
…vs LH6 scores comparison
Thanks @rviscomi, it all makes perfect sense to me now. I've updated the queries as you explained above. Please have a look, and if it's all good we can start filling in the spreadsheets.
Thanks everyone, we're very close. I just have some final feedback then we should be good to go.
In addition to these suggestions I'd also like to see one more query based on @bazzadp's PWA audits query that counts the percent of non-zero scores for each LH audit in the category:
#standardSQL
# Get summary of all lighthouse scores for a category
# Note: scores, weightings, groups and descriptions may be off in mixed months when a new version of Lighthouse rolls out
CREATE TEMPORARY FUNCTION getAudits(report STRING, category STRING)
RETURNS ARRAY<STRUCT<id STRING, weight INT64, audit_group STRING, title STRING, description STRING, score INT64>> LANGUAGE js AS '''
var $ = JSON.parse(report);
var auditrefs = $.categories[category].auditRefs;
var audits = $.audits;
$ = null;
var results = [];
for (const auditref of auditrefs) {
  results.push({
    id: auditref.id,
    weight: auditref.weight,
    audit_group: auditref.group,
    title: audits[auditref.id].title,
    description: audits[auditref.id].description,
    score: audits[auditref.id].score
  });
}
return results;
''';
SELECT
audits.id AS id,
COUNTIF(audits.score > 0) AS num_pages,
COUNT(0) AS total,
COUNTIF(audits.score > 0) / COUNT(0) AS pct,
APPROX_QUANTILES(audits.weight, 100)[OFFSET(50)] AS median_weight,
MAX(audits.audit_group) AS audit_group,
MAX(audits.description) AS description
FROM
`httparchive.lighthouse.2020_08_01_mobile`,
UNNEST(getAudits(report, "performance")) AS audits
WHERE
LENGTH(report) < 20000000 # necessary to avoid out of memory issues. Excludes 16 very large results
GROUP BY
audits.id
ORDER BY
median_weight DESC,
id
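Before running the full BigQuery job, the JS UDF logic can be sanity-checked locally in Node on a hand-made mock Lighthouse report (the mock data below is hypothetical, for illustration only):

```javascript
// Standalone copy of the getAudits UDF logic, runnable in Node.
function getAudits(report, category) {
  const $ = JSON.parse(report);
  const auditrefs = $.categories[category].auditRefs;
  const audits = $.audits;
  const results = [];
  for (const auditref of auditrefs) {
    results.push({
      id: auditref.id,
      weight: auditref.weight,
      audit_group: auditref.group,
      title: audits[auditref.id].title,
      description: audits[auditref.id].description,
      score: audits[auditref.id].score
    });
  }
  return results;
}

// Hypothetical mock report, not real HTTP Archive data.
const mockReport = JSON.stringify({
  categories: {
    performance: {
      auditRefs: [{ id: 'first-contentful-paint', weight: 15, group: 'metrics' }]
    }
  },
  audits: {
    'first-contentful-paint': {
      title: 'First Contentful Paint',
      description: 'Marks the time at which the first text or image is painted.',
      score: 1
    }
  }
});
```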
Co-authored-by: Barry Pollard <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
A few outstanding review comments remain; otherwise this is ready to merge and we can unblock the first draft. @dooman87 @max-ostapenko would either of you be able to resolve the last of the feedback before the weekend?
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
@max-ostapenko Thanks a lot for fixing my stuff. I've just managed to find some time to look through the changes. It all looks good to me, so let's get them in!
@dooman87 Yeah, I'm adding data into the charts now. Looking forward to seeing how the results look with the new data.
Ship it!
Progress on #905
- distribution of LH performance category scores
- LH audits
- LCP
- FID
- CLS
- FCP
- TTFB
- Field Data
- new PerformanceObserver in JS (source)