Performance 2020 queries #1091
Conversation
…hive.org into performance-sql-2020
Thanks @max-ostapenko! Could you edit the PR description to include a checklist of the metrics needed by the chapter and check off the ones implemented in this PR so far? This will help us see at a glance how much work is still left to do.
(See #1091 (comment))
Thanks @max-ostapenko and @dooman87. I've moved the checklist up to the top of the PR.
@max-ostapenko @dooman87 the list looks good to me with the exception of changing
@thefoxis thanks for reviewing the checklist. I've updated it on the ticket. I'm currently trying to add a query for the "Average/median changes in performance score between versions" metric and I'm struggling to understand what it should look like. Could you please explain what the result of the query would look like? We already have two metrics that measure LH5 vs LH6 performance scores:
* Using last year's dataset to show the distribution of LH performance scores
Not sure if this was my original comment, but to see the distribution of LH audit scores, I think this approach is straightforward and a bit more versatile; it counts the # and % of pages having each score. So we would query the raw distribution data and sum up those pages in the results sheet so they're grouped in fast/avg/slow buckets as needed, and/or we can chart the full distribution if that'd be interesting.
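To make the proposed aggregation concrete, here is a small illustrative sketch (not from the PR) of the same count-and-percentage logic in plain JavaScript, run on a hypothetical array of per-page Lighthouse performance scores:

```javascript
// Illustrative sketch: given an array of per-page Lighthouse performance
// scores (0-100), count the number and share of pages at each score,
// mirroring the distribution query described above. The input data here
// is hypothetical.
function scoreDistribution(scores) {
  const total = scores.length;
  const counts = new Map();
  for (const s of scores) {
    counts.set(s, (counts.get(s) || 0) + 1);
  }
  // Return rows sorted by score: { score, pages, pct }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([score, pages]) => ({ score, pages, pct: pages / total }));
}
```

The resulting rows can then be summed into fast/average/slow buckets in the results sheet, or charted as the full distribution.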
For this query I do think it makes sense to represent the distribution in terms of percentiles. For min and max you can use the 0 and 100th percentiles. The rest can be summarized by the 10, 25, 50 (median), 75, and 90th percentiles. Kind of like this query except we'd be distributing the difference of
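As a sketch of the suggested percentile summary, the following illustrative JavaScript (assuming a hypothetical array of per-page LH6-minus-LH5 score deltas) computes the 0/10/25/50/75/90/100th percentiles using a simple nearest-rank method, roughly what `APPROX_QUANTILES` would produce in BigQuery:

```javascript
// Illustrative sketch: summarize an array of score deltas by the
// percentiles suggested in the review (0 = min, 100 = max).
// Uses nearest-rank percentiles on the sorted data; input is hypothetical.
function percentileSummary(deltas, pcts = [0, 10, 25, 50, 75, 90, 100]) {
  const sorted = [...deltas].sort((a, b) => a - b);
  const out = {};
  for (const p of pcts) {
    const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length));
    out[`p${p}`] = sorted[idx];
  }
  return out;
}
```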
SGTM. Let me know if you'd like help with any of these queries.
Also note that the
…vs LH6 scores comparison
Thanks @rviscomi, it all makes perfect sense to me now. I've updated the queries as you explained above. Please have a look, and if it's all good we can start filling in the spreadsheets.
Thanks everyone, we're very close. I just have some final feedback then we should be good to go.
In addition to these suggestions I'd also like to see one more query based on @bazzadp's PWA audits query that counts the percent of non-zero scores for each LH audit in the category:
#standardSQL
# Get summary of all lighthouse scores for a category
# Note: scores, weightings, groups and descriptions may be off in mixed months when a new version of Lighthouse rolls out
CREATE TEMPORARY FUNCTION getAudits(report STRING, category STRING)
RETURNS ARRAY<STRUCT<id STRING, weight INT64, audit_group STRING, title STRING, description STRING, score INT64>> LANGUAGE js AS '''
var $ = JSON.parse(report);
var auditrefs = $.categories[category].auditRefs;
var audits = $.audits;
$ = null;
var results = [];
for (const auditref of auditrefs) {
  results.push({
    id: auditref.id,
    weight: auditref.weight,
    audit_group: auditref.group,
    title: audits[auditref.id].title,
    description: audits[auditref.id].description,
    score: audits[auditref.id].score
  });
}
return results;
''';
SELECT
audits.id AS id,
COUNTIF(audits.score > 0) AS num_pages,
COUNT(0) AS total,
COUNTIF(audits.score > 0) / COUNT(0) AS pct,
APPROX_QUANTILES(audits.weight, 100)[OFFSET(50)] AS median_weight,
MAX(audits.audit_group) AS audit_group,
MAX(audits.description) AS description
FROM
`httparchive.lighthouse.2020_08_01_mobile`,
UNNEST(getAudits(report, "performance")) AS audits
WHERE
LENGTH(report) < 20000000 # necessary to avoid out of memory issues. Excludes 16 very large results
GROUP BY
audits.id
ORDER BY
median_weight DESC,
id
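Before running the full BigQuery job, the JS UDF logic can be sanity-checked locally in Node on a hand-made mock Lighthouse report (the mock data below is hypothetical, for illustration only):

```javascript
// Standalone copy of the getAudits UDF logic, runnable in Node.
function getAudits(report, category) {
  const $ = JSON.parse(report);
  const auditrefs = $.categories[category].auditRefs;
  const audits = $.audits;
  const results = [];
  for (const auditref of auditrefs) {
    results.push({
      id: auditref.id,
      weight: auditref.weight,
      audit_group: auditref.group,
      title: audits[auditref.id].title,
      description: audits[auditref.id].description,
      score: audits[auditref.id].score
    });
  }
  return results;
}

// Hypothetical mock report, not real HTTP Archive data.
const mockReport = JSON.stringify({
  categories: {
    performance: {
      auditRefs: [{ id: 'first-contentful-paint', weight: 15, group: 'metrics' }]
    }
  },
  audits: {
    'first-contentful-paint': {
      title: 'First Contentful Paint',
      description: 'Marks the time at which the first text or image is painted.',
      score: 1
    }
  }
});
```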
Co-authored-by: Barry Pollard <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
A few outstanding review comments remain; otherwise this is ready to merge and we can unblock the first draft. @dooman87 @max-ostapenko would either of you be able to resolve the last of the feedback before the weekend?
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
Co-authored-by: Rick Viscomi <[email protected]>
@max-ostapenko Thanks a lot for fixing my stuff. I've just managed to find some time to look through the changes. It all looks good to me, so let's get them in!
@dooman87 Yeah, I'm adding data into the charts now. Looking forward to seeing how the results look with the new data.
Ship it!
Progress on #905
- distribution of LH performance category scores
- LH audits
- LCP
- FID
- CLS
- FCP
- TTFB
- Field Data
- new PerformanceObserver in JS (source)