Jamstack 2020 queries #1228

Merged
merged 20 commits into from
Oct 12, 2020

Conversation

denar90
Member

@denar90 denar90 commented Aug 18, 2020

Queries for the Jamstack 2020 chapter (#878)

  • Adoption of image formats in SSGs
  • Core Web Vitals distribution of top CDN-hosted SSGs
  • Core Web Vitals of CDN-hosted SSGs
  • Distribution of page weight, requests, and CO2 grams per SSG web page
  • Lighthouse category scores per SSG
  • SSG adoption and top SSGs YoY
  • Core Web Vitals distribution by SSG
  • Core Web Vitals performance by SSG
  • Third party bytes and requests on SSGs

Spreadsheet

@denar90 denar90 added the analysis Querying the dataset label Aug 18, 2020
@denar90 denar90 mentioned this pull request Aug 30, 2020
10 tasks
@foxdavidj foxdavidj added this to the 2020 Analysis milestone Aug 31, 2020
@ahmadawais
Member

Hi @denar90, it doesn't look like the query was able to pick up any Next.js websites?

@denar90
Member Author

denar90 commented Sep 3, 2020

@ahmadawais, yup, we didn't include Next.js, based on the research in #878 (comment)

@ahmadawais
Member

That comment link doesn't lead anywhere in the mobile app. I'll check later. But without Next.js the chapter would be incomplete. It's the biggest Jamstack framework.

@denar90
Member Author

denar90 commented Sep 3, 2020

Totally agree. On the other hand, we could give readers wrong data, because people use Next.js in different ways (SSR, or just as an SPA). Without a way to distinguish those, the data will be skewed. Maybe we can send a "message" to the Next.js team so they provide some indication of how Next.js is being used. Then next year we can include it and be 100% confident in the data.

@ahmadawais
Member

I believe that Next is Jamstack. JAM means JavaScript, APIs, and Markup. It does NOT mean static. SPAs and SSR apps built with Next are different. Next statically optimizes all types of pages: CSR, SSR, and SSG. They advertise this as well.

These are hybrid apps. Almost every other web app using Gatsby or the likes also has SSR stuff or completely dynamic pages like contact forms and whatnot.

Next belongs to Jamstack. The way you implement SPA with Next.js, the git push, and deploy is also part of the Jamstack philosophy. We might end up with disclaimers — but I plan to mostly write about this as Jamstack and beyond (similar stuff) — since the line is not as clear.

I don't feel comfortable dropping Next.js. I ask you to reconsider and add the statistics for Next.js sites as well.

@remotesynth

It's complicated. I do think Next.js popularity makes it a tough decision to ignore it. I also think Next.js has traditionally been more popular as an SSR solution than a true Jamstack solution. I disagree with @ahmadawais in that it most definitely does mean static - as in static aka pre-rendered assets though not necessarily a static app.

This is why emphasizing the JAM acronym in Jamstack continues to frustrate me: it can lead to confusion like this. An SSR app is not Jamstack, even if the framework is capable of Jamstack. There are hybrid apps using Next.js which blur the lines a bit, but if an SSR app that uses JavaScript on the frontend were the primary criterion, nearly every web app could claim to be Jamstack and the whole thing becomes meaningless.

So, all that being said, I think we have two options:

  1. Include Next.js with the caveat that it may include a significant number of sites that are not truly Jamstack.
  2. Exclude Next.js with the explanation that its historical popularity as an SSR solution would muddy the waters too much.

I am honestly fine with either but a worthwhile exercise might be to run a query with Next.js and see if it looks like it would greatly influence the results. How many are there? Do they seem to have differing profiles than the other data based upon a cursory review? My opinion would be, if the results would be substantively different with Next.js, it is probably because it includes a number of irrelevant sites.

@denar90
Member Author

denar90 commented Sep 6, 2020

@ahmadawais @remotesynth I'm fine having some disclaimer/explanation about Next.js.

I totally reworked the queries. They now include all SSGs (+ Next.js). We also now have

  • core_web_vitals_distribution.sql
  • core_web_vitals_passing.sql
  • third_party_bytes_and_requests_on_ssgs.sql

I've also started working on the GSheet and am planning to have queries similar to #1087.

@rviscomi
Member

Be sure to update this from "Draft" to "Ready for review" so we can get more eyes on it

@ahmadawais
Member

@denar90 we are getting close to the deadline. Is it ready from your side to be reviewed?

client,
CDN,
COUNT(DISTINCT origin) AS origins,
SUM(fast_lcp) / (SUM(fast_lcp) + SUM(avg_lcp) + SUM(slow_lcp)) AS good_lcp,
Member

@rviscomi rviscomi Sep 19, 2020

(Throughout) It might not matter but this is simpler and consistent with other CWV queries

Suggested change
- SUM(fast_lcp) / (SUM(fast_lcp) + SUM(avg_lcp) + SUM(slow_lcp)) AS good_lcp,
+ SUM(fast_lcp) / SUM(fast_lcp + avg_lcp + slow_lcp) AS good_lcp,
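A quick way to sanity-check that the two forms agree is to run both on a toy table. The sketch below uses Python's built-in sqlite3 as a stand-in for BigQuery; the table name `crux` and the rows are made up for illustration. One caveat worth keeping in mind: the forms only match when the three LCP columns are NULL together or not at all, because `SUM(a + b + c)` skips a whole row if any one column is NULL, while the separate `SUM`s skip only the NULL cell.

```python
import sqlite3

def good_lcp_both_ways(rows):
    """Return (original_form, suggested_form) for the good_lcp ratio."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the per-origin CrUX summary columns.
    conn.execute(
        "CREATE TABLE crux (origin TEXT, fast_lcp REAL, avg_lcp REAL, slow_lcp REAL)"
    )
    conn.executemany("INSERT INTO crux VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          SUM(fast_lcp) / (SUM(fast_lcp) + SUM(avg_lcp) + SUM(slow_lcp)),
          SUM(fast_lcp) / SUM(fast_lcp + avg_lcp + slow_lcp)
        FROM crux
        """
    ).fetchone()
    conn.close()
    return result

# Toy data: two origins with LCP data, one origin with none at all.
toy_rows = [
    ("https://a.example", 0.6, 0.3, 0.1),
    ("https://b.example", 0.2, 0.5, 0.3),
    ("https://c.example", None, None, None),
]
original, suggested = good_lcp_both_ways(toy_rows)
```

On this toy data both expressions come out the same, which is the case the suggestion relies on.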

Member

This comment applies throughout the query

JOIN (
SELECT
CASE
WHEN REGEXP_EXTRACT(LOWER(CONCAT(respOtherHeaders, resp_x_powered_by, resp_via, resp_server)), '(x-github-request)') = 'x-github-request' THEN 'GitHub'
Member

(Throughout)

Suggested change
- WHEN REGEXP_EXTRACT(LOWER(CONCAT(respOtherHeaders, resp_x_powered_by, resp_via, resp_server)), '(x-github-request)') = 'x-github-request' THEN 'GitHub'
+ WHEN REGEXP_CONTAINS(CONCAT(respOtherHeaders, resp_x_powered_by, resp_via, resp_server), '(?i)x-github-request') THEN 'GitHub'
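The two conditions are meant to be equivalent: lowercasing and then extracting a literal does the same job as a case-insensitive "contains" check. A minimal sketch of that equivalence, using Python's `re` module as a stand-in (BigQuery uses RE2, so this only illustrates the logic, and the function names and sample header strings are made up):

```python
import re

def matches_original(headers: str) -> bool:
    # Mirrors REGEXP_EXTRACT(LOWER(...), '(x-github-request)') = 'x-github-request'
    match = re.search(r"(x-github-request)", headers.lower())
    return match is not None and match.group(1) == "x-github-request"

def matches_suggested(headers: str) -> bool:
    # Mirrors REGEXP_CONTAINS(..., '(?i)x-github-request')
    return re.search(r"(?i)x-github-request", headers) is not None

# Hypothetical concatenated header strings.
samples = [
    "X-GitHub-Request-Id = ABCD:1234, server = GitHub.com",
    "x-github-request-id = abcd:1234",
    "server = nginx, via = 1.1 varnish",
]
agreement = [matches_original(s) == matches_suggested(s) for s in samples]
```

The suggested form avoids the extra `LOWER` call and the string comparison, and reads as a plain boolean test.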

Member

This comment applies throughout the query

Comment on lines 38 to 39
FROM
`httparchive.summary_requests.2020_08_01_*`
Member

(Throughout) Do you also need a WHERE firstHtml clause here so that you're only looking at requests for pages that are served with these headers, not any misc 3P requests?
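To illustrate the concern, here is a small sketch assuming a toy `requests` table with made-up rows (sqlite3 stands in for BigQuery, and `firstHtml` is stored as 0/1 rather than a BOOL). Without the filter, a page gets attributed to a CDN just because one of its third-party requests was served with that CDN's headers.

```python
import sqlite3

def pages_attributed_to_netlify(only_first_html: bool):
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily simplified version of a summary_requests table.
    conn.execute(
        "CREATE TABLE requests (page TEXT, url TEXT, firstHtml INTEGER, resp_server TEXT)"
    )
    conn.executemany(
        "INSERT INTO requests VALUES (?, ?, ?, ?)",
        [
            # Main document served by nginx, but it embeds a Netlify-hosted script.
            ("https://site.example/", "https://site.example/", 1, "nginx"),
            ("https://site.example/", "https://widget.example/embed.js", 0, "Netlify"),
            # Main document actually served by Netlify.
            ("https://blog.example/", "https://blog.example/", 1, "Netlify"),
        ],
    )
    query = "SELECT DISTINCT page FROM requests WHERE LOWER(resp_server) LIKE '%netlify%'"
    if only_first_html:
        query += " AND firstHtml = 1"
    query += " ORDER BY page"
    pages = [page for (page,) in conn.execute(query)]
    conn.close()
    return pages

unfiltered = pages_attributed_to_netlify(only_first_html=False)
filtered = pages_attributed_to_netlify(only_first_html=True)
```

With the filter, only the page whose own HTML document carries the header is counted; without it, the page that merely loads a third-party script from that CDN is counted too.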

Member Author

👍

Member

Nit: can you take a look at the whitespace related to this change? firstHtml is appearing misaligned with the rest of the query.

Member Author

Should be aligned

@rviscomi rviscomi marked this pull request as ready for review September 19, 2020 23:43
@rviscomi rviscomi requested a review from a team September 19, 2020 23:43
@denar90
Member Author

denar90 commented Sep 22, 2020

@rviscomi can you have one more look?

@@ -0,0 +1,65 @@
#standardSQL
# Core Web Vitals distribution by SSG
Member

Is this identical to cdn_core_web_vitals_distribution.sql?

Member Author

yup, I merged them into 1 query.

@@ -0,0 +1,85 @@
#standardSQL
# Core Web Vitals performance by CMS
Member

Is this the same as cdn_core_web_vitals_passing.sql? I see that it's querying 2020_07, but not sure if that's intentional.

Member Author

Should be 2020_08, thanks

sql/2020/17_JAMstack/core_web_vitals_distribution.sql (outdated, resolved)
sql/2020/17_JAMstack/core_web_vitals_passing.sql (outdated, resolved)
sql/2020/17_JAMstack/median_lighthouse_score.sql (outdated, resolved)
Comment on lines +50 to +51
LOWER(category) = "static site generator" OR
app = "Next.js"
Member

Aside: should Next.js be categorized as SSG? If so could you open an issue on the Wappalyzer repo? https://github.com/AliasIO/wappalyzer/

Member Author

For now it's hard to say. I included it based on our last discussion, #1228 (comment).
I expect some reaction from the Next.js team after the chapter is published, so they can add some markers that will help Wappalyzer detect when Next.js is used as an SSG.
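For reference, the effect of the `category OR app` condition can be sketched on toy data. Everything here is hypothetical: the table name, the rows, and in particular the category assigned to Next.js, which is just a placeholder for "not the SSG category". The parentheses are a defensive choice: `AND` binds tighter than `OR` in SQL, so any further `AND` condition added next to an unparenthesized `OR` would apply to only one side of it.

```python
import sqlite3

def apps_matching_ssg_filter():
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for a table of technology detections.
    conn.execute("CREATE TABLE technologies (url TEXT, category TEXT, app TEXT)")
    conn.executemany(
        "INSERT INTO technologies VALUES (?, ?, ?)",
        [
            ("https://a.example/", "Static Site Generator", "Gatsby"),
            ("https://b.example/", "Static site generator", "Hugo"),
            ("https://c.example/", "Web frameworks", "Next.js"),  # placeholder category
            ("https://d.example/", "CMS", "WordPress"),
        ],
    )
    rows = conn.execute(
        """
        SELECT app
        FROM technologies
        WHERE (LOWER(category) = 'static site generator' OR app = 'Next.js')
        ORDER BY app
        """
    ).fetchall()
    conn.close()
    return [app for (app,) in rows]

matched = apps_matching_ssg_filter()
```

If Next.js were ever recategorized as an SSG upstream, the explicit `app = 'Next.js'` branch would become redundant but harmless.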

@denar90
Member Author

denar90 commented Oct 12, 2020

@rviscomi can we merge it, or do I have to change something more?

@rviscomi
Member

Ship it!

@rviscomi rviscomi merged commit 2815281 into main Oct 12, 2020
@rviscomi rviscomi deleted the jamstack-sql-2020 branch October 12, 2020 16:08
@denar90 denar90 mentioned this pull request Sep 28, 2021
10 tasks
5 participants