-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Surface manifest and service worker URLs in JSON results #9683
Comments
There's If the artifacts could be added to HTTPArchive, there is the
and the
|
check out The manifest should even be that doesn't include the service worker URL, though, since the page itself isn't doing the network request for the file. #709 would/could do that, or we could put the service worker URL in the details of the |
I was too slow :)
probably won't happen, though, since the artifacts are large and growing without bound these days (and httparchive kind of already has its own artifacts it collects) |
Aha! This sounds really promising. (Edit: removing the part about the resource URLs, there was a bug in my lookup script :)) Would the artifacts be the only way to get the resource URLs? We already do some pruning of the LH response for size control, so it's feasible for us to only grab what we need. |
ha, just saw your edit. So you do see them in there? The only resource URL that would be missing should be the service worker (at least as far as I know the page would have to be doing something weird for it to show up in its own network requests). |
Yes, I fixed the bug and now I'm seeing the manifest URLs. 😁 As expected, I'm not seeing any #standardSQL
CREATE TEMP FUNCTION getUrls(report STRING)
RETURNS STRUCT<manifest STRING, sw STRING> LANGUAGE js AS '''
var $ = JSON.parse(report);
return $.audits['network-requests'].details.items.reduce((urls, item) => {
if (item.resourceType == 'Manifest') {
urls.manifest = item.url;
} else if (item.resourceType == 'ServiceWorker') {
urls.sw = item.url;
}
return urls;
}, {});
''';
SELECT
COUNTIF(urls.manifest IS NULL) AS null_manifests,
COUNTIF(urls.sw IS NULL) AS null_sq,
COUNT(0) AS total,
COUNTIF(urls.manifest IS NULL) / COUNT(0) AS pct_null_manifests,
COUNTIF(urls.sw IS NULL) / COUNT(0) AS pct_null_sw
FROM (
SELECT
getUrls(report) AS urls
FROM
`httparchive.lighthouse.2019_07_01_mobile`
WHERE
JSON_EXTRACT_SCALAR(report, '$.audits.installable-manifest.score') = '1')
Any ideas why that might be? |
Maybe try one of the problem URLs in Chrome and check out the network panel in DevTools? The |
that's actually another one we could manually annotate (this time on the So maybe we should fix this issue as originally requested after all :) |
@rviscomi just confirming this is still a valuable feature for you / HTTPArchive ? |
Yes, this is still useful for us! |
OK, this was a terribly confusing conversation to leave for posterity if the goal was to communicate what actually needs to be done :) but I think I've got a handle on it below. Probably best to assess/fix these two separately as they're not really related. Service worker URLWhen a page has a ServiceWorker, the browser requests the SW script, not the page, so it isn't in our Possible fix: the Web app manifest URLWhen a page has a web app manifest, it should be loaded by the page (caused by lighthouse itself, I believe?) and so it will show up in Possible fix: Assuming that 9% is just misclassified resource types, should we find the manifest URL in the One issue is that |
Our fetch of the manifest does not happen while we're recording devtools logs though, so I would assume this happens somehow as a result of us loading the page and there being a |
When LH discovers a page with a service worker or a manifest file, it would be hugely helpful for downstream services like HTTP Archive if those file URLs were included in the JSON results somehow. Otherwise we end up doing a lot of questionable regex parsing of the HTML.
plz i can has urls?
The text was updated successfully, but these errors were encountered: