Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🏗Add web performance measurement gulp task #26026

Merged
merged 1 commit into from
Jan 24, 2020

Conversation

kevinkimball
Copy link
Contributor

⚠️ Opening this WIP PR to get feedback on the approach

Add a performance measurement task to help contributors understand the performance impact of their changes. This is a step towards adding a performance check to CI (see #12128).

When run, downloads example AMP documents and loads them using Puppeteer in order to measure metrics that capture the user perceived speed of an AMP site. Compares the working branch against the current production release. Compiles and minifies the working branch, and downloads the production release to avoid measuring network latency. Prints a report to the console. Margin of error is around ~5% when running 100 iterations, which takes about 10-15 minutes (as tested by running the same document against itself).

Example report:

Screen Shot 2019-12-13 at 12 50 59 PM

Reference for metrics:

Screen Shot 2019-12-13 at 12 40 21 PM

Copy link
Member

@danielrozenberg danielrozenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it to Kris to discuss the approach, it looks fine to me. Nits and other comments :)

build-system/tasks/lint.js Show resolved Hide resolved

Use `gulp performance-test` to run this test. It measures performance for the current branch compared to the current release. By default, the command first needs to compile minified runtimes from the current branch by executing the `gulp dist` task. To skip this step, use the `--no-compile` flag, but the task will throw an error if the dist files are missing. Only the runtime and components used by documents from the configured URLs are compiled.

The `config.json` file contains settings. If you set `headless`, you will be able to watch the test run in Chrome. It is not recommended to change `concurrency` (as it does not speed up the test). The default value for `runs` is 100, which takes ~10 minutes to run, and generates results with ~5% margin of error. Differences larger than this should be investigated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like setting headless means you won't be able to watch the test run, right?

"runs": 100,
"concurrency": 1,
"urls": [
"https://amp.cnn.com/cnn/2019/11/19/weather/tropical-storm-sebastien/index.html"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be some issues using real URLs without getting permission first; better talk to your friendly neighbourhood PM to get this conversation started :)

path.join(EXPERIMENT_CACHE_PATH, 'v0'),
].forEach(dirPath => {
try {
fs.mkdirSync(dirPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try-catch programming is more of a Python thing than JS. You can do fs.mkdirSync(dirPath, {recursive: true}) which doesn't throw an error if the directory exists, or wrap it with if (!fs.existsSync(dirPath))

build-system/tasks/performance-test/measure-documents.js Outdated Show resolved Hide resolved

try {
await page.goto(`file:${urlToCachePath(url, version)}`, {
waitUntil: 'networkidle0',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we had some issues with networkidle0, sometimes websites keep an open connection (iframed ads tend to do that) for longer than the default timeout. networkidle2 worked as a compromise for us with visual diffs. If you get lots of errors here, try that.

Might be worth adding logging statements in the catch block to see how often that happens

package.json Outdated Show resolved Hide resolved
window.measureStarted = Date.now();
window.largestContentfulPaint = 0;

const longTaskObserver = new PerformanceObserver(list =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumb question on my part, but is not possible (or maybe its more difficult?) to set this all up with a single observer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible, but then instead of three callbacks that do one thing, there would be one callback that does three things (with a switch or conditional). This seemed cleaner but I'm open to be persuaded otherwise!

Copy link
Contributor

@rsimha rsimha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really excited to see performance tests! Thanks for working on them. This is a promising start.

I've made a few preliminary comments below. Happy to review again, and discuss alternatives to things I've pointed out over chat / VC.

.gitignore Outdated Show resolved Hide resolved
build-system/tasks/lint.js Show resolved Hide resolved
gulpfile.js Outdated Show resolved Hide resolved
build-system/tasks/performance-test/index.js Outdated Show resolved Hide resolved
build-system/tasks/extension-helpers.js Outdated Show resolved Hide resolved
build-system/tasks/performance-test/compile-scripts.js Outdated Show resolved Hide resolved
Copy link
Contributor

@rsimha rsimha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mistakenly approved this earlier today. Changing status to "request changes" until this is further reviewed. Sorry!

Copy link
Contributor

@kristoferbaxter kristoferbaxter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor feedback. Overall looks like a great start for us to experiment with!

build-system/tasks/lint.js Show resolved Hide resolved

Use `gulp performance` to run this test. It measures performance for the current branch compared to the current release. By default, the command first needs to compile minified runtimes from the current branch by executing the `gulp dist` task. To skip this step, use the `--nobuild` flag, but the task will throw an error if the dist files are missing. Only the runtime and components used by documents from the configured URLs are compiled.

The `config.json` file contains settings. If you set `headless` to `false`, you will be able to watch the test run in Chrome. The default value for `runs` is 100, which takes ~10 minutes to run, and generates results with ~5% margin of error. Differences larger than this should be investigated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include the number of runs required to get different levels of confidence in a graph?

{
"headless": true,
"runs": 100,
"urls": ["https://www.nbcnews.com/news/amp/ncna1103401"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave this field empty (with instructions to add a URL) or use a default value of amp.dev?

function downloadToDisk(url, version = CONTROL) {
touchDirs();

return fetch(url)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Moving to async/await would be nice for this code. Including other sync operations using fs.promises.

const firstPaint = getMetric('first-paint');
const firstContentfulPaint = getMetric('first-contentful-paint');

function getMaxFirstInputDelay() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we opened a ticket with Puppeteer to get this information directly like fcp etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if that's a Puppeteer issue? It's not in Chrome yet, so maybe it's more of a Chrome issue, but I think that is already underway.

headless,
args: [
'--allow-file-access-from-files',
'--enable-blink-features=LayoutInstabilityAPI',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Didn't know about this.


// Excecute the tasks serially
const [first, ...rest] = tasks;
await rest.reduce((prev, task) => prev.then(task), first());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to execute serially?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got more consistent results from executing serially, and on my machine saw overall execution time increased with concurrency

"private": true,
"name": "performance",
"version": "0.1.0",
"description": "Gulp performance",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: A bit more of a description would be nice!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I just copied the one in the e2e/package.json file here. I'm not really sure how this works with nested packages, is that an idiom or does it have an actual effect? What kind of desc would you expect to see?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no actual effect since this is a private package. I think it's sufficient to make the description something like "Dev-dependencies used by gulp performance".

The original sources are here, feel free to modify them as well if you like:

{
"private": true,
"name": "gulp-visual-diff",
"version": "0.1.0",
"description": "Gulp visual diff",
"devDependencies": {
"@percy/agent": "0.20.6",
"@percy/puppeteer": "1.0.8",
"puppeteer": "2.0.0"
}
}

{
"private": true,
"name": "gulp-e2e",
"version": "0.1.0",
"description": "Gulp e2e",
"devDependencies": {
"@babel/register": "7.7.7",
"babel-regenerator-runtime": "6.5.0",
"chromedriver": "79.0.0",
"puppeteer": "2.0.0",
"geckodriver": "1.19.1",
"selenium-webdriver": "4.0.0-alpha.4"
}
}

experiment.toString().padEnd(12),
percent(control, experiment),
].join(' | '),
`\n${''.padEnd(68, '-')}\n`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do these magic constants mean?

@kevinkimball kevinkimball force-pushed the perf-test branch 2 times, most recently from 5f482a3 to 800d533 Compare December 19, 2019 19:14
Copy link
Contributor

@rsimha rsimha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! LGTM with a few minor comments. Pre-approving to unblock you. If you'd like, I'm happy to take another look before you merge.

/**
* @return {!Promise}
*/
async function performance() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add code here to automatically run yarn in the build-system/tasks/performance directory. E2E tests do it like this:

function installPackages_() {
log('Running', cyan('yarn'), 'to install packages...');
execOrDie('npx yarn --cwd build-system/tasks/e2e', {'stdio': 'ignore'});
}

async function e2e() {
// install e2e-specific modules
installPackages_();

@@ -0,0 +1,7 @@
# Performance test

This test has its own dependencies that must be installed by running `yarn` in this directory before proceeding.
Copy link
Contributor

@rsimha rsimha Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done automatically. See comment above (in build-system/tasks/performance/index.js).


This test has its own dependencies that must be installed by running `yarn` in this directory before proceeding.

Use `gulp performance` to run this test. It measures performance for the current branch compared to the current release. By default, the command first needs to compile minified runtimes from the current branch by executing the `gulp dist` task. To skip this step, use the `--nobuild` flag, but the task will throw an error if the dist files are missing. Only the runtime and components used by documents from the configured URLs are compiled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's more accurate to say "compile the minified runtime" than "compile minified runtimes".


const scripts = Array.from(dom.window.document.querySelectorAll('script'));
for (const script of scripts) {
if (script.src.startsWith('https://cdn.ampproject.org')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of hard-coding the cdn url, does it make sense to use urls.cdn or urls.cdnProxyRegex from here so that this code is immune to future changes in the proxy URL?

amphtml/src/config.js

Lines 37 to 46 in 7f870b9

export const urls = {
thirdParty: env['thirdPartyUrl'] || 'https://3p.ampproject.net',
thirdPartyFrameHost: env['thirdPartyFrameHost'] || 'ampproject.net',
thirdPartyFrameRegex: thirdPartyFrameRegex || /^d-\d+\.ampproject\.net$/,
cdn: env['cdnUrl'] || 'https://cdn.ampproject.org',
/* Note that cdnProxyRegex is only ever checked against origins
* (proto://host[:port]) so does not need to consider path
*/
cdnProxyRegex:
cdnProxyRegex || /^https:\/\/([a-zA-Z0-9_-]+\.)?cdn\.ampproject\.org$/,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't find a way to reuse this const because src is using es6 modules and build-system is using node require. However moved to a const.


const scripts = Array.from(dom.window.document.querySelectorAll('script'));
for (const script of scripts) {
if (script.src.startsWith('https://cdn.ampproject.org')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

@@ -0,0 +1,9 @@
{
"private": true,
"name": "performance",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: For consistency with other tasks, name this gulp-performance.

"private": true,
"name": "performance",
"version": "0.1.0",
"description": "Gulp performance",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no actual effect since this is a private package. I think it's sufficient to make the description something like "Dev-dependencies used by gulp performance".

The original sources are here, feel free to modify them as well if you like:

{
"private": true,
"name": "gulp-visual-diff",
"version": "0.1.0",
"description": "Gulp visual diff",
"devDependencies": {
"@percy/agent": "0.20.6",
"@percy/puppeteer": "1.0.8",
"puppeteer": "2.0.0"
}
}

{
"private": true,
"name": "gulp-e2e",
"version": "0.1.0",
"description": "Gulp e2e",
"devDependencies": {
"@babel/register": "7.7.7",
"babel-regenerator-runtime": "6.5.0",
"chromedriver": "79.0.0",
"puppeteer": "2.0.0",
"geckodriver": "1.19.1",
"selenium-webdriver": "4.0.0-alpha.4"
}
}

*/

const fs = require('fs');
const puppeteer = require('puppeteer');
Copy link
Contributor

@rsimha rsimha Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think puppeteer will have to be lazy-required like the visual-diff tests do, in order to prevent the top-level gulp task from looking for it.

function installPercy_() {
if (!argv.noyarn) {
log('info', 'Running', colors.cyan('yarn'), 'to install Percy...');
execOrDie('npx yarn --cwd build-system/tasks/visual-diff', {
'stdio': 'ignore',
});
}
puppeteer = require('puppeteer');
percySnapshot = require('@percy/puppeteer').percySnapshot;
}

*/
async function performance() {
installPackages_();
const {headless, runs, urls} = new loadConfig();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a message here for when no urls are specified or the config file is invalid? Currently with no URLs the error provided is:

[11:04:35] TypeError: getExtensionsFromArg is not a function or its return value is not iterable
    at setExtensionsToBuildFromDocuments (/Users/gharbiw/Development/amphtml/build-system/tasks/extension-helpers.js:165:8)
    at compileScripts (/Users/gharbiw/Development/amphtml/build-system/tasks/performance/compile-scripts.js:30:5)
    at performance (/Users/gharbiw/Development/amphtml/build-system/tasks/performance/index.js:41:9)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)

@wassgha
Copy link
Contributor

wassgha commented Jan 7, 2020

A few more observations I had while using this for amp-carousel:

  • Towards ~100 websites, I started getting out of memory errors (FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory)
  • When fetching a website fails, the entire task fails (FetchError: request to https://amp.rbc.ru/rbcnews/society/08/12/2015/56bc8a6e9a7947299f72b7d3 failed, reason: read ECONNRESET)
  • A lot of websites with DDOS protection are flagging the tool and returning captcha pages instead, however I only discovered this by turning off headless mode (I assume those pages could affect the score). Not sure if there's anything we can do here but something to keep in mind.

@kevinkimball
Copy link
Contributor Author

@wassgha Thanks for helping to test.

That's a valid observation about large # of URLs, but since it's run serially against multiple URLs I'm not sure if it's worth fixing in this PR vs. just running the task multiple times.

With DDOS, not sure what we can do but we could try to change the UA. It would depend on what sort of anti-scraping measures the site is taking.

@kevinkimball kevinkimball force-pushed the perf-test branch 2 times, most recently from 9034443 to 6423e15 Compare January 10, 2020 21:24
largestContentfulPaint: window.largestContentfulPaint,
timeToInteractive: getTimeToInteractive(),
maxFirstInputDelay: getMaxFirstInputDelay(),
cumulativeLayoutShift: window.cumulativeLayoutShift,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a percentage with value from 0 to 1 so let's multiply it by 100 otherwise it'll always get rounded up to 0 in the printed table

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The layout shift (LS) score is equal to the impact fraction multiplied by the distance fraction" per entry, and we sum all the entries together here. I think you are correct that typically this is going to be a low number, although not necessarily between 0-1. Will multiply by 100 for now although not 100% sure that's the easiest solution

.gitignore Outdated
@@ -31,3 +31,6 @@ firebase
firebase.json
.firebase/
src/purifier/dist
extensions/amp-analytics/0.1/vendors/*.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed in 2f76bee

@kevinkimball kevinkimball deleted the perf-test branch January 24, 2020 00:41
robinvanopstal added a commit to jungvonmatt/amphtml that referenced this pull request Jan 27, 2020
* master: (62 commits)
  📦 Update dependency fetch-mock to v8.3.2 (ampproject#26491)
  Revert 'Move mutator implementations out to a standalone service' (ampproject#26479)
  Fix syntax error (ampproject#26481)
  Add pespective back. (ampproject#26486)
  More user friendly errors in layout.js (ampproject#26448)
  ✨ Start logging AMP URL on SwG Pages (ampproject#26480)
  Fix border around desktop amp-story-pages. (ampproject#26449)
  Fix Story tests. (ampproject#26464)
  ✨ Performance Measurement Chrome Extension (ampproject#26333)
  amp-consent restrict iframe fullScreen if no focus  (ampproject#26461)
  Add performance benchmark task (ampproject#26026)
  ♻️ amp-script: emit warning if zero height and width. (ampproject#26444)
  ✨ Launch minimal-wrapper native CEv1 (ampproject#26360)
  ♻️ Lint: include externs (round 2) (ampproject#26446)
  amp-script: Create 'fill content' container for responsive/fluid (ampproject#26400)
  amp-consent remove cmp iframe focus (ampproject#26437)
  Disable macro-after-long-task in inabox. (ampproject#26459)
  Launch layoutbox-invalidate-on-scroll (ampproject#26430)
  Add amp-ad support for ByPlay (ampproject#25663)
  🏗 Add specific RTV opt-in to experiments.html (ampproject#26434)
  ...
@kevinkimball kevinkimball changed the title 🏗Add web performance measurement gulp task (WIP) 🏗Add web performance measurement gulp task Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants