Performance & diagnostics best practices - call for ideas #256

goldbergyoni · 2018-09-26T15:34:15Z

We're kicking off the performance best practices section. Any idea regarding performance best practices or great bibliography to extract ideas from (youtube, blog post, etc)?

This issue will summarize all the ideas and bibliography and serve as a TOC for the writers. This is also a call for writers, or for anyone who would like to learn by practicing, discussing and writing.

@sagirk @BrunoScheufler @js-kyle

goldbergyoni · 2018-09-26T15:41:03Z

Title: Monitor customer-facing metrics first
Gist: Going by Google's SRE book, monitoring should focus on metrics that immediately impact the customer, in practice the golden signals: API Latency, Traffic, Errors, and Saturation. Also relevant are the RED framework and the USE method.

Anything else you monitor (e.g. event loop) is a cause and not a symptom. Quote from My Philosophy on Alerting:

"But," you might say, "I know database servers that are unreachable results in user data unavailability." That's fine. Alert on the data unavailability. Alert on the symptom: the 500, the Oops!, the whitebox metric that indicates that not all servers were reached from the database's client. Why?

Examples: we will show how to achieve this in Node.js
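
As a starting point, here is a minimal sketch of tracking three of the golden signals (traffic, errors, latency) in process. The name `createRequestMetrics` and the shape of the snapshot are assumptions made for illustration, and saturation is not covered. A real setup would more likely export these numbers to a monitoring system.

```javascript
// Hypothetical in-process recorder for traffic, errors and latency.
function createRequestMetrics() {
  let requests = 0;
  let errors = 0;
  let totalMs = 0;

  return {
    // Call once per finished request.
    record(statusCode, durationMs) {
      requests += 1;
      if (statusCode >= 500) errors += 1; // count server-side failures only
      totalMs += durationMs;
    },
    // Aggregate view that could be exported to a monitoring system.
    snapshot() {
      return {
        requests,
        errors,
        errorRate: requests ? errors / requests : 0,
        avgLatencyMs: requests ? totalMs / requests : 0,
      };
    },
  };
}

const metrics = createRequestMetrics();
metrics.record(200, 10);
metrics.record(500, 30);
console.log(metrics.snapshot());
```

In an HTTP server, `record` could be called from a `res.on('finish', ...)` handler using `res.statusCode` and a timestamp taken when the request arrived.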

profnandaa · 2018-09-26T21:30:33Z

Perhaps debugging can be part of it? This is a good place - https://www.joyent.com/node-js/production/debug

js-kyle · 2018-09-27T06:18:34Z

The express documentation has good tips: https://expressjs.com/en/advanced/best-practice-performance.html

We may want to review the current production practices section, as they may overlap or be more appropriate as part of the new section, for example the bullet around serving assets from a CDN rather than Node.js

sagirk · 2018-09-27T06:21:05Z

Choose the classical for loop over forEach/ES6 for...of when dealing with huge amounts of data (e.g., 10/100 million+ items).

Reasons why:
https://medium.com/tech-tajawal/loops-performances-in-node-js-9fbccf2d6aa6

https://stackoverflow.com/questions/500504/why-is-using-for-in-with-array-iteration-a-bad-idea

https://stackoverflow.com/questions/43031988/javascript-efficiency-for-vs-foreach/43032526#43032526

https://stackoverflow.com/questions/3520688/javascript-loop-performance-why-is-to-decrement-the-iterator-toward-0-faster-t
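
Since the gap between loop styles depends on the Node.js version and the workload, it is worth measuring before rewriting loops. Below is a rough sketch of such a measurement. The helper names (`sumFor`, `sumForEach`, `timeMs`) are made up for this example, and a single timed run is only a crude indication, not a proper benchmark.

```javascript
// Classical for loop.
function sumFor(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Same computation with forEach.
function sumForEach(values) {
  let total = 0;
  values.forEach((value) => {
    total += value;
  });
  return total;
}

// Crude wall-clock timing of a single call, in milliseconds.
function timeMs(fn, arg) {
  const start = process.hrtime.bigint();
  fn(arg);
  return Number(process.hrtime.bigint() - start) / 1e6;
}

const data = Array.from({ length: 1000000 }, (_, i) => i);
console.log('for     :', timeMs(sumFor, data).toFixed(2), 'ms');
console.log('forEach :', timeMs(sumForEach, data).toFixed(2), 'ms');
```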

sagirk · 2018-09-27T06:21:28Z

Use a version of Node.js that ships with the newer TurboFan JIT compiler rather than only the older Crankshaft compiler.

Reasons why:
https://medium.com/the-node-js-collection/get-ready-a-new-v8-is-coming-node-js-performance-is-changing-46a63d6da4de

AbdelrahmanHafez · 2018-09-27T09:50:05Z

Deal with databases and external APIs in batches: a developer should prefer fetching 100 entities with a single HTTP request over making 100 HTTP requests for a single document each.

The same goes for database operations: writing and fetching data is faster when done in a batch rather than as multiple separate operations.
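
A minimal sketch of the batching idea, assuming the remote API or database offers some way to fetch many entities at once. Here `fetchMany` is a hypothetical function passed in by the caller, not a real library call, and the batch size of 100 is only an example.

```javascript
// Split a list into fixed-size batches.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// fetchMany(ids) is assumed to resolve to an array of entities for those ids.
// One call is issued per batch instead of one call per id.
async function fetchInBatches(ids, fetchMany, batchSize = 100) {
  const results = [];
  for (const batch of chunk(ids, batchSize)) {
    for (const entity of await fetchMany(batch)) {
      results.push(entity);
    }
  }
  return results;
}
```

With 250 ids and the default batch size, this issues 3 calls rather than 250.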

goldbergyoni · 2018-09-27T13:24:32Z

Copy pasting from the reddit discussion:

"I usually write to log/monitor on DB query start and query end so I can later identify avg query time and identify the slowest queries. Using Sequelize you can pass the flag {benchmark:true} and get the query times logged. Later you can bake/export to your monitoring system and create metrics based on it"

goldbergyoni · 2018-09-27T13:29:59Z

Use factories or constructors to ensure objects are created with the same schema, so that V8 won't have to generate new hidden classes at runtime (a.k.a. polymorphic vs. monomorphic functions)

https://medium.com/the-node-js-collection/get-ready-a-new-v8-is-coming-node-js-performance-is-changing-46a63d6da4de
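
A small sketch of what "the same schema" can look like in practice. Hidden classes are internal to V8 and cannot be inspected from plain JavaScript, so the example uses property order from `Object.keys` only as a visible stand-in for the object layout. `createUser` and `layout` are names invented for this illustration.

```javascript
// Factory: every user gets the same properties, assigned in the same order.
function createUser(name, age) {
  return { name, age, email: null };
}

const viaFactory = [createUser('Ann', 30), createUser('Bob', 41)];

// Ad-hoc creation: same kind of data, but different property order
// and a property added after construction.
const adHocA = { name: 'Ann', age: 30 };
const adHocB = { age: 41, name: 'Bob' };
adHocB.email = 'bob@example.com';

// Property order as a rough, observable proxy for the object's layout.
const layout = (obj) => Object.keys(obj).join(',');

console.log(viaFactory.map(layout)); // both entries share one layout
console.log([adHocA, adHocB].map(layout)); // two different layouts
```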

VinayaSathyanarayana · 2018-10-01T08:36:48Z

Analyze repeated API Calls/Database Queries to cache them.
Debounce and Throttle : https://css-tricks.com/debouncing-throttling-explained-examples/
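
For reference, a minimal throttle sketch: the wrapped function runs at most once per interval and extra calls are dropped. This is a simplified variant written for illustration, not the implementation from the linked article. The `now` parameter is an assumption added so the behaviour can be checked without real timers.

```javascript
// Run fn at most once per intervalMs; calls arriving too early are ignored.
function throttle(fn, intervalMs, now = Date.now) {
  let last = -Infinity; // time of the last accepted call
  return (...args) => {
    const current = now();
    if (current - last >= intervalMs) {
      last = current;
      return fn(...args);
    }
    return undefined; // dropped call
  };
}

// Usage with a fake clock so the result does not depend on timing.
let fakeTime = 0;
let hits = 0;
const limited = throttle(() => { hits += 1; }, 100, () => fakeTime);
limited();          // accepted
limited();          // dropped, same instant
fakeTime = 100;
limited();          // accepted, a full interval has passed
console.log(hits);
```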

goldbergyoni · 2018-10-02T05:30:59Z

When evaluating alternatives and you need to measure performance, use benchmark tooling like autocannon and Benchmark.js, which can provide more precise results (microtime, many iterations) and better benchmarking performance (e.g. issuing more calls per second)

TheHollidayInn · 2018-10-05T00:03:28Z

Analyzing and monitoring cpu bottlenecks or memory leaks with Node Clinic/Prometheus

TheHollidayInn · 2018-10-07T17:30:56Z

A few others:

Use gzip compression
Serve from a CDN
use a priority queue for high db usage/long running cpu processes
optimize queries (such as indexing)
parallelize operation where possible
use http2
use the cluster module

References:

barretojs · 2018-10-07T18:00:31Z

i think a good performance practice is to use a load balancer for requests and a load balancer to distribute the node.js app across the multiple cores of your cpu (because as we know node is single threaded). the latter can be achieved EXTREMELY easily using pm2, by setting one flag that does the multi-core load balancing automatically.
besides, pm2 makes it really easy to monitor the memory and cpu usage of your app, and it has a lot of other amazing features.

http://pm2.keymetrics.io/

https://medium.com/iquii/good-practices-for-high-performance-and-scalable-node-js-applications-part-1-3-bb06b6204197

goldbergyoni · 2018-10-07T21:34:11Z

@TheHollidayInn Great list!

Few remarks/questions:

Use gzip compression & serve from a CDN - these are more frontend-related tips. Are we covering frontend as well? We never dealt with this question; I don't know, and tend to think no. What do @sagirk, @BrunoScheufler and @js-kyle think?
HTTP2 - do we have evidence/reference that http2 reduces the load on the server?

goldbergyoni · 2018-10-07T21:44:22Z

@barretojs First, welcome on board, good to have you here. I'm not sure about this advice, as PM2 uses the cluster module under the hood (I think so at least, does it?), which seems to be slower than a real router like nginx or iptables

https://medium.com/@fermads/node-js-process-load-balancing-comparing-cluster-iptables-and-nginx-6746aaf38272

What do you think?

TheHollidayInn · 2018-10-07T21:54:07Z

@i0natan Np!

For gzip and CDN, I included this because they are decisions you'd make while building out web apps (like using Express). If you serve static content with express, using gzip will be helpful. Also, it might be good to just note the benefits of using a CDN rather than serving your own static content.
I'm not sure load is improved. Although Multiplexing may help. Here is a list of benefits: https://webapplog.com/http2-node. To me, it seems more helpful for front end, but I'll admit I'm still new to http2.

barretojs · 2018-10-07T22:00:37Z

@i0natan you're right. i wasn't aware of iptables' capability, and the numbers are impressive. i think there's no doubt that for pure performance it distributes the load way better. thanks for the great in-depth article!

goldbergyoni · 2018-10-07T22:14:03Z

/remind me in 2 days

reminders · 2018-10-07T22:14:06Z

@i0natan set a reminder for Oct 9th 2018

js-kyle · 2018-10-07T22:57:16Z

I'd definitely +1 using the CDN for static assets - we have this as a production best practice at the moment. I think the performance section and production sections are going to have a lot of potential crossover so it's a good time to clear up the differences

Again, with gzip & other CPU intensive tasks, we actually recommend that they are better handled outside of Node for performance reasons. So I guess it's whether we want to adjust our view on this, or, as part of this section recommend to not use Node. Thoughts? See current bullets 5.3 & 5.11

reminders · 2018-10-09T09:10:39Z

👋 @i0natan,

goldbergyoni · 2018-10-11T06:58:04Z

Adding one more which exists in our production best practices: set NODE_ENV=production if rendering on the server as it utilizes the templating engine cache

goldbergyoni · 2018-10-11T07:01:00Z

I also highly recommend including this video in our bibliography: https://www.youtube.com/watch?v=K8spO4hHMhg&vl=en

mohemos · 2018-12-20T10:33:59Z

For MySQL Users:

Always run chains of queries in a TRANSACTION and let MySQL handle rollback and commit rather than doing it yourself; this helps you move faster with less maintenance.
Create a pool of connections so you can use more than one connection concurrently, and make sure you release each connection when you're done.
Promisify your DB connections and run queries in parallel via async/await and Promise.all()

goldbergyoni · 2018-12-21T08:14:35Z

@Berkmann18 Great set of advice. And welcome aboard.

Why does removing orphaned dependencies (tree shaking) improve performance? What is "Memoising"?

@mohemos Welcome & Thank you. Adding some of the advice to the pool #302

Berkmann18 · 2018-12-21T09:50:02Z

@Berkmann18 Thank you.
Tree shaking improves performance because it includes only the code that is used, instead of importing/requiring every exported function/data structure from modules/libraries/frameworks, which makes the resulting code lighter and easier to transfer and parse.
Here's another explanation of what tree shaking is.

Example:
Importing everything but using only add;

//NodeJS
const maths = require('maths');

//ES
import * as maths from 'maths';

Only importing add.

//NodeJS
const { add } = require('maths');
//Or
const add = require('maths').add;

//ES
import { add } from 'maths';
import add from 'maths/add';

Memoising is where code computes something once and then, instead of doing the exact same work again (which might waste resources), simply reuses the result that was stored in a variable.

Example:
Without memoising:

const fx = (x, y) => {
  console.log('The result is', doALongCalculation(x, y));
  return doALongCalculation(x, y);
};

With memoising:

const fx = (x, y) => {
  const result = doALongCalculation(x, y);
  console.log('The result is', result);
  return result;
};

So it's basically where the code will memorise some data that will be stored in a variable instead of re-doing the computation/fetching process again and again.
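
The example above reuses a result within a single call. A more general form of memoisation keeps results across calls, keyed by the arguments. Below is a minimal sketch of that, assuming the wrapped function is pure and its arguments can be serialised with `JSON.stringify`. The `memoise` helper is written for this illustration and is not taken from a library.

```javascript
// Cache results per argument list so repeated calls skip the computation.
function memoise(fn) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args); // assumes JSON-serialisable arguments
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

// Stand-in for an expensive calculation; counts how often it really runs.
let runs = 0;
const slowAdd = (x, y) => {
  runs += 1;
  return x + y;
};

const fastAdd = memoise(slowAdd);
fastAdd(2, 3); // computed
fastAdd(2, 3); // served from the cache
console.log(runs);
```

The cache grows with every distinct argument list, so a long-running server would need some eviction policy on top of this.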

goldbergyoni · 2018-12-24T13:47:03Z

@Berkmann18 Thanks for the comprehensive and great explanation.

Memoising sounds like a good fit for the General bullet - a bullet which holds all generic practices that are not node.js related

About tree shaking - I'm familiar with it. For frontend the benefit is obvious, but why would it improve backend performance? At run-time all the files are parsed anyway (this is how 'require' works). Do you mean to also include re-bundling the code and packing it into fewer files? I'm looking for the exact part that will boost performance

Berkmann18 · 2018-12-25T01:33:10Z

@i0natan You're welcome 😄 .
Tree shaking prevents servers from fetching things they don't need, which lets them get to the main code faster and avoids going through more resources than they need to run.
It might not make much of a difference on static dependencies (aside from the size of the backend codebase) but for dynamic ones, it does (at least a tiny bit).
When it comes to bundling, it allows the bundling tool to include less, which makes for a smaller, more optimised bundle.

VinayaSathyanarayana · 2018-12-29T03:40:52Z

We should add throttling to the list

VinayaSathyanarayana · 2018-12-29T04:06:38Z

We should add
"Avoid Functions in Loops" - Writing functions within loops tends to result in errors due to the way the function creates a closure around the loop

goldbergyoni · 2018-12-30T16:37:17Z

@Berkmann18

It might not make much of a difference on static dependencies (aside from the size of the backend codebase) but for dynamic ones, it does (at least a tiny bit).

Can you provide an example, maybe even using code, and some data/benchmark that shows the impact? I just want to ensure we address popular issues that leave a serious performance footprint

goldbergyoni · 2018-12-30T16:40:50Z

@VinayaSathyanarayana Thanks for joining the brainstorm :]

"Avoid Functions in Loops" - Writing functions within loops tends to result in errors due to the way the function creates a closure around the loop

Can you provide a (pseudo) code example? Are we concerned about performance or about code readability/errors?
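
For context, a minimal sketch of the closure pitfall that the quoted advice most likely refers to. It is a correctness issue with `var` rather than a performance one: every function created in the loop captures the same variable.

```javascript
// With var there is a single binding shared by all iterations,
// so every callback sees the final value of i.
const withVar = [];
for (var i = 0; i < 3; i++) {
  withVar.push(() => i);
}
console.log(withVar.map((fn) => fn())); // [ 3, 3, 3 ]

// With let each iteration gets its own binding,
// so each callback sees the value from its own iteration.
const withLet = [];
for (let j = 0; j < 3; j++) {
  withLet.push(() => j);
}
console.log(withLet.map((fn) => fn())); // [ 0, 1, 2 ]
```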

Berkmann18 · 2018-12-30T18:25:26Z

@i0natan Sure, I've made a repo with benchmarks using only math as an example.

TheHollidayInn · 2019-01-14T16:12:42Z

@AbdelrahmanHafez For your proposed idea Deal with database and external APIs in batches, do you have any links or references for this?

migramcastillo · 2019-03-13T19:03:41Z

* Promisify your DB connections and run queries in parallel via async/await and Promise.all()

Not only for DB purposes: using Promise.all instead of many sequential awaits should be a performance best practice. For example, if we need to make 4 independent API calls with Axios to build our complete response, it is better to use a single await with Promise.all() than to await each call one after another.
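
A minimal sketch of the difference, using timers as stand-ins for independent API calls. The helpers `delay`, `sequential` and `parallel` are made up for this example, and no real HTTP client is involved.

```javascript
// Stand-in for an API call that resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// One await per call: each call starts only after the previous one finished.
async function sequential(calls) {
  const results = [];
  for (const call of calls) {
    results.push(await call());
  }
  return results;
}

// All calls are started first, then awaited together with a single await.
async function parallel(calls) {
  return Promise.all(calls.map((call) => call()));
}

const calls = [1, 2, 3, 4].map((n) => () => delay(20, n));
sequential(calls).then((r) => console.log('sequential', r));
parallel(calls).then((r) => console.log('parallel', r));
```

Both variants return the results in the same order. The sequential one takes roughly the sum of the individual durations, the parallel one roughly the longest single duration.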

goldbergyoni · 2019-03-17T16:43:34Z

@migramcastillo Makes a lot of sense. Do you want to write that bullet?

goldbergyoni · 2019-03-17T16:45:53Z

@js-kyle Maybe add this idea to our TOC issue?

migramcastillo · 2019-03-19T22:47:52Z

@migramcastillo Makes a lot of sense. Do you want to write that bullet?

I'd be glad to, in which branch should I make the PR?

sagirk · 2019-03-21T16:52:05Z

I'd be glad to, in which branch should I make the PR?

@migramcastillo If you meant to ask what branch to raise your PR against, that would be master. If you meant to ask what branch to make your changes in, feel free to create a new branch in your fork.

stale · 2019-06-19T17:29:26Z

Hello there! 👋
This issue has gone silent. Eerily silent. ⏳
We currently close issues after 100 days of inactivity. It has been 90 days since the last update here.
If needed, you can keep it open by replying here.
Thanks for being a part of the Node.js Best Practices community! 💚

abhishekshetty · 2020-04-10T14:11:14Z

* Promisify your DB connections and run queries in parallel via async/await and Promise.all()
Not only for DB purposes: using Promise.all instead of many sequential awaits should be a performance best practice. For example, if we need to make 4 independent API calls with Axios to build our complete response, it is better to use a single await with Promise.all() than to await each call one after another.

Although Promise.all is better than sequentially awaiting non-dependent promises in most cases, it works best only when the number of calls/promises is known beforehand or is at least small. With a dynamic number of promises (e.g. batching at runtime), if the number of calls increases significantly it might end up exhausting the connection pool, creating deadlocks or, for HTTP calls, spamming the service. Sample Stack Overflow link: https://stackoverflow.com/questions/53964228/how-do-i-perform-a-large-batch-of-promises
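
One way to address this concern is to cap how many promises are in flight at once. Below is a minimal sketch of such a limiter. `mapWithLimit` is a hypothetical helper written for this example, not an API from a library, and it does no retrying or special error handling.

```javascript
// Run worker(item, index) for every item, with at most `limit` calls in flight.
// Results keep the order of the input array.
async function mapWithLimit(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been picked up yet

  // Each runner pulls the next free index until the list is exhausted.
  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(run());
  }
  await Promise.all(runners);
  return results;
}
```

For example, `mapWithLimit(ids, 10, fetchOne)` would keep at most 10 requests open at a time, where `ids` and `fetchOne` are placeholders for the caller's own data and function.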

abhishekshetty · 2020-04-10T14:44:33Z

Maybe have a section on in-memory caching and best practices around implementing and maintaining a cache. There are a few packages that support this functionality. Also, it seems there is no shared in-memory cache implementation possible in cluster mode.
I would like to highlight a recent video released by the Chrome team, https://www.youtube.com/watch?v=ff4fgQxPaO0, that talks about how to have faster apps when dealing with large object literals.

victor-homyakov · 2020-10-12T15:02:40Z

When writing an npm module, it's a good idea to measure the time needed to require() the module you are developing, e.g.

console.time('require');
const myModule = require('my-module');
console.timeEnd('require');

Reason: this time is added to the overall startup time of any code that uses your module. It doesn't play a big role for long-running servers, but is very important for CLI utilities (pre-commit hooks, linters etc.).

A few examples:

Fix slow autoprefixer calls in *-no-vendor-prefix stylelint/stylelint#4971 - import of 3rd-party modules in stylelint may take about 1.8 seconds
Performance: plugin adds 150-200ms to ESLint startup time because of ramda lo1tuma/eslint-plugin-mocha#213 - import of ramda adds 150-200ms to ESLint startup time

How to fix/mitigate:

provide granular exports in your module, like ramda and lodash do - consumers may require() only what they need
use exact imports, e.g. const complement = require('ramda/src/complement'); instead of const R = require('ramda');
use https://www.npmjs.com/package/import-lazy or something similar, require modules only when you are 100% sure you need them, e.g. ESLint should not load every plugin listed in config, but only those needed to lint specified files (don't load TSX rules if you are checking just plain JS file, don't load mocha-specific rules if there is no mocha test file to lint)
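
The lazy-loading idea from the last point can be sketched with plain `require()`. This is a hand-rolled illustration and not the API of the import-lazy package. `lazyRequire` is a hypothetical helper, and the built-in `zlib` module is used only as a stand-in for a heavy dependency.

```javascript
// Defer require() until the module is first needed, then reuse the result.
function lazyRequire(id) {
  let loaded = false;
  let mod;
  return () => {
    if (!loaded) {
      mod = require(id); // the loading cost is paid here, on first use only
      loaded = true;
    }
    return mod;
  };
}

// Nothing is loaded at startup...
const getZlib = lazyRequire('zlib');

// ...only code paths that actually compress something pay for the require.
function compress(text) {
  return getZlib().gzipSync(text);
}

console.log(compress('hello').length > 0);
```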

Aschen · 2021-01-06T17:33:02Z

Reduce the number of hidden classes

Benefit from the inline cache and let TurboFan generate optimized assembly code

Use the ESLint sort-keys rule to ensure the same internal hidden classes for your objects.

This allows you to fully benefit from inline caching. TurboFan will also be able to compile the JS code into optimized assembly code.

If you are interested I can write a section on this topic, providing code examples, performance diffs, etc.

Hidden classes and Inline cache
Inline Caches with Monomorphism, Polymorphism and Megamorphism
Interpreter and Compiler: hidden classes, inline caching, polymorphism and megamorphism
Turbofan Speculative Optimization

Aschen · 2021-01-06T17:38:51Z

Prefer monomorphic functions

Prefer monomorphic functions over polymorphic functions.

This allows:

TurboFan to compile them into optimized assembly
better inline caching for fast property access

example

// don't
function add(a, b) {
  return a + b;
}
add(21, 42);
add(21.2, 42.3);

// do
function addInt(a, b) {
  return a + b;
}
function addFloat(a, b) {
  return a + b;
}
addInt(21, 42);
addFloat(21.2, 42.3);

If you are interested I can write a section on this topic, providing code example, performance diff, etc

Inline Caches with Monomorphism, Polymorphism and Megamorphism
Interpreter and Compiler: hidden classes, inline caching, polymorphism and megamorphism

Aschen · 2021-01-06T17:44:51Z

Avoid dynamic allocation of functions

Avoid declaring functions inside functions.

Functions are objects, and object allocation is costly, as is garbage collection.
Also, since the function is re-declared on each call, V8 will not be able to optimize it (compilation into optimized assembly, inline caching for fast property access)

example

// don't
function add(a, b) {
  const addInt = (a, b) => {
    return a + b;
  }

  if (typeof a === 'number') {
    return addInt(a, b);
  }
  
  if (typeof a.x === 'number') {
    return addInt(a.x, b.x);
  }
}

// do
function addInt(a, b) {
  return a + b;
}

function add(a, b) {
  if (typeof a === 'number') {
    return addInt(a, b);
  }
  
  if (typeof a.x === 'number') {
    return addInt(a.x, b.x);
  }
}

If you are interested I can write a section on this topic, providing code example, performance diff, etc

Closures Compilation and Allocation
How Closures Works

kibertoad · 2021-01-06T17:47:42Z

Avoid process.env. Copy and cache it: https://www.reddit.com/r/node/comments/7thtlv/performance_penalty_of_accessing_env_variables/
The whole concept of object shapes: https://mathiasbynens.be/notes/shapes-ics (previously mentioned in the context of monomorphic params, but it's also important in the context of initializing variables)
Avoid using express.js (it is really, really slow: https://github.com/fastify/benchmarks)
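
A minimal sketch of the "copy and cache" advice for `process.env`: read the variables once at startup into a plain frozen object and use that object afterwards. The variable names (`NODE_ENV`, `PORT`), the defaults and the `loadConfig` helper are assumptions chosen for illustration.

```javascript
// Read the environment once and return a plain, immutable config object.
// `env` is a parameter so the function can be exercised with a fake environment.
function loadConfig(env = process.env) {
  return Object.freeze({
    nodeEnv: env.NODE_ENV || 'development',
    port: Number(env.PORT) || 3000,
  });
}

// Evaluated once at startup; hot code paths read `config`, not process.env.
const config = loadConfig();

function isProduction() {
  return config.nodeEnv === 'production';
}

console.log(config.port, isProduction());
```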

Aschen · 2021-01-10T19:45:22Z

@goldbergyoni Are you still looking for contributors on this topic?

rluvaton · 2022-03-24T18:27:07Z

There are really good ideas here! We should add them

I also have some:

Prefer branchless programming (but make sure it doesn't affect readability, and always measure first to check whether it is worth it, as with all performance tips)
Wrap large data with closure so it can get garbage collected

// readFile() and transformData() are placeholders for your own code

// Bad - the really large file may not get garbage collected during the extra work
function run() {
  const reallyLargeFile = readFile();
  const transformed = transformData(reallyLargeFile);

  // do some extra work

}

// Good - the really large file can get garbage collected before the extra work
function run() {
  let transformed;

  // prefer extracting this to a function instead of using an empty block
  {
    const reallyLargeFile = readFile();
    transformed = transformData(reallyLargeFile);
  }

  // do some extra work

}

Read large files in streams instead of all at once

victor-homyakov · 2022-03-31T17:30:13Z

Wrap large data with closure so it can get garbage collected

   // prefer extract to function instead of empty closure
   {
     const reallyLargeFile = readFile()
     transformed = transformData(reallyLargeFile)
   }
   // do some extra work

Looks like in order to benefit from this change, the "extra work" part

either should have a significant execution time, e.g. 5 seconds - then in original code reallyLargeFile will be unavailable for GC during these 5 seconds, and in modified code, it could be GCed if the engine decides to start GC during that time
or should have high memory consumption, to trigger Full GC (usually the engine gives it a last try and tries to free all possible memory before throwing the "out of memory" error)

I think these details should be noted - the readers should not follow the advice blindly but should have some understanding of why and when it helps.

Also, it would be great to have some proofs or benchmarks to illustrate this idea.

Read large files in streams instead of all at once

But we've already seen const reallyLargeFile = readFile() in advice number 2. Could you please make the code example from advice 2 consistent with advice 3 (i.e. use something else instead of reading a large file at once)?

goldbergyoni added new best practice writer-needed performance labels Sep 26, 2018

goldbergyoni mentioned this issue Oct 2, 2018

Some of the performance items #31

Closed

goldbergyoni mentioned this issue Oct 2, 2018

I can't understand "Layer your app, keep Express within its boundaries" second picture ? #252

Closed

reminders bot added the reminder label Oct 7, 2018

reminders bot removed the reminder label Oct 9, 2018

VinayaSathyanarayana mentioned this issue Dec 29, 2018

Performance: Don't block the event loop #294

Closed

stale bot added the stale label Jun 19, 2019

stale bot closed this as completed Jun 29, 2019
