This repository has been archived by the owner on Feb 12, 2024. It is now read-only.
js-ipfs pinning performance #2197
ipfs.add() performance degrades severely once the number of pins exceeds 8192.
Background
Users can pin a file or a block to prevent it from being garbage collected.
The pinning module maintains two sets of pins:
These pin sets are stored in the block store with the following structure:
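- For a pin set with fewer than 8192 pins, create a node with 256 links pointing to an empty block
- For a pin set with more than 8192 pins, distribute the pins between 256 buckets (for buckets with > 8192 pins, distribute them into sub-buckets, etc)
Performance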
A pin set with fewer than 8192 pins is stored in a single DAG node. Once there are more than 8192 pins, they are distributed between 256 buckets, each with its own DAG node. Each time a new pin is added to the set, the distribution across the group of buckets is calculated and written to the block store. The distribution is deterministic, so in reality only one bucket changes each time a new pin is added.
For example, if we simplify and say there are 8 buckets, with 5 pins (A-E):
[] [D] [] [EA] [] [C] [] [B]
When we add pin F, only one bucket changes:
[] [D] [] [EA] [] [C] [] [BF]
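The bucket example above can be sketched in a few lines of JavaScript. This is a simplified illustration, not the pinner's actual code: `fnv1a32` is a plain 32-bit FNV-1a over the UTF-8 bytes of a string, and `bucketIndex`, `distribute` and `changedBuckets` are hypothetical helpers introduced only for this sketch. The real pin set hashes CIDs and may seed the hash differently, so the bucket positions here will not match the diagram above.

```javascript
// 32-bit FNV-1a over the UTF-8 bytes of a string.
function fnv1a32 (str) {
  let hash = 0x811c9dc5 // FNV offset basis
  for (const byte of Buffer.from(str, 'utf8')) {
    hash ^= byte
    hash = Math.imul(hash, 0x01000193) // multiply by the FNV prime (32-bit wraparound)
  }
  return hash >>> 0 // interpret the result as an unsigned 32-bit integer
}

// Hypothetical helper: the bucket a pin is assigned to.
function bucketIndex (pin, bucketCount) {
  return fnv1a32(pin) % bucketCount
}

// Hypothetical helper: distribute all pins across `bucketCount` buckets.
function distribute (pins, bucketCount) {
  const buckets = Array.from({ length: bucketCount }, () => [])
  for (const pin of pins) {
    buckets[bucketIndex(pin, bucketCount)].push(pin)
  }
  return buckets
}

// Hypothetical helper: indexes of the buckets whose contents differ.
function changedBuckets (before, after) {
  const changed = []
  for (let i = 0; i < after.length; i++) {
    if (before[i].join(',') !== after[i].join(',')) changed.push(i)
  }
  return changed
}

const before = distribute(['A', 'B', 'C', 'D', 'E'], 8)
const after = distribute(['A', 'B', 'C', 'D', 'E', 'F'], 8)
console.log(changedBuckets(before, after)) // a single index: the bucket that received F
```

Because the bucket index depends only on the pin itself, adding a pin never moves existing pins, which is why only one bucket needs to be rewritten.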
We can improve performance by adding a cache that remembers the structure of the pin sets and only writes the nodes that change to the block store (instead of writing all nodes each time a pin is added or removed). This improves ipfs.add() performance dramatically once we exceed 8192 pins.
Memory usage
The pinner uses fnv1a to distribute pins. fnv1a outputs a number (8 bytes) so if we also use it for cache keys each key will be 8 bytes. Each pin is represented by a DAG link pointing to the pinned CID. The DAG link has
So rounding up, a DAG link requires about 128 bytes of memory. For example, 10k pins require about 1MB of memory for the cache (10,000 × 128 bytes ≈ 1.28MB).
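As a rough check of that arithmetic, the estimate can be written as a small function. The name `estimateCacheBytes` is hypothetical, and the 128-byte default is just the rounded-up per-link figure from the text. Actual memory use in a JavaScript engine will vary with object overhead.

```javascript
// Rounded-up per-link size taken from the estimate above.
const BYTES_PER_DAG_LINK = 128

// Hypothetical helper: approximate cache size in bytes for a number of pins.
function estimateCacheBytes (pinCount, bytesPerLink = BYTES_PER_DAG_LINK) {
  return pinCount * bytesPerLink
}

const bytes = estimateCacheBytes(10000)
console.log(bytes) // 1280000
console.log((bytes / 1e6).toFixed(2) + ' MB') // 1.28 MB
```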
Note: Storing the DAGLink object (rather than just the CID as a Buffer) saves us having to re-create a lot of JavaScript Objects, but uses about twice the memory. However, this memory would need to be reserved anyway each time a pin is added.
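To make the caching idea from the Performance section concrete, here is a minimal sketch of a cache that keeps the bucket structure in memory and writes only the buckets that changed. Everything in it is an assumption for illustration: `PinSetCache`, its methods, and the `writeBucket` callback are hypothetical names, pins are plain strings rather than DAGLink objects, and the write is synchronous, whereas a real block store write would be asynchronous.

```javascript
// 32-bit FNV-1a over the UTF-8 bytes of a string.
function fnv1a32 (str) {
  let hash = 0x811c9dc5
  for (const byte of Buffer.from(str, 'utf8')) {
    hash ^= byte
    hash = Math.imul(hash, 0x01000193)
  }
  return hash >>> 0
}

// Hypothetical cache: remembers bucket contents and tracks which buckets are dirty.
class PinSetCache {
  // writeBucket(index, pins) stands in for writing one bucket node to the block store.
  constructor (bucketCount, writeBucket) {
    this.bucketCount = bucketCount
    this.writeBucket = writeBucket
    this.buckets = Array.from({ length: bucketCount }, () => new Set())
    this.dirty = new Set()
  }

  add (pin) {
    const index = fnv1a32(pin) % this.bucketCount
    if (!this.buckets[index].has(pin)) {
      this.buckets[index].add(pin)
      this.dirty.add(index) // only this bucket needs rewriting
    }
  }

  remove (pin) {
    const index = fnv1a32(pin) % this.bucketCount
    if (this.buckets[index].delete(pin)) {
      this.dirty.add(index)
    }
  }

  // Write only the buckets that changed since the last flush; returns how many were written.
  flush () {
    const written = this.dirty.size
    for (const index of this.dirty) {
      this.writeBucket(index, [...this.buckets[index]])
    }
    this.dirty.clear()
    return written
  }
}

const cache = new PinSetCache(8, (index, pins) => console.log('write bucket', index, pins))
for (const pin of ['A', 'B', 'C', 'D', 'E']) cache.add(pin)
cache.flush() // writes each bucket that received a pin
cache.add('F')
cache.flush() // writes exactly one bucket
```

Tracking dirty bucket indexes keeps the cost of an add proportional to one bucket rather than to the whole pin set.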
Note: The cache is not used if there are fewer than 8192 pins.
Command line
When invoking ipfs add from the command line with the daemon running, we need to load the HTTP API each time. This can be several times slower than the add operation itself, so we should look at optimizing it.