Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Pinning takes longer the more pins exist #3759

Closed
matheus23 opened this issue Jul 21, 2021 · 4 comments
Closed

Pinning takes longer the more pins exist #3759

matheus23 opened this issue Jul 21, 2021 · 4 comments
Assignees
Labels
exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature kind/resolved-in-helia need/analysis Needs further analysis before proceeding P2 Medium: Good to have, but can wait until someone steps up topic/perf Performance

Comments

@matheus23
Copy link
Contributor

  • Version:
    {
      version: '0.8.0',
      repo: '10',
      commit: '81f944163f5a78ef15fdc23b2c488ba7f44af0be',
      'interface-ipfs-core': '^0.147.0'
    }
  • Platform: Linux philsklapparch 5.12.12-arch1-1 #1 SMP PREEMPT Fri, 18 Jun 2021 21:59:22 +0000 x86_64 GNU/Linux
  • Subsystem: ipfs-core, the pin manager (ipfs-repo?)

Severity: Medium

Description:

As discussed in #2650 (comment), I've created a benchmark suite to show what I've found: js-ipfs performs worse the more pins you have.

I've created rudimentary benchmark suite and here are some graphs from the data I collected (link to google sheet):

This graph shows the time for the nth ipfs.add(<random data>, { pin: true }), with a js-ipfs instance that's configured with MemoryDatastores:
image

You can clearly see a linear increase in the baseline time it takes for later adds. While the first couple of adds take between 3-5ms, the 3000th add takes between 40-50ms.

This graph shows the time for the nth ipfs.add(<random data>, { pin: false }), with a js-ipfs instance configured in the same way as the one above:
image

You can clearly see that there is no runtime increase after even 3000 adds. All adds except outliers take between 1-2ms.

Steps to reproduce the error:

You can try to reproduce my findings using my reproduction repository.

Also, the graphs above show performance for an in-memory IPFS (with MemoryDatastores), but they're also reproducible for normal IPFS configurations, it's just much noisier.


I've done some performance profiling with v8, and one cost center I'm seeing is this function:

const pinAdd = async function * () {
for await (const { path, recursive, metadata } of normaliseInput(source)) {
const { cid } = await resolvePath(repo, codecs, path)
// verify that each hash can be pinned
const { reason } = await repo.pins.isPinnedWithType(cid, [PinTypes.recursive, PinTypes.direct])
if (reason === 'recursive' && !recursive) {
// only disallow trying to override recursive pins
throw new Error(`${cid} already pinned recursively`)
}
if (recursive) {
await repo.pins.pinRecursively(cid, { metadata })
} else {
await repo.pins.pinDirectly(cid, { metadata })
}
yield cid
}
}

The isPinnedWithType function takes longer the more pins you have. That's likely what's causing the above issue.

I think there are a couple of possible improvements:

  • await repo.pins.isPinnedWithType(cid, [PinTypes.recursive, PinTypes.direct]) is very costly. We can skip this check if recursive is true. I.e. if recursive is true, the condition on line 41 is false anyway and the result of isPinnedWithType is unused anyway. I'm assuming that it doesn't have side-effects.
  • Why is it looking for PinType.directwhen only recursive pins have an effect in line 41?

Most likely my assumption that isPinnedWithType doesn't have side-effects is just wrong 🤷‍♂️

Anyway, I have much less expertise with the ipfs-core codebase anyway, but I hope I was helpful.

@matheus23 matheus23 added the need/triage Needs initial labeling and prioritization label Jul 21, 2021
@matheus23 matheus23 changed the title Pinning takes long the more pins there have been Pinning takes longer the more pins exist Jul 21, 2021
@lidel lidel added exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature need/analysis Needs further analysis before proceeding P2 Medium: Good to have, but can wait until someone steps up topic/perf Performance need/triage Needs initial labeling and prioritization and removed need/triage Needs initial labeling and prioritization labels Jul 23, 2021
@CSDUMMI
Copy link
Contributor

CSDUMMI commented Sep 3, 2021

@CSDUMMI
Copy link
Contributor

CSDUMMI commented Sep 4, 2021

isPinnedWithType appears to be called only for this check:

if (reason === 'recursive' && !recursive) {

Maybe we could try to rerun your test, without this check and thus the call to isPinnedWithType? Thus checking if isPinnedWith is the culprit of this performance decrease.

@tinytb
Copy link

tinytb commented Nov 22, 2022

2022-11-22: this needs to be verified to see if it's still a problem.

@SgtPooki SgtPooki self-assigned this May 17, 2023
@SgtPooki SgtPooki moved this to 🥞 Todo in js-ipfs deprecation May 17, 2023
@SgtPooki SgtPooki moved this from 🥞 Todo to 🛑 Blocked in js-ipfs deprecation May 17, 2023
@SgtPooki
Copy link
Member

js-ipfs is being deprecated in favor of Helia. You can #4336 and read the migration guide.

Please feel to reopen with any comments by 2023-06-02. We will do a final pass on reopened issues afterward (see #4336).

This issue is most likely resolved in Helia, please try it out!

Alex (achingbrain; not tagging on purpose) recently put together a talk at IPFS Thing 2023, and benchmarks in Helia repo, that show how much better at pinning Helia is. Give it a go!

@github-project-automation github-project-automation bot moved this from Backlog to Done in IP JS (PL EngRes) v2 May 26, 2023
@github-project-automation github-project-automation bot moved this from 🛑 Blocked to ✅ Done in js-ipfs deprecation May 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature kind/resolved-in-helia need/analysis Needs further analysis before proceeding P2 Medium: Good to have, but can wait until someone steps up topic/perf Performance
Projects
No open projects
Status: Done
Development

No branches or pull requests

5 participants