Migrating to go-ipfs v0.5.0 #1225

Closed
8 tasks
b5 opened this issue Mar 30, 2020 · 2 comments · Fixed by qri-io/dag#30


b5 commented Mar 30, 2020

Our friends at Protocol Labs are gearing up for a big release of go-ipfs: version 0.5.0. This is a tracking issue for others to follow our own upgrade process.

Rounding out Qri's Benchmarking Suite

I'm moving benchmarking to a separate issue to reduce workload and unblock getting a release of our own out the door.
v0.5.0 emphasizes performance, and we need to be able to see just how much of a boon we're getting for all this work. We currently have a nice high-level benchmarking suite. I'd like to add at least a few benchmarks at the lib level that use a temp repo. We should get these benchmarks in before we migrate dependencies.

Badger Datastore Migration

We're no longer doing this, see below.
One of the biggest wins 0.5.0 brings is moving to badger by default. It's been an opt-in configuration for more than a year. Badger replaces flatfs as a datastore, and performs far better. Given that so much of qri is operating on a local repo, as an offline-first application, having faster read/write from IPFS will be the biggest area where we see changes.

One of our slowest commands is qri save. A great deal of this time is spent on a version of IPFS add, more often than not using the flatfs configuration. Given that adding a 1 gig dataset currently takes around 50 seconds, cutting that time in half is very appealing.

go-ipfs is planning to not migrate users to badger automatically, but any new IPFS repos moving forward will use badger by default. If our repo init function works properly, it should initialize

At Qri we should be upgrading our users to leverage badger, but given that we operate directly on the machine's default IPFS repo, we need to take some care not to auto-upgrade users who use both IPFS & qri. I think the best thing we can do here is run the equivalent of ipfs pin ls --type recursive (we have access to this command within the qri package itself) and check whether any of those hashes aren't ones we recognize (basically: are they dataset hashes, or are they an old copy of the webapp?). If we find pins we don't recognize, the repo is also being used outside of qri and we should leave it alone.
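As a rough illustration of that check, here is a minimal Go sketch. It assumes the recursive pins and the set of hashes qri knows about have already been collected as plain strings; `unrecognizedPins` and `safeToAutoUpgrade` are hypothetical names for illustration, not functions that exist in qri or go-ipfs.

```go
package main

import "sort"

// unrecognizedPins returns the pinned hashes that are missing from known.
// Assumption: pinned holds the result of a recursive pin listing and known
// holds the dataset and webapp hashes qri has a record of.
func unrecognizedPins(pinned []string, known map[string]struct{}) []string {
	var unknown []string
	for _, h := range pinned {
		if _, ok := known[h]; !ok {
			unknown = append(unknown, h)
		}
	}
	// Sort so the result is stable regardless of listing order.
	sort.Strings(unknown)
	return unknown
}

// safeToAutoUpgrade reports whether every recursive pin is one qri recognizes.
// Any unrecognized pin suggests the repo is shared with other IPFS usage.
func safeToAutoUpgrade(pinned []string, known map[string]struct{}) bool {
	return len(unrecognizedPins(pinned, known)) == 0
}
```

Returning the list of unrecognized hashes, rather than only a boolean, would let us show users exactly which pins kept us from upgrading automatically.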

We should not let badger migration code block a release. We can publish a guide for command line users to do the upgrade, and the only people who will be left out are desktop-exclusive users. We can follow up with a patch release that'll perform the upgrade in this scenario.

Using bitswap again

We've intentionally carved around bitswap for a long time. Bitswap is getting a big shot in the arm this release, to the point that there are scenarios where bitswap should start to beat out dsync for large, propagated datasets. We should start up a conversation about how the hell we're going to do content resolution when we potentially have the registry, remote sync, and dsync in the mix. None of this is near-term work, but I am excited to get it on the roadmap.

Removing b5 as a knowledge gatekeeper

So far I'm the only one who's done IPFS dependency upgrades at our organization. We need to change that this release. I'm hoping to tag @ramfox to review changes as they come in, and contribute PRs wherever possible. The resulting PRs must form a trail others can use to learn how to do this dev-upgrade dance on qri. I'd like to switch to review-only when v0.5.1 is released.

IPFS repo migration

Switching to v0.5.0 requires an IPFS repo migration, which means we need to migrate our users' IPFS repos. In the past we've asked users to do this manually. This go-round, we'll invoke the migration process with the user's approval on the CLI, and automatically on Qri Desktop.

This has implications for Qri developers: core contributors to Qri should keep a copy of their un-migrated repo around while we work through the transition. Once we merge the v0.5.0 switch into qri-io/qri, we're going to be in a period of transition where users will be filing bugs against two repo versions, and we will need to be able to debug both.

Moving Qri's IPFS repo into ~/.qri/ipfs

In an effort to rectify #1296, the PL team has recommended we move qri's IPFS repo into the .qri directory, and use ports other than standard IPFS ports:

ipfs/kubo#7196 (comment)

To me this makes a lot of sense. I don't think our efforts to reduce the number of IPFS repos on a user's machine are delivering enough benefit to outweigh the degraded experience of trying to run both IPFS Desktop & Qri Desktop. Barring any major objection, let's write a migration that moves the repo & detangles Qri data from ~/.ipfs. I'll file an issue going into the implications of this in detail.

Related Issues & Links

Looks like last time we updated our IPFS dependencies was October 2019. 😭 #985

Relevant Tracking Issues from IPFS:

Steps to complete

Repos we manage that have IPFS deps

  • upgrade dependencies for qri-io/qfs - chore(deps): upgrade to go-ipfs 0.5.0 qfs#24
  • add a test to github.com/qri-io/qfs/cafs/ipfs InitRepo that ensures new repos are initialized with the badger data store configured.
  • upgrade dependencies for qri-io/dag - chore(deps): update go-ipfs deps, circleci build settings dag#30
  • add a suite of benchmarks to qri-io/qri/lib for the following tests. All of these should use a TempRepo as part of their setup:
    • DatasetMethods.Save - new dataset
    • DatasetMethods.Save - 2nd version (to require an IPFS read of version one)
    • DatasetMethods.Get body - read a full body of at least 10k rows
    • SQLMethods.Exec - joining two datasets with at least 1000 rows (this is many full reads of a dataset)
  • upgrade dependencies for qri-io/qri: chore(deps): bump ipfs to v0.5.0 #1273
  • github.com/qri-io/qri/repo/test temp repo should use badger. (update empty_ipfs_repo.zip)
  • write an IPFS repo migration tool, use it to move the repo: Moving Qri's IPFS repo into the .qri folder #1319
  • use a configuration migration of our own to indicate the switch to the internal IPFS repo

cc @momack2 from Protocol, who's helping get this project over the finish line!

@b5 b5 added the chore Changes to the build process or auxiliary tools and libraries such as documentation generation label Mar 30, 2020
@b5 b5 self-assigned this Apr 8, 2020

b5 commented Apr 29, 2020

Ok sportfans, v0.5.0 is out. We've done rounds of testing, and now it's time to land v0.5.0 in our own codebase. But first, some updates:

No longer switching to badger by default

Badger is great, but even in its most conservative configuration it chews up a bunch of memory. For our users, the speedup isn't worth it, especially given that many sources of slowness tend to come from our own code. Qri will still work if a user manually sets badger as the store, and we can happily recommend this to users who want performance and can spare the RAM.

IPFS repo migration

Switching to v0.5.0 requires an IPFS repo migration, which means we need to migrate our users' IPFS repos. In the past we've asked users to do this manually. This go-round, we'll invoke the migration process with the user's approval on the CLI, and automatically on Qri Desktop.

This has implications for Qri developers: core contributors to Qri should keep a copy of their un-migrated repo around while we work through the transition. Once we merge the v0.5.0 switch into qri-io/qri, we're going to be in a period of transition where users will be filing bugs against two repo versions, and we will need to be able to debug both.

Moving Qri's IPFS repo into ~/.qri/ipfs

In an effort to rectify #1296, the PL team has recommended we move qri's IPFS repo into the .qri directory, and use ports other than standard IPFS ports:

ipfs/kubo#7196 (comment)

To me this makes a lot of sense. I don't think our efforts to reduce the number of IPFS repos on a user's machine are delivering enough benefit to outweigh the degraded experience of trying to run both IPFS Desktop & Qri Desktop. Barring any major objection, let's write a migration that moves the repo & detangles Qri data from ~/.ipfs. I'll file an issue going into the implications of this in detail.

I'm going to edit the top comment to reflect these changes.


b5 commented Apr 29, 2020

Also, we're a little less concerned about benchmarks at this point, considering we're no longer doing the badger thing, which would have been our primary performance pickup.

@b5 b5 added this to the v0.9.9 milestone Apr 29, 2020