integrate DAG store and CARv2 in deal-making #6671

aarshkshah1992 · 2021-07-03T07:28:04Z

(Description added by @raulk)

Technical description

This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the central component on the miner-side, and CARv2s on the client-side.

Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals.

When the miner starts Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process that consists simply of registering the shards in the dagstore.

Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger.
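
To make the "CARv1 joined with its index" idea concrete, here is a minimal sketch. It does not use the real dagstore or go-car APIs: the types `indexEntry` and `readOnlyStore`, the `newDemoStore` helper, and the string keys standing in for CIDs are all illustrative assumptions.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// indexEntry records where one block lives inside the payload.
// Hypothetical type: real CARv2 index formats differ.
type indexEntry struct {
	offset int64
	length int64
}

// readOnlyStore pairs an immutable payload (standing in for the unsealed
// CARv1) with a separately kept index (standing in for the CARv2 index).
type readOnlyStore struct {
	payload io.ReaderAt
	index   map[string]indexEntry // key stands in for a CID
}

var errNotFound = errors.New("block not found")

// Get looks the key up in the index and reads the block from the payload.
func (s *readOnlyStore) Get(key string) ([]byte, error) {
	e, ok := s.index[key]
	if !ok {
		return nil, errNotFound
	}
	buf := make([]byte, e.length)
	if _, err := s.payload.ReadAt(buf, e.offset); err != nil {
		return nil, err
	}
	return buf, nil
}

// newDemoStore builds a store over a tiny in-memory payload.
func newDemoStore() *readOnlyStore {
	payload := []byte("helloworld")
	return &readOnlyStore{
		payload: bytes.NewReader(payload),
		index: map[string]indexEntry{
			"cid-a": {offset: 0, length: 5},
			"cid-b": {offset: 5, length: 5},
		},
	}
}

func main() {
	s := newDemoStore()
	b, err := s.Get("cid-b")
	fmt.Println(string(b), err)
}
```

The point of the sketch is that the payload is never copied or rewritten: only the small index has to be kept alongside it.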

Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore.

Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports is what we call "ref-CARv2s": CARv2 files placed under the $LOTUS_PATH/imports directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk.

Client-side retrievals are placed into CARv2 files in the location: $LOTUS_PATH/retrievals.

A new set of Dagstore* JSON-RPC operations and lotus-miner dagstore subcommands have been introduced on the miner-side to inspect and manage the dagstore.

Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node.

Client imports

Because the "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless.

On startup, Lotus will enumerate all imports and print a WARN statement to the log for each import that needs to be reimported. These log lines contain one of these messages:

import lacks carv2 path; import will not work; please reimport
import has missing/broken carv2; please reimport

At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken.
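
A minimal sketch of what such a startup check could look like follows. The two warning strings are the ones quoted above; everything else (the `importMeta` type, the function names, the exact log and summary formats) is an assumption for illustration, and the sketch only checks that the file exists rather than validating the CARv2 itself.

```go
package main

import (
	"fmt"
	"os"
)

// importMeta is a hypothetical, simplified view of one import record.
type importMeta struct {
	ID      uint64
	CARPath string // empty if the import predates the CARv2 layout
}

// checkImport returns the warning text for a broken import, or "" if the
// import looks usable. Only file existence is checked here.
func checkImport(m importMeta) string {
	if m.CARPath == "" {
		return "import lacks carv2 path; import will not work; please reimport"
	}
	if _, err := os.Stat(m.CARPath); err != nil {
		return "import has missing/broken carv2; please reimport"
	}
	return ""
}

// sanityCheck walks all imports, warns about broken ones, and returns how
// many imports were found and how many were deemed broken.
func sanityCheck(all []importMeta) (total, broken int) {
	for _, m := range all {
		if msg := checkImport(m); msg != "" {
			fmt.Printf("WARN import %d: %s\n", m.ID, msg)
			broken++
		}
	}
	return len(all), broken
}

func main() {
	total, broken := sanityCheck([]importMeta{
		{ID: 1},
		{ID: 2, CARPath: "/nonexistent/2.car"},
	})
	fmt.Printf("sanity check completed: total=%d broken=%d\n", total, broken)
}
```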

About this PR

We believe the test coverage to be pretty satisfactory, with a substantial portion of the LoC diff corresponding to tests.

This commit is a squashed version of 200+ commits developed by @aarshkshah1992, @dirkmc, and @raulk over the course of months.

This contribution was thoroughly tested in the M1 milestone with minerX.2: #6852.

TODO

- Migration of existing storage client imports in Badger to CARv2 files.
  - added a warning in logs.
- ~~Add tests for Dagstore* JSON-RPC operations.~~
  - tracked in #7138 (add tests for Dagstore* JSON-RPC operations).
- ~~Add tests for expired deals and slashed deals (coverage for OnDealExpiredOrSlashed).~~
  - tracked in https://github.com/filecoin-project/lotus/pull/5431/files.
- ~~Try to get rid of StagingBlockstore, StagingGraphsync.~~
  - tracked in https://github.com/filecoin-project/lotus/issues/7139
- ~~Try to get rid of /client datastore and /staging datastore.~~
  - tracked in https://github.com/filecoin-project/lotus/issues/7139
- ~~If the above is successful, remove the datastore muxing (try to remove the datastore muxing).~~
  - tracked in https://github.com/filecoin-project/lotus/issues/7139

dirkmc

Looking really good so far 👍

node/impl/client/client.go

node/repo/importmgr/mgr.go

magik6k

((looks like itests/fstmp889313799 got committed by accident))

api/docgen/docgen.go

api/api_full.go

node/builder_chain.go

node/builder_miner.go

node/impl/client/import.go

node/repo/importmgr/mgr.go

node/impl/client/import.go

hannahhoward

This seems to be writing CARv2 files to the hard drive on the client side. I thought we'd discussed avoiding storing a second copy of the client's data on the HDD. @magik6k are you ok with every import for regular files on the client having a CARv2 copy on the hard drive (which will of course be 32 additional GB in many cases)?

The current code jumped through a lot of hoops to avoid ever storing a CAR file on the client's machine, and always using OS-pipes when we did file i/o needed for CommP calculation. So I was wondering if the intent is in fact to no longer do that.

itests/deals_concurrent_test.go

raulk

@aarshkshah1992 @dirkmc I think there's a misunderstanding in the changes to client import; this part should be mostly unchanged, as @hannahhoward and @magik6k pointed out above.

IIUC, the current state of the system is as follows:

User provides a raw file.
Lotus instantiates a new Filestore using the multistore. The multistore combines two things: an actual blockstore (Badger in this case) and a CID => file offset/length tracker. See https://github.com/filecoin-project/go-multistore/blob/master/store.go#L33. There's special-casing for a special type of block called PosInfo that encodes the offset of the block within the file, and its length.
- What's not clear to me is how Lotus client import leverages that special casing, because we seem to use a normal DAG service that feeds normal IPLD blocks to the blockstore. @hannahhoward -- illuminate?
When chunking, we do not import blocks into the Badger blockstore, but somehow the PosInfo route is used, so we record CID => (offset, length) in the datastore, thus avoiding creating a copy of the file.

Later, when performing the deal, we simulate forming the CARv1 and feed it to the commP writer as a stream, to calculate the commP. But the CARv1 is never actually written on disk.

The footprint of this operation is just the cost of the CID => (offset, length) positional tracking, which likely wouldn't exceed 3MiB for a 64GiB file chunked with the 1MiB UnixFS chunking policy.
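
The 3MiB figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes roughly 48 bytes per entry (a CID of a few dozen bytes plus two 8-byte integers); that per-entry cost is an assumption rather than a measured value, and it counts leaves only, ignoring intermediate nodes and datastore overhead.

```go
package main

import "fmt"

const (
	mib = int64(1) << 20
	gib = int64(1) << 30
)

// trackingBytes estimates the size of a CID => (offset, length) table.
// entryBytes is an assumed per-entry cost, not a measured one.
func trackingBytes(fileSize, chunkSize, entryBytes int64) int64 {
	chunks := (fileSize + chunkSize - 1) / chunkSize // ceiling division
	return chunks * entryBytes
}

func main() {
	// 64GiB file, 1MiB chunks, assumed 48 bytes per entry.
	est := trackingBytes(64*gib, 1*mib, 48)
	fmt.Printf("%d bytes = %.1f MiB\n", est, float64(est)/float64(mib))
}
```

With these assumptions, 65,536 chunks at 48 bytes each come to exactly 3MiB, which is consistent with the estimate above.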

The Graphsync transfer is conducted using the positional mapping (it does not use a Badger store), which is why the original file is needed.

Conversely, this PR introduces these changes:

User provides a raw file.
File is chunked into a CARv2 to calculate its root.
Another CARv2 is generated with the right root.

For a 64GiB raw file, the footprint of the new code is on the order of 128GiB, which is a regression we can't afford.

I suggest reverting the changes in the client import process.

node/builder_miner.go

node/modules/storageminer.go

hannahhoward

In general, awesome! Yay for this huge change. Most of my comments are suggestions.

However, I have three blocking comments:

1. The use of RetrievalProviderNode as a dependency to get the IsUnsealed/UnsealSector methods is super awkward and the dependencies need untangling. I've put suggestions on how to do it.
2. The IPFS blockstore integration is super awkward for retrieval, and I'm pretty sure it's broken completely for storage. We either need to fix it or work closely with partners before we ship with this feature removed. I know PowerGate is the main user, I don't know how important that is.
3. Unless I am reading the code wrong, it appears that we are still writing second full copies of files out to the hard drive when we import them -- see comments on doImport in node/impl/client/import.go, despite the existence of ReadWriteFilestore which certainly appears like it could be used to avoid doing these writes. This isn't blocking if someone can explain to me why I'm reading the code wrong -- it's fine if I just am not seeing it.

I wonder: Blocking issues 2 & 3 have to do with the person on the client side of deals, not the miner. What is our testing plan for working with this code with people who are making deals? (cause I know we've been focused on beta testing with miners). I realize the dealbot is testing the dealmaking, but it also has space to deal with extra file copies AND it doesn't test the IPFS blockstore functionality.

api/api_storage.go

blockstore/badger/blockstore.go

cli/auth.go

cli/client.go

cli/util/api.go

node/modules/client.go

hannahhoward · 2021-08-09T23:26:12Z

node/builder_chain.go

@@ -169,7 +169,7 @@ func ConfigFullNode(c interface{}) Option {
 		If(cfg.Client.UseIpfs,
 			Override(new(dtypes.ClientBlockstore), modules.IpfsClientBlockstore(ipfsMaddr, cfg.Client.IpfsOnlineMode)),


This is still here but it appears that ClientBlockstore is being ignored by the StorageClient

Nod, we can't lose this, good catch!

hannahhoward · 2021-08-09T23:33:36Z

node/impl/client/client.go

+
+		carV2FilePath = resp.CarFilePath
+		// remove the temp CARv2 file when retrieval is complete
+		defer os.Remove(carV2FilePath) //nolint:errcheck


I put this in the markets PR, but if this code is deleting CARv2 files, it should be managing creating them too.

node/impl/client/client.go

markets/storageadapter/provider.go

hannahhoward

Pre-emptive approval to unblock merge while I'm on vacation. Will review tomorrow to see if there are any changes before I leave.

dirkmc · 2021-08-12T13:33:25Z

@raulk need to audit if blockstore provider to loader and storer in node/modules/graphsync.go are actually being used

raulk · 2021-08-13T16:37:55Z

node/impl/client/client.go

-	if err := a.imgr().AddLabel(id, "source", "import-local"); err != nil {
-		return cid.Cid{}, err


@aarshkshah1992 removing this label causes a functional regression.

raulk · 2021-08-13T16:38:44Z

node/impl/client/client.go

-	prefix, err := merkledag.PrefixForCidVersion(1)
+func (a *API) ClientImportLocal(ctx context.Context, r io.Reader) (cid.Cid, error) {
+	// write payload to temp file
+	tmpPath, err := a.imgr().NewTempFile(imports.ID(rand.Uint64()))


We should be using an actual ID generated by the import manager.

raulk · 2021-08-13T16:42:07Z

node/impl/client/client.go

-	nd, err := balanced.Layout(db)
-	if err != nil {
+	defer tmpF.Close() //nolint:errcheck
+	if _, err := io.Copy(tmpF, r); err != nil {


This is a performance regression. We just replaced direct UnixFS chunking into a "heavy" CAR file (containing leaf data) with two files (the temp file where we wrote the upload, and the position CAR file). I don't think that's right.

I think we should import straight into a heavy CAR.

raulk · 2021-08-13T16:45:54Z

node/impl/client/client.go

+	id := imports.ID(rand.Uint64())
+	tmp, err := a.imgr().NewTempFile(id)


Why is this even on the ImportMgr if it's neither managed by it nor uses an actual ID?


Stebalien · 2021-08-18T04:27:29Z

go.mod

 )

+replace github.com/multiformats/go-multihash => github.com/multiformats/go-multihash v0.0.14


Please do not do this without a very good reason. If you do, leave a comment explaining why. As far as I can tell, there was no reason to do this and it makes it impossible to update other dependencies.

magik6k

Post merge review pass - looks really good, only 3 nitpicks

magik6k · 2021-08-20T12:11:56Z

api/api_storage.go

+	// It returns a stream of events to report progress.
+	DagstoreInitializeAll(ctx context.Context, params DagstoreInitializeAllParams) (<-chan DagstoreInitializeAllEvent, error) //perm:write
+
+	// DagstoreGC runs garbage collection on the DAG store.


Some basic description (or a docs link) on what this operation does would be great.

magik6k · 2021-08-20T12:17:35Z

cmd/lotus-miner/dagstore.go

+		colors := map[string]color.Attribute{
+			"ShardStateAvailable": color.FgGreen,
+			"ShardStateServing":   color.FgBlue,
+			"ShardStateErrored":   color.FgRed,
+			"ShardStateNew":       color.FgYellow,
+		}


Would drop the ShardState prefix, since it provides no information to the user, and makes it slightly harder to parse visually
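
For illustration, stripping the prefix at display time could look like the sketch below. The `displayState` helper is hypothetical; the state names are the ones from the map above.

```go
package main

import (
	"fmt"
	"strings"
)

// displayState strips the "ShardState" prefix for display purposes.
// Strings without the prefix are returned unchanged.
func displayState(state string) string {
	return strings.TrimPrefix(state, "ShardState")
}

func main() {
	states := []string{
		"ShardStateAvailable",
		"ShardStateServing",
		"ShardStateErrored",
		"ShardStateNew",
	}
	for _, s := range states {
		fmt.Println(displayState(s))
	}
}
```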

magik6k · 2021-08-20T12:18:55Z

cmd/lotus-miner/dagstore.go

+
+var dagstoreGcCmd = &cli.Command{
+	Name:  "gc",
+	Usage: "Garbage collect the dagstore",


As with the API, a more detailed description would be really useful.

aarshkshah1992 marked this pull request as draft July 3, 2021 07:28

aarshkshah1992 changed the title ~~Replace Multistore with CARv2~~ [WIP] Replace Multistore with CARv2 Jul 3, 2021

aarshkshah1992 force-pushed the feat/replace-multistore-carv2 branch from 996feea to de13b6e Compare July 3, 2021 08:56

aarshkshah1992 changed the title ~~[WIP] Replace Multistore with CARv2~~ [WIP] Replace Badger from deal-making with CARv2 Jul 3, 2021

aarshkshah1992 changed the title ~~[WIP] Replace Badger from deal-making with CARv2~~ [WIP] Replace Badger from deal-making with CARv2 and Sharded DAG Store Jul 3, 2021

aarshkshah1992 changed the title ~~[WIP] Replace Badger from deal-making with CARv2 and Sharded DAG Store~~ Replace Badger from deal-making with CARv2 and Sharded DAG Store Jul 3, 2021

aarshkshah1992 requested review from raulk, dirkmc and nonsense July 3, 2021 09:39

aarshkshah1992 changed the title ~~Replace Badger from deal-making with CARv2 and Sharded DAG Store~~ Replace Badger from deal-making with CARv2 and the Sharded DAG Store Jul 3, 2021

aarshkshah1992 mentioned this pull request Jul 4, 2021

[Meta Issue] Replace Badger in Deal Making with CARv2 and the Sharded DAG Store and test with MRA #6673

Closed

20 tasks

dirkmc reviewed Jul 5, 2021

magik6k reviewed Jul 6, 2021

magik6k reviewed Jul 7, 2021

hannahhoward reviewed Jul 7, 2021

hannahhoward mentioned this pull request Jul 8, 2021

Refactor retrieval commands #6702

Closed

aarshkshah1992 commented Jul 11, 2021

aarshkshah1992 commented Jul 11, 2021

raulk reviewed Jul 11, 2021

aarshkshah1992 mentioned this pull request Jul 12, 2021

Finish integrating Sharded DAG Store and CARv2 blockstore resumption in Lotus and Markets #6735

Closed

5 tasks

raulk mentioned this pull request Jul 12, 2021

implement DAG store CLI commands #6736

Closed

Base automatically changed from nonsense/split-market-miner-processes to master July 13, 2021 15:16

raulk reviewed Jul 18, 2021

jennijuju added the M1-release label Jul 19, 2021

jennijuju modified the milestone: 1.11.1 Jul 19, 2021

dirkmc approved these changes Jul 22, 2021

jacobheun added the team/ignite and epic/sharded-dag-store labels Jul 22, 2021

raulk force-pushed the feat/replace-multistore-carv2 branch from f648a72 to 72391cc Compare July 29, 2021 20:36

raulk force-pushed the feat/replace-multistore-carv2 branch 4 times, most recently from 8c90489 to 892e146 Compare August 9, 2021 19:03

hannahhoward requested changes Aug 10, 2021

raulk mentioned this pull request Aug 10, 2021

efficient buffer-less ExtractV1File method ipld/go-car#207

Closed

jennijuju added this to the v1.11.2 milestone Aug 10, 2021

jennijuju added the P2 (Should be resolved) label Aug 10, 2021

hannahhoward approved these changes Aug 11, 2021

raulk reviewed Aug 13, 2021

raulk mentioned this pull request Aug 16, 2021

migrate to DAG store + CARv2 blockstores for storage and retrieval filecoin-project/go-fil-markets#576

Merged

raulk force-pushed the feat/replace-multistore-carv2 branch 2 times, most recently from e3f4136 to e4c4699 Compare August 16, 2021 21:56

raulk force-pushed the feat/replace-multistore-carv2 branch from e4c4699 to 9d9fd9d Compare August 16, 2021 22:20

raulk approved these changes Aug 16, 2021

raulk marked this pull request as ready for review August 16, 2021 22:30

raulk requested a review from a team as a code owner August 16, 2021 22:30

raulk approved these changes Aug 16, 2021

raulk merged commit d707677 into master Aug 16, 2021

raulk deleted the feat/replace-multistore-carv2 branch August 16, 2021 22:34

raulk mentioned this pull request Aug 17, 2021

Deal Making Tests for M1 (MRA + DAGStore integration) #6789

Closed

13 tasks

Stebalien reviewed Aug 18, 2021

magik6k reviewed Aug 20, 2021
