Implement datastore #13

Closed
daviddias opened this issue Dec 11, 2015 · 23 comments
Labels
exp/expert Having worked on the specific codebase is important help wanted Seeking public contribution on this issue

Comments

@daviddias
Member

I almost missed blocks, as there is no reference to it in the Spec (https://github.com/ipfs/specs/blob/fix-repo/repo/README.md). My guess is that datastore is becoming blocks as the transition from levelDB to flatfs happens.

@whyrusleeping @jbenet can I get some details on this?

To implement the fanout factor I'm going to use a blob-store that builds on top of fs-blob-store and knows how to do fanout (so that in the browser we don't have to).
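
As a rough illustration of what "knowing how to do fanout" could mean, here is a minimal sketch. It assumes the fanout is a fixed-length key prefix used as a directory name, which may not match what go-ipfs does exactly. The function name `fanoutPath`, the default prefix length, and the `<prefix>/<key>.data` layout are assumptions made for this example.

```typescript
// Hypothetical helper: the name, the default prefix length and the
// "<prefix>/<key>.data" layout are assumptions made for illustration.
function fanoutPath(key: string, prefixLen: number = 2): string {
  if (prefixLen < 1) {
    throw new Error("prefixLen must be at least 1");
  }
  if (key.length < prefixLen) {
    throw new Error("key is shorter than the fanout prefix");
  }
  // The first prefixLen characters pick the sub-directory, so files are
  // spread over many small directories instead of one huge one.
  const dir = key.slice(0, prefixLen);
  return dir + "/" + key + ".data";
}
```

A blob-store wrapper could call such a helper before delegating to fs-blob-store, so callers only ever deal with plain keys.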

@daviddias daviddias added the help wanted Seeking public contribution on this issue label Dec 11, 2015
@jbenet
Member

jbenet commented Dec 12, 2015

@diasdavid blocks is just the name we give to a serialized ipfs object. Raw data. You don't need it in your code necessarily.

Maybe this helps:

  • a dag node is logically an object
  • IPLD is the format for dag nodes, which implies serialization
  • historically (from the ipfs paper) serialized objects were called blocks.
  • in go-ipfs we have a blocks package that layers on top of the repo and datastore. It handles making keys just the hash of an object, and putting/getting it from a datastore (blockstore wraps a datastore; think of a blockstore as a "specific to ipfs" abstraction on top of datastore)
  • func deserialize(block) object

There's a diagram somewhere, I'll look for it.
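
Until the diagram turns up, the layering in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (`Datastore`, `MemoryDatastore`, `Blockstore`, `HashFn`) is hypothetical, and the hash function is injected by the caller rather than being the hashing IPFS actually uses.

```typescript
// All names are hypothetical; this only illustrates "blockstore wraps a datastore".
interface Datastore {
  put(key: string, value: Uint8Array): void;
  get(key: string): Uint8Array | undefined;
}

// Simple in-memory datastore standing in for a real storage backend.
class MemoryDatastore implements Datastore {
  private readonly entries = new Map<string, Uint8Array>();

  put(key: string, value: Uint8Array): void {
    this.entries.set(key, value);
  }

  get(key: string): Uint8Array | undefined {
    return this.entries.get(key);
  }
}

// The hash is supplied by the caller (assumption: it returns a string key).
type HashFn = (bytes: Uint8Array) => string;

// The blockstore decides the key (the hash of the serialized object)
// and delegates the actual storage to the wrapped datastore.
class Blockstore {
  constructor(
    private readonly ds: Datastore,
    private readonly hash: HashFn
  ) {}

  put(block: Uint8Array): string {
    const key = this.hash(block);
    this.ds.put(key, block);
    return key;
  }

  get(key: string): Uint8Array | undefined {
    return this.ds.get(key);
  }
}
```

Deserializing a block back into an object would sit one level above this and is left out of the sketch.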

@jbenet
Member

jbenet commented Dec 12, 2015

(IPLD and multicodec are newer than blocks, so there's more overlap than we'd have if these abstractions were created from scratch together)

@jbenet
Member

jbenet commented Dec 12, 2015

@diasdavid does that make sense? lmk if not

@daviddias
Member Author

Trying to wrap my mind around what the lifecycle of a file would be when it is added to IPFS.

What data does datastore folder have then (currently and near future)? Blocks is making sense, but having a blocks folder and a datastore folder is not.

@jbenet
Member

jbenet commented Dec 13, 2015

> What data does datastore folder have then (currently and near future)? Blocks is making sense, but having a blocks folder and a datastore folder is not.

Conceptually, blocks/ is supposed to be inside datastore/. It's just that historically datastore/ was a leveldb instance, so blocks/ had to go outside. The fs-repo layout with leveldb in Go is more historical than correct. We can do a migration to make it better. Maybe we could:

  • move datastore/ to leveldb/
  • or, move datastore/ to datastore/leveldb and move blocks/ to datastore/blocks

@jbenet
Member

jbenet commented Dec 13, 2015

cc @whyrusleeping

@daviddias
Member Author

As discussed at the Sprint Meeting on Dec 14, blocks will be 'datastore' and the datastore backed by levelDB will be 'datastore-legacy'.

@daviddias daviddias changed the title IPFS Repo Feature - Implement blocks IPFS Repo Feature - Implement datastore Dec 15, 2015
@daviddias
Member Author

(probably going to ask the obvious) The objects stored on this datastore, are serialised versions of the protobufs, right?

Currently all the blocks have the .data extension, what is the migration plan for when we use IPLD + CBOR? Migrate and have .json objects, so we can both read from old repos and new repos at the same time?

Are we dropping protobufs soon enough that we can make the js impl only understand IPLD + CBOR?

@jbenet
Member

jbenet commented Dec 16, 2015

> (probably going to ask the obvious) The objects stored on this datastore, are serialised versions of the protobufs, right?

Yes

> Currently all the blocks have the .data extension, what is the migration plan for when we use IPLD + CBOR?

New objects are done with CBOR. Old objects are still readable. See https://github.com/ipfs/go-ipld/tree/master/coding/pb

> Migrate and have .json objects, so we can both read from old repos and new repos at the same time?

Turns out we will still need the protobuf implementation to read old objects. Yeah, annoying, but we'll get them on the wire from other people, and even if we get them in JSON, we'll need to be able to serialize with protobuf to hash and verify.

So we support protobuf in a "backwards compatible" kind of way. which is unfortunate, but not so bad with multicodec. read the discussions with @mildred over here:

> Are we dropping protobufs soon enough that we can make the js impl only understand IPLD + CBOR?

No, we'll have to make it understand old objects. Sorry.
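
The reason old objects keep the protobuf requirement alive (the hash is taken over the serialized bytes) can be sketched as below. This is only an illustration under stated assumptions: the codec names, `encodeWith`, and `verifyObject` are hypothetical, and the two encoders are toy stand-ins that merely produce different bytes, not real protobuf or CBOR.

```typescript
// Hypothetical registry. The encoders are toy stand-ins, NOT real
// protobuf or CBOR; they only need to produce different serializations.
type Encoder = (obj: Record<string, string>) => string;

const encoders: Record<string, Encoder> = {
  // Stand-in for the legacy (protobuf) serialization.
  "legacy-pb": (obj) =>
    Object.keys(obj).sort().map((k) => k + "=" + obj[k]).join("\n"),
  // Stand-in for the newer (CBOR) serialization.
  "new-cbor": (obj) =>
    JSON.stringify(Object.keys(obj).sort().map((k) => [k, obj[k]])),
};

function encodeWith(codec: string, obj: Record<string, string>): string {
  const encode = encoders[codec];
  if (encode === undefined) {
    throw new Error("unknown codec: " + codec);
  }
  return encode(obj);
}

// An object only verifies against its hash when it is re-serialized with
// the codec it was originally hashed with. The hash function is injected.
function verifyObject(
  obj: Record<string, string>,
  codec: string,
  expectedHash: string,
  hash: (serialized: string) => string
): boolean {
  return hash(encodeWith(codec, obj)) === expectedHash;
}
```

So an object received in another representation still has to be re-encoded with the legacy codec before its hash can be checked.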

On the bright side, mafintosh wrote a very nice protobuf parser: https://github.com/mafintosh/protocol-buffers


I think we should use this whenever "breaking links" or "breaking old data" is a possibility:

@jbenet
Member

jbenet commented Dec 16, 2015

that said, we could migrate the protobufs to .pb if useful. though the multicodec should give it

@daviddias
Member Author

With #20 we have an impl capable of reading/writing blocks into the datastore, following the same folder structure go-ipfs does.

@vijayee

vijayee commented Jan 14, 2016

So I started making an implementation of the go-ipfs merkledag for use in the browser as a data structure, and I am trying to get a sense of how the block store works here. Is it implemented in this project? I couldn't tell where this left off. I can implement it if it's not, because I need it to finish off the dagservice, but I need to know the assumptions. I have implemented the block, node, and link data structures and they are tested (knock on wood). I'll separate the block-related stuff into a separate project. If I am following this right, it has to implement both the datastore API and the abstract-blob-store?
https://github.com/vijayee/ipfs-merkle-dag.

@daviddias
Member Author

@vijayee 'block store' is now known as 'datastore', as it was in the beginning. With the change from levelDB to flatFS we ended up with both datastore and blocks, which was confusing. To make it clear that our intention is to move everything out of levelDB, we have:

datastore -> current blocks in go-ipfs
datastore.legacy -> current leftovers (DHT records for example) inside datastore in go-ipfs

As for implementing the MerkleDAG, we have currently started working on it here: https://github.com/ipfs/js-ipfs-data-importing, following the 'on going' Data Importing Spec here: ipfs/specs#57

@daviddias daviddias removed the help wanted Seeking public contribution on this issue label Jan 21, 2016
@daviddias
Member Author

It seems the blockstore name that go-ipfs uses has grown on us. Let's make js-ipfs also call the place where blocks live the blockstore. (Note, we were calling it datastore, because that is its original name.) //cc @dignifiedquire

@daviddias daviddias mentioned this issue Aug 6, 2016
17 tasks
@dignifiedquire
Member

@diasdavid let's get a proper document up for all the features, names and storage locations we need in js-ipfs-repo, please.

@daviddias
Member Author

@dignifiedquire can it be part of the readme -- https://github.com/ipfs/js-ipfs-repo#background -- ? It really is following what go-ipfs does //cc @whyrusleeping

@dignifiedquire
Member

Sure, I just want to know what the end goal is when the work starts, rather than having to discover piece by piece what is still missing.

@jbenet
Member

jbenet commented Aug 7, 2016

datastore is the thing touching the fs: get/put raw byte buffers
blockstore is the thing (on top of datastore): get/put Block

@daviddias
Member Author

Correction:

datastore was the thing touching the fs through levelDB, where blocks and DHT records were stored. Today, only DHT records and the roothash of the pinset live there.

blockstore is the thing where blocks get stored into fs, through flatfs that offers get/put raw byte buffers

block service is the thing on top of blockstore that offers get/put Block semantics

@jbenet
Member

jbenet commented Aug 8, 2016

@diasdavid not in Go. In Go, (which this issue claims to want to match) it's as I described. "Datastore" is a library touching both. That "$repo/datastore" was also used as the leveldb dir and not the flatfs one is historical.

@daviddias
Member Author

got it, it is just a matter of having too many things with the same name :)

go-datastore is the equivalent of abstract-blob-store in JS (what @jbenet mentions), these are the modules (adapters) that we use to exchange the storage backends.

IPFS Repo divides itself into several 'stores', namely: keys (not used yet), config, blocks, datastore, logs (not used anymore) and locks; see more info here: https://github.com/ipfs/specs/tree/master/repo#repo-contents

Initially, go-ipfs stored its blocks inside datastore, which used a levelDB adapter. When we stopped doing that, the migration was performed by creating a new folder called 'blocks'. So now we have:

» tree -L 1 ~/.ipfs
/Users/ground-control/.ipfs
├── blocks # where blocks are stored, straight fs
├── config
├── datastore # levelDB, where DHT records and pinset roothash are stored
└── version

What we are missing in JS is the interface that enables access to the datastore folder, that is, the levelDB that contains the DHT records and the pinset roothash.

Currently, that 'datastore' folder is referenced in JS code as 'datastore-legacy' (see https://github.com/ipfs/js-ipfs-repo/tree/master/src/stores), and the blocks folder is referenced as 'datastore'. However, since we never phased out the datastore folder / levelDB, it now makes sense to match the names: blockstore or blocks for the blocks folder, and datastore for the levelDB one :)

@Kubuxu
Member

Kubuxu commented Aug 9, 2016

In go-ipfs we use mount datastore to mount different types of datastores into different parts of datastore path, right now it is:

/ -> levelDB(".ipfs/datastore")
/blocks -> flatFS(".ipfs/blocks")

This passes Gets and Sets with keys starting with /blocks/ (example /blocks/CIQA2W4TC4TRMQRHYJYDCBPLAXG4DCO2MOKHIRLYKWDKEO2NFZVYGVQ) to the flatFS datastore implementation, and the rest go to the levelDB implementation.

Blockstore then wraps datastore, uses /blocks/ prefix and provides access to only blocks, using datastore.

I don't know if you want to go with this path in JS, but this is how it is done in Go.
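
For anyone weighing this for JS, the prefix routing described above could look roughly like the sketch below. It is not the go-ipfs code: all names (`KeyValueStore`, `InMemoryStore`, `MountDatastore`, `PrefixedBlockstore`) are hypothetical, values are plain strings for brevity, and stripping the mount prefix before forwarding the key is an assumption of this sketch.

```typescript
// Hypothetical names throughout; this only mirrors the routing idea.
interface KeyValueStore {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

// In-memory stand-in for a real backend such as levelDB or flatFS.
class InMemoryStore implements KeyValueStore {
  private readonly entries = new Map<string, string>();
  put(key: string, value: string): void {
    this.entries.set(key, value);
  }
  get(key: string): string | undefined {
    return this.entries.get(key);
  }
}

interface Mount {
  prefix: string;
  store: KeyValueStore;
}

// Routes each key to the store mounted at the longest matching prefix.
class MountDatastore implements KeyValueStore {
  private readonly mounts: Mount[];

  constructor(mounts: Mount[]) {
    // Longest prefix first, so "/blocks" is tried before the catch-all "/".
    this.mounts = mounts.slice().sort((a, b) => b.prefix.length - a.prefix.length);
  }

  private resolve(key: string): { store: KeyValueStore; rest: string } {
    for (const m of this.mounts) {
      if (m.prefix === "/") {
        return { store: m.store, rest: key };
      }
      if (key === m.prefix || key.indexOf(m.prefix + "/") === 0) {
        // Assumption: the mount prefix is stripped before forwarding.
        return { store: m.store, rest: key.slice(m.prefix.length) };
      }
    }
    throw new Error("no mount for key " + key);
  }

  put(key: string, value: string): void {
    const target = this.resolve(key);
    target.store.put(target.rest, value);
  }

  get(key: string): string | undefined {
    const target = this.resolve(key);
    return target.store.get(target.rest);
  }
}

// The blockstore only ever talks to keys under "/blocks/".
class PrefixedBlockstore {
  constructor(private readonly ds: KeyValueStore) {}
  put(hash: string, block: string): void {
    this.ds.put("/blocks/" + hash, block);
  }
  get(hash: string): string | undefined {
    return this.ds.get("/blocks/" + hash);
  }
}
```

With `/` mounted on one store and `/blocks` on another, a block written through `PrefixedBlockstore` lands in the second store, while any other key (a DHT record, for example) lands in the first.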

@jbenet
Member

jbenet commented Aug 13, 2016

One thing to note is that we ARE turning the entire state of an IPFS node repo into an IPLD graph.

@daviddias daviddias changed the title IPFS Repo Feature - Implement datastore Implement datastore Sep 8, 2016
@daviddias daviddias added the help wanted Seeking public contribution on this issue label Sep 8, 2016
@daviddias daviddias added status/ready Ready to be worked exp/expert Having worked on the specific codebase is important and removed js-ipfs-ready labels Dec 5, 2016
@daviddias daviddias added status/deferred Conscious decision to pause or backlog and removed status/ready Ready to be worked labels Jan 29, 2017
@daviddias daviddias removed the status/deferred Conscious decision to pause or backlog label Mar 20, 2017

5 participants