feat: pebbleds profile and plugin (#10530)
* include pebble as built-in plugin

Pebble provides a high-performance alternative to leveldb as the datastore, and will serve as a replacement for badger1.

A number of parameters are available for tuning pebble's performance to your specific needs. Default values are used for any that are not configured or are set to the parameter's zero-value.

Requires ipfs/go-ds-pebble#39

Closes #10347

* docs: remove mention of ipfs-ds-convert. Rationale: ipfs-inactive/ipfs-ds-convert#50
* docs: pebbleds profile
* test: meaningful t0025-datastores.sh
* Update config/init.go
* Update docs/config.md
* Do not hard-code zero values into pebble config
gammazero authored Oct 3, 2024
1 parent 1bc773f commit 52b0062
Showing 16 changed files with 476 additions and 28 deletions.
11 changes: 11 additions & 0 deletions config/init.go
@@ -138,6 +138,17 @@ func DefaultDatastoreConfig() Datastore {
	}
}

func pebbleSpec() map[string]interface{} {
	return map[string]interface{}{
		"type":   "measure",
		"prefix": "pebble.datastore",
		"child": map[string]interface{}{
			"type": "pebbleds",
			"path": "pebbleds",
		},
	}
}

func badgerSpec() map[string]interface{} {
	return map[string]interface{}{
		"type":   "measure",
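
To see what `pebbleSpec` contributes to the config file, the following minimal sketch copies the function from the hunk above and renders it as JSON. The `renderSpec` helper is a hypothetical name introduced only for this illustration; it is not part of the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pebbleSpec is copied from the config/init.go hunk: a "measure" wrapper
// (metrics) around a "pebbleds" child datastore.
func pebbleSpec() map[string]interface{} {
	return map[string]interface{}{
		"type":   "measure",
		"prefix": "pebble.datastore",
		"child": map[string]interface{}{
			"type": "pebbleds",
			"path": "pebbleds",
		},
	}
}

// renderSpec is a hypothetical helper that serializes the spec the way it
// would roughly appear under Datastore.Spec in the config file.
func renderSpec() string {
	out, err := json.MarshalIndent(pebbleSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(renderSpec())
}
```

Note that `encoding/json` sorts map keys, so the printed key order differs from the order in the Go literal; the nesting is the same.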
39 changes: 37 additions & 2 deletions config/profile.go
@@ -135,7 +135,11 @@ You should use this datastore if:
* You want to minimize memory usage.
* You are ok with the default speed of data import, or prefer to use --nocopy.
This profile may only be applied when first initializing the node.
See configuration documentation at:
https://github.com/ipfs/kubo/blob/master/docs/datastores.md#flatfs
NOTE: This profile may only be applied when first initializing node at IPFS_PATH
via 'ipfs init --profile flatfs'
`,

		InitOnly: true,
@@ -144,6 +148,32 @@ This profile may only be applied when first initializing the node.
			return nil
		},
	},
	"pebbleds": {
		Description: `Configures the node to use the pebble high-performance datastore.
Pebble is a LevelDB/RocksDB inspired key-value store focused on performance
and internal usage by CockroachDB.
You should use this datastore if:
- You need a datastore that is focused on performance.
- You need reliability by default, but may choose to disable WAL for maximum performance when reliability is not critical.
- This datastore is good for multi-terabyte data sets.
- May benefit from tuning depending on read/write patterns and throughput.
- Performance is helped significantly by running on a system with plenty of memory.
See configuration documentation at:
https://github.com/ipfs/kubo/blob/master/docs/datastores.md#pebbleds
NOTE: This profile may only be applied when first initializing node at IPFS_PATH
via 'ipfs init --profile pebbleds'
`,

		InitOnly: true,
		Transform: func(c *Config) error {
			c.Datastore.Spec = pebbleSpec()
			return nil
		},
	},
	"badgerds": {
		Description: `Configures the node to use the legacy badgerv1 datastore.
@@ -160,7 +190,12 @@ Other caveats:
* Good for medium-size datastores, but may run into performance issues
if your dataset is bigger than a terabyte.
This profile may only be applied when first initializing the node.`,
See configuration documentation at:
https://github.com/ipfs/kubo/blob/master/docs/datastores.md#badgerds
NOTE: This profile may only be applied when first initializing node at IPFS_PATH
via 'ipfs init --profile badgerds'
`,

		InitOnly: true,
		Transform: func(c *Config) error {
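
The `pebbleds` entry above pairs `InitOnly: true` with a `Transform` that swaps in the pebble spec. The sketch below shows one way such a pairing could be enforced. `Config`, `Datastore`, and `Profile` here are simplified stand-ins rather than Kubo's real types, and `applyProfile` and `errInitOnly` are hypothetical names. How Kubo itself enforces `InitOnly` is not shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// Datastore and Config are simplified stand-ins for Kubo's config types,
// reduced to the single field this example needs.
type Datastore struct {
	Spec map[string]interface{}
}

type Config struct {
	Datastore Datastore
}

// Profile keeps only the two fields visible in the diff above.
type Profile struct {
	InitOnly  bool
	Transform func(c *Config) error
}

// errInitOnly is a hypothetical error value for this sketch.
var errInitOnly = errors.New("profile may only be applied when initializing the node")

// applyProfile is a hypothetical helper: it rejects InitOnly profiles unless
// the node is being initialized, and otherwise runs the profile's Transform.
func applyProfile(c *Config, p Profile, initializing bool) error {
	if p.InitOnly && !initializing {
		return errInitOnly
	}
	return p.Transform(c)
}

// pebbledsProfile mirrors the shape of the "pebbleds" profile entry.
func pebbledsProfile() Profile {
	return Profile{
		InitOnly: true,
		Transform: func(c *Config) error {
			c.Datastore.Spec = map[string]interface{}{
				"type":   "measure",
				"prefix": "pebble.datastore",
				"child": map[string]interface{}{
					"type": "pebbleds",
					"path": "pebbleds",
				},
			}
			return nil
		},
	}
}

func main() {
	cfg := &Config{}
	// Applying to an already-initialized node is rejected.
	fmt.Println(applyProfile(cfg, pebbledsProfile(), false))
	// Applying during initialization replaces Datastore.Spec.
	if err := applyProfile(cfg, pebbledsProfile(), true); err != nil {
		panic(err)
	}
	fmt.Println(cfg.Datastore.Spec["type"])
}
```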
10 changes: 10 additions & 0 deletions docs/changelogs/v0.31.md
@@ -6,6 +6,7 @@

- [Overview](#overview)
- [🔦 Highlights](#-highlights)
- [Experimental Pebble Datastore](#experimental-pebble-datastore)
- [New metrics](#new-metrics)
- [`lowpower` profile no longer breaks DHT announcements](#lowpower-profile-no-longer-breaks-dht-announcements)
- [📝 Changelog](#-changelog)
@@ -15,6 +16,15 @@

### 🔦 Highlights

#### Experimental Pebble Datastore

[Pebble](https://github.com/ipfs/kubo/blob/master/docs/config.md#pebbleds-profile) is a high-performance alternative to leveldb as the datastore and a modern replacement for [legacy badgerv1](https://github.com/ipfs/kubo/blob/master/docs/config.md#badgerds-profile).

A fresh Kubo node can be initialized with [`pebbleds` profile](https://github.com/ipfs/kubo/blob/master/docs/config.md#pebbleds-profile) via `ipfs init --profile pebbleds`.

There are a number of parameters available for tuning pebble's performance to your specific needs. Default values are used for any parameters that are not configured or are set to their zero-value.
For a description of the available tuning parameters, see [kubo/docs/datastores.md#pebbleds](https://github.com/ipfs/kubo/blob/master/docs/datastores.md#pebbleds).
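
The zero-value rule described above can be pictured with a small sketch. `intOption` is a hypothetical helper written for this illustration, not the plugin's actual parsing code; the 8MB figure is the `cacheSize` default quoted in `docs/datastores.md`.

```go
package main

import "fmt"

// intOption is a hypothetical helper: it returns params[name] as an int, or
// def when the key is missing, zero, or not a number.
func intOption(params map[string]interface{}, name string, def int) int {
	v, ok := params[name]
	if !ok {
		return def
	}
	switch n := v.(type) {
	case int:
		if n != 0 {
			return n
		}
	case float64: // encoding/json decodes JSON numbers into float64
		if n != 0 {
			return int(n)
		}
	}
	return def
}

func main() {
	const defaultCacheSize = 8 << 20 // 8MB, the documented cacheSize default

	unset := map[string]interface{}{}
	zero := map[string]interface{}{"cacheSize": 0}
	tuned := map[string]interface{}{"cacheSize": float64(64 << 20)}

	fmt.Println(intOption(unset, "cacheSize", defaultCacheSize))
	fmt.Println(intOption(zero, "cacheSize", defaultCacheSize))
	fmt.Println(intOption(tuned, "cacheSize", defaultCacheSize))
}
```

One consequence of this rule is that a parameter cannot be set explicitly to zero: a zero in the config is indistinguishable from "not configured".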

#### New metrics

- Added 3 new go metrics: `go_gc_gogc_percent`, `go_gc_gomemlimit_bytes` and `go_sched_gomaxprocs_threads` as those are [recommended by the Go team](https://github.com/prometheus/client_golang/pull/1559)
45 changes: 34 additions & 11 deletions docs/config.md
@@ -181,6 +181,7 @@ config file at runtime.
- [`local-discovery` profile](#local-discovery-profile)
- [`default-networking` profile](#default-networking-profile)
- [`flatfs` profile](#flatfs-profile)
- [`pebbleds` profile](#pebbleds-profile)
- [`badgerds` profile](#badgerds-profile)
- [`lowpower` profile](#lowpower-profile)
- [`announce-off` profile](#announce-off-profile)
@@ -524,13 +525,8 @@ Spec defines the structure of the ipfs datastore. It is a composable structure,
where each datastore is represented by a json object. Datastores can wrap other
datastores to provide extra functionality (eg metrics, logging, or caching).

This can be changed manually, however, if you make any changes that require a
different on-disk structure, you will need to run the [ipfs-ds-convert
tool](https://github.com/ipfs/ipfs-ds-convert) to migrate data into the new
structures.

For more information on possible values for this configuration option, see
[docs/datastores.md](datastores.md)
> [!NOTE]
> For more information on possible values for this configuration option, see [`kubo/docs/datastores.md`](datastores.md)
Default:
```
@@ -2403,9 +2399,9 @@ Inverse profile of the test profile.

### `flatfs` profile

Configures the node to use the flatfs datastore. Flatfs is the default datastore.
Configures the node to use the flatfs datastore.
Flatfs is the default, most battle-tested and reliable datastore.

This is the most battle-tested and reliable datastore.
You should use this datastore if:

- You need a very simple and very reliable datastore, and you trust your
@@ -2416,7 +2412,30 @@ You should use this datastore if:
- You want to minimize memory usage.
- You are ok with the default speed of data import, or prefer to use `--nocopy`.

This profile may only be applied when first initializing the node.
> [!WARNING]
> This profile may only be applied when first initializing the node via `ipfs init --profile flatfs`

> [!NOTE]
> See caveats and configuration options at [`datastores.md#flatfs`](datastores.md#flatfs)

### `pebbleds` profile

Configures the node to use the pebble high-performance datastore.

Pebble is a LevelDB/RocksDB inspired key-value store focused on performance and internal usage by CockroachDB.
You should use this datastore if:

- You need a datastore that is focused on performance.
- You need reliability by default, but may choose to disable WAL for maximum performance when reliability is not critical.
- This datastore is good for multi-terabyte data sets.
- May benefit from tuning depending on read/write patterns and throughput.
- Performance is helped significantly by running on a system with plenty of memory.

> [!WARNING]
> This profile may only be applied when first initializing the node via `ipfs init --profile pebbleds`

> [!NOTE]
> See other caveats and configuration options at [`datastores.md#pebbleds`](datastores.md#pebbleds)

### `badgerds` profile

@@ -2437,7 +2456,11 @@ Also, be aware that:
- Good for medium-size datastores, but may run into performance issues if your dataset is bigger than a terabyte.
- The current implementation is based on old badger 1.x which is no longer supported by the upstream team.

This profile may only be applied when first initializing the node.
> [!WARNING]
> This profile may only be applied when first initializing the node via `ipfs init --profile badgerds`

> [!NOTE]
> See other caveats and configuration options at [`datastores.md#badgerds`](datastores.md#badgerds)

### `lowpower` profile

36 changes: 36 additions & 0 deletions docs/datastores.md
@@ -3,6 +3,13 @@
This document describes the different possible values for the `Datastore.Spec`
field in the ipfs configuration file.

- [flatfs](#flatfs)
- [levelds](#levelds)
- [pebbleds](#pebbleds)
- [badgerds](#badgerds)
- [mount](#mount)
- [measure](#measure)

## flatfs

Stores each key value pair as a file on the filesystem.
@@ -35,6 +42,35 @@ Uses a leveldb database to store key value pairs.
}
```

## pebbleds

Uses [pebble](https://github.com/cockroachdb/pebble) as a key value store.

```json
{
  "type": "pebbleds",
  "path": "<location of pebble inside repo>"
}
```

The following options are available for tuning pebble.
If they are not configured (or are assigned their zero value), then default values are used.

* `bytesPerSync`: int, Sync sstables periodically in order to smooth out writes to disk. (default: 512KB)
* `disableWAL`: true|false, Disable the write-ahead log (WAL) at the expense of crash recovery. (default: false)
* `cacheSize`: int, Size of pebble's shared block cache. (default: 8MB)
* `l0CompactionThreshold`: int, Count of L0 files necessary to trigger an L0 compaction.
* `l0StopWritesThreshold`: int, Limit on L0 read-amplification, computed as the number of L0 sublevels.
* `lBaseMaxBytes`: int, Maximum number of bytes for LBase. The base level is the level which L0 is compacted into.
* `maxConcurrentCompactions`: int, Maximum number of concurrent compactions. (default: 1)
* `memTableSize`: int, Size of a MemTable in steady state. The actual MemTable size starts at min(256KB, MemTableSize) and doubles for each subsequent MemTable up to MemTableSize. (default: 4MB)
* `memTableStopWritesThreshold`: int, Limit on the number of queued MemTables. (default: 2)
* `walBytesPerSync`: int, Number of bytes to write to a WAL before calling Sync on it in the background. (default: 0, no background syncing)
* `walMinSyncSeconds`: int, Minimum duration between syncs of the WAL. (default: 0)

> [!TIP]
> Start using pebble with only default values and configure tuning options only as needed. For a more complete description of these values, see: `https://pkg.go.dev/github.com/cockroachdb/pebble@vA.B.C#Options` (where `A.B.C` is the pebble version from Kubo's `go.mod`).
## badgerds

Uses [badger](https://github.com/dgraph-io/badger) as a key value store.
12 changes: 12 additions & 0 deletions docs/examples/kubo-as-a-library/go.mod
@@ -16,6 +16,7 @@ require (
require (
bazil.org/fuse v0.0.0-20200117225306-7b5117fecadc // indirect
github.com/AndreasBriese/bbloom v0.0.0-20190825152654-46b345b51c96 // indirect
github.com/DataDog/zstd v1.4.5 // indirect
github.com/Jorropo/jsync v1.0.1 // indirect
github.com/alecthomas/units v0.0.0-20240626203959-61d1e3462e30 // indirect
github.com/alexbrainman/goissue34681 v0.0.0-20191006012335-3fc7a47baff5 // indirect
@@ -25,6 +26,12 @@ require (
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/ceramicnetwork/go-dag-jose v0.1.0 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/cockroachdb/errors v1.11.3 // indirect
github.com/cockroachdb/fifo v0.0.0-20240606204812-0bbfbd93a7ce // indirect
github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b // indirect
github.com/cockroachdb/pebble v1.1.2 // indirect
github.com/cockroachdb/redact v1.1.5 // indirect
github.com/cockroachdb/tokenbucket v0.0.0-20230807174530-cc333fc44b06 // indirect
github.com/containerd/cgroups v1.1.0 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/crackcomm/go-gitignore v0.0.0-20231225121904-e25f5bc08668 // indirect
@@ -43,6 +50,7 @@ require (
github.com/francoispqt/gojay v1.2.13 // indirect
github.com/fsnotify/fsnotify v1.7.0 // indirect
github.com/gabriel-vasile/mimetype v1.4.4 // indirect
github.com/getsentry/sentry-go v0.27.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
@@ -74,6 +82,7 @@ require (
github.com/ipfs/go-ds-flatfs v0.5.1 // indirect
github.com/ipfs/go-ds-leveldb v0.5.0 // indirect
github.com/ipfs/go-ds-measure v0.2.0 // indirect
github.com/ipfs/go-ds-pebble v0.4.0 // indirect
github.com/ipfs/go-fs-lock v0.0.7 // indirect
github.com/ipfs/go-ipfs-blockstore v1.3.1 // indirect
github.com/ipfs/go-ipfs-delay v0.0.1 // indirect
@@ -103,6 +112,8 @@ require (
github.com/klauspost/compress v1.17.9 // indirect
github.com/klauspost/cpuid/v2 v2.2.8 // indirect
github.com/koron/go-ssdp v0.0.4 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/libp2p/go-buffer-pool v0.1.0 // indirect
github.com/libp2p/go-cidranger v1.1.0 // indirect
github.com/libp2p/go-doh-resolver v0.4.0 // indirect
@@ -172,6 +183,7 @@ require (
github.com/quic-go/quic-go v0.45.2 // indirect
github.com/quic-go/webtransport-go v0.8.0 // indirect
github.com/raulk/go-watchdog v1.3.0 // indirect
github.com/rogpeppe/go-internal v1.12.0 // indirect
github.com/samber/lo v1.46.0 // indirect
github.com/spaolacci/murmur3 v1.1.0 // indirect
github.com/stretchr/testify v1.9.0 // indirect