Disclaimer: I recommend the klauspost/reedsolomon erasure coding library over this one as it is more performant and has better support for multiple architectures.
Go bindings for erasure coding (Reed-Solomon coding).
Erasure coding is similar to RAID-based parity encoding, but is more generalized and powerful. When defining an erasure code, you specify two variables, k and m: m is the number of shards you wish to encode and k is the number of shards it takes to recreate your original data. Hence k must be no greater than m, and is usually strictly less (as k equal to m would be a pointless encoding). The real magic with erasure coding is the fact that ANY k of the m shards can recreate the original data. For example, an erasure coding scheme of k=8 and m=12 means any four of the encoded shards can be lost while the original data can still be reconstructed from the remaining eight valid shards.
This library is aimed at simplicity and performance. It has only three methods, including a constructor, all of which are thread-safe. Internally it uses Cgo to call into a complex C library. For a more in-depth look, be sure to check out the Intel® Storage Acceleration Library and especially its corresponding video. One feature this library does add is an optimization for decoding. Since there are m choose k possible inverse matrices for decoding, the library caches them (via lazy loading) to reduce the time spent decoding. It does so with a trie, where the sorted error list of shards is the key and the corresponding decode matrix is the value.
I hope you find it useful and pull requests are welcome!
See the GoDoc for an API reference
package main

import (
	"bytes"
	"log"
	"math/rand"

	"github.com/somethingnew2-0/go-erasure"
)

// corrupt returns a copy of source with every shard listed in errList zeroed out.
func corrupt(source, errList []byte, shardLength int) []byte {
	corrupted := make([]byte, len(source))
	copy(corrupted, source)
	for _, err := range errList {
		for i := 0; i < shardLength; i++ {
			corrupted[int(err)*shardLength+i] = 0x00
		}
	}
	return corrupted
}

func main() {
	m := 12
	k := 8
	shardLength := 16       // Length of a shard
	size := k * shardLength // Length of the data blob to encode

	code := erasure.NewCode(m, k, size)

	// Fill the source blob with random bytes.
	source := make([]byte, size)
	for i := range source {
		source[i] = byte(rand.Intn(256))
	}

	encoded := code.Encode(source)

	// Zero out four shards, then decode from the remaining eight.
	errList := []byte{0, 2, 3, 4}
	corrupted := corrupt(append(source, encoded...), errList, shardLength)

	recovered := code.Decode(corrupted, errList, true)

	if !bytes.Equal(source, recovered) {
		log.Fatal("Source was not successfully recovered with 4 errors")
	}
}
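Before calling Decode, it can help to sanity-check the error list against the code parameters. The helper below is a hypothetical sketch and not part of go-erasure. It assumes shard indices run from 0 to m-1 over the concatenation of source and encoded shards, as in the example above, and that at most m-k shards may be lost.

```go
package main

import "fmt"

// validateErrList reports whether errList is a plausible list of lost
// shards for an (m, k) code: no more than m-k entries, every index in
// [0, m), and no index listed twice. Hypothetical helper for illustration.
func validateErrList(errList []byte, m, k int) error {
	if len(errList) > m-k {
		return fmt.Errorf("%d shards lost, but at most %d can be recovered", len(errList), m-k)
	}
	seen := make(map[byte]bool, len(errList))
	for _, idx := range errList {
		if int(idx) >= m {
			return fmt.Errorf("shard index %d out of range [0, %d)", idx, m)
		}
		if seen[idx] {
			return fmt.Errorf("shard index %d listed twice", idx)
		}
		seen[idx] = true
	}
	return nil
}

func main() {
	// Four lost shards is within the limit for m=12, k=8.
	fmt.Println(validateErrList([]byte{0, 2, 3, 4}, 12, 8))
	// Five lost shards exceeds m-k and is rejected.
	fmt.Println(validateErrList([]byte{0, 1, 2, 3, 4}, 12, 8))
}
```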
To start, run source dev.sh (or more simply, . dev.sh) to set up the git hooks and GOPATH for this project.
Run go test to execute the unit tests, or go test -bench . to execute the benchmarks as well.