A write-ahead logging (WAL) implementation in Go.
THIS SOFTWARE IS STILL IN ALPHA AND THERE ARE NO GUARANTEES REGARDING API STABILITY YET.
Package wal implements an efficient write-ahead log for Go applications.
The main goal of a write-ahead log (WAL) is to make an application more durable, so that it does not lose data in case of a crash. WALs are used in applications such as database systems to flush all written data to disk before the changes are written to the database. In case of a crash, the WAL enables the application to recover lost in-memory changes by reconstructing all required operations from the log.
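To illustrate the recovery idea in isolation, here is a minimal sketch that does not use this package at all. The `op`, `store` and `replay` names are hypothetical, and a plain slice stands in for the durable log file. Every change is recorded in the log before it is applied, so the state can be rebuilt by replaying the log in order.

```go
package main

import "fmt"

// op is a hypothetical logged operation: "set Key to Value".
type op struct {
	Key   string
	Value int
}

// store keeps volatile state in data and records every change in log first.
// The log slice stands in for a durable file on disk.
type store struct {
	log  []op
	data map[string]int
}

// Set appends the operation to the log before applying it in memory.
func (s *store) Set(key string, value int) {
	s.log = append(s.log, op{Key: key, Value: value})
	s.data[key] = value
}

// replay rebuilds the in-memory state by applying all logged operations in order.
func replay(log []op) map[string]int {
	data := make(map[string]int)
	for _, o := range log {
		data[o.Key] = o.Value
	}
	return data
}

func main() {
	s := &store{data: make(map[string]int)}
	s.Set("a", 1)
	s.Set("b", 2)
	s.Set("a", 3)

	// Simulate a crash: the in-memory map is gone, only the log survives.
	recovered := replay(s.log)
	fmt.Println(recovered["a"], recovered["b"]) // prints "3 2"
}
```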
The code below is a copy of example_test.go. It shows the general usage of this library together with some explanation.
package wal_test
import (
"fmt"
"os"
"github.com/fgrosse/wal"
"github.com/fgrosse/wal/waltest"
"go.uber.org/zap"
)
// walEntries is an unexported package level variable that is used to register
// your own wal.Entry implementations. Such an Entry contains the logic of how
// to encode and decode your custom data in the WAL. Each wal.Entry is also
// associated with a unique wal.EntryType so we are able to map the binary
// representation back to your original Go type.
//
// In the example below we use two example implementations which are only
// available in unit tests. You might want to look into their implementation
// (see github.com/fgrosse/wal/waltest) to understand how you can efficiently
// implement your own encoding and decoding logic.
var walEntries = wal.NewEntryRegistry(
func() wal.Entry { return new(waltest.ExampleEntry1) },
func() wal.Entry { return new(waltest.ExampleEntry2) },
)
func Example() {
// The WAL will persist all written entries onto disk in an efficient
// append-only log file. Entries are split over multiple WAL segment files.
// To create a new WAL, you have to provide a path to the directory where
// the segment files will be stored.
path, err := os.MkdirTemp("", "WALExample")
check(err)
// There are a few runtime options for the WAL which have an impact on its
// performance and durability guarantees. By default, the WAL prefers strong
// durability and will fsync each write to disk immediately. Under high
// throughput, such a configuration can make the WAL a bottleneck of your
// application. Therefore, it might make sense to configure a SyncDelay to
// let the WAL automatically batch up fsyncs for multiple writes.
conf := wal.DefaultConfiguration()
// This library uses go.uber.org/zap for efficient structured logging.
logger, err := zap.NewProduction()
check(err)
// When you create a new WAL instance, it will immediately try to load any
// existing WAL segments from the path you provided. The `walEntries` parameter
// that is passed to wal.New(…) is an EntryRegistry which lets the WAL know
// about your own Entry implementation. This way, you can specify your own types
// and encoding/decoding logic but the WAL is still able to load entries from
// the last segment.
w, err := wal.New(path, conf, walEntries, logger)
check(err)
// Now you can finally write your first WAL entry. When this function
// returns without an error you can be sure that it was fully written to disk.
offset, err := w.Write(&waltest.ExampleEntry1{
ID: 42,
Point: []float32{1, 2, 3},
})
check(err)
// You might use the offset in your application or ignore it altogether.
fmt.Print(offset)
// Finally, you need to close the WAL to release any resources and close the
// open segment file.
err = w.Close()
check(err)
}
// check is a simple helper function to check errors in Example().
// In a real application, you should implement proper error handling.
func check(err error) {
if err != nil {
panic(err)
}
}
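The comments in the example mention that a SyncDelay lets the WAL batch up fsyncs for multiple writes. The following is a minimal sketch of that general idea (often called group commit). It is not the implementation used by this package: `groupSyncer`, `syncConcurrently` and `demoDelay` are hypothetical names, and the sync itself is abstracted as a function so the sketch runs without touching the disk.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoDelay is an arbitrary batching window chosen for this sketch.
const demoDelay = 20 * time.Millisecond

// groupSyncer is a hypothetical helper, not part of this package. All Sync
// calls that arrive within delay of the first pending call share one syncFn call.
type groupSyncer struct {
	mu      sync.Mutex
	delay   time.Duration
	syncFn  func() error // in a real WAL this would be (*os.File).Sync
	waiters []chan error
}

// Sync blocks until a sync that started after this call has finished.
func (g *groupSyncer) Sync() error {
	done := make(chan error, 1)

	g.mu.Lock()
	g.waiters = append(g.waiters, done)
	if len(g.waiters) == 1 {
		// The first waiter of a new batch schedules the shared sync.
		time.AfterFunc(g.delay, g.flush)
	}
	g.mu.Unlock()

	return <-done
}

// flush performs one sync and reports its result to every waiter of the batch.
func (g *groupSyncer) flush() {
	g.mu.Lock()
	batch := g.waiters
	g.waiters = nil
	g.mu.Unlock()

	err := g.syncFn()
	for _, done := range batch {
		done <- err
	}
}

// syncConcurrently issues n concurrent Sync calls and returns how many
// underlying syncs were needed to serve all of them.
func syncConcurrently(n int, delay time.Duration) (int, error) {
	var (
		mu    sync.Mutex
		syncs int
	)
	g := &groupSyncer{delay: delay, syncFn: func() error {
		mu.Lock()
		defer mu.Unlock()
		syncs++
		return nil
	}}

	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = g.Sync()
		}(i)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return syncs, err
		}
	}
	return syncs, nil
}

func main() {
	syncs, err := syncConcurrently(100, demoDelay)
	if err != nil {
		panic(err)
	}
	fmt.Printf("100 writes were served by %d sync call(s)\n", syncs)
}
```

The trade-off is latency for throughput: each caller waits up to the delay, but many concurrent writers pay for only one sync between them.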
Your custom entries must implement the wal.Entry interface:
// Entry is a single record of the Write Ahead Log.
// It is up to the application that uses the WAL to provide at least one concrete
// Entry implementation to the WAL via the EntryRegistry.
type Entry interface {
Type() EntryType
// EncodePayload encodes the payload into the provided buffer. In case the
// buffer is too small to fit the entire payload, this function can grow the
// old slice and return a new one. Otherwise, the old slice must be returned.
EncodePayload([]byte) []byte
// ReadPayload reads the payload from the reader but does not yet decode it.
// Reading and decoding are separate steps for performance reasons. Sometimes
// we might want to quickly seek through the WAL without having to decode
// every entry.
ReadPayload(r io.Reader) ([]byte, error)
// DecodePayload decodes an entry from a payload that has previously been read
// by ReadPayload(…).
DecodePayload([]byte) error
}
// EntryType is used to distinguish different types of messages that we write
// to the WAL.
type EntryType uint8
You can find an example implementation at entry_test.go.
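As a complement, here is a minimal sketch of what an implementation with a fixed-size payload might look like. `counterEntry`, its fields, its type number and the `roundTrip` helper are hypothetical. The `Entry` and `EntryType` declarations are copied from the interface shown above so the sketch compiles on its own; real code would use `wal.Entry` and `wal.EntryType`. Whether the WAL expects the returned slice to be cut to exactly the payload length is an assumption here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// EntryType and Entry mirror the declarations shown above so that this
// sketch is self-contained. Real code would import them from package wal.
type EntryType uint8

type Entry interface {
	Type() EntryType
	EncodePayload([]byte) []byte
	ReadPayload(r io.Reader) ([]byte, error)
	DecodePayload([]byte) error
}

// counterEntry is a hypothetical entry with a fixed-size payload:
// ID (4 bytes) followed by Delta (8 bytes), both big endian.
type counterEntry struct {
	ID    uint32
	Delta int64
}

const counterEntrySize = 4 + 8

// Compile-time check that counterEntry satisfies the interface.
var _ Entry = (*counterEntry)(nil)

// Type returns an arbitrary type number chosen for this sketch.
func (*counterEntry) Type() EntryType { return 1 }

// EncodePayload reuses b if it has enough capacity and allocates otherwise.
func (e *counterEntry) EncodePayload(b []byte) []byte {
	if cap(b) < counterEntrySize {
		b = make([]byte, counterEntrySize)
	}
	b = b[:counterEntrySize]
	binary.BigEndian.PutUint32(b[0:4], e.ID)
	binary.BigEndian.PutUint64(b[4:12], uint64(e.Delta))
	return b
}

// ReadPayload only fetches the raw bytes; decoding is a separate step.
func (*counterEntry) ReadPayload(r io.Reader) ([]byte, error) {
	b := make([]byte, counterEntrySize)
	if _, err := io.ReadFull(r, b); err != nil {
		return nil, err
	}
	return b, nil
}

// DecodePayload fills the entry from bytes previously returned by ReadPayload.
func (e *counterEntry) DecodePayload(b []byte) error {
	if len(b) != counterEntrySize {
		return fmt.Errorf("counterEntry: payload is %d bytes, want %d", len(b), counterEntrySize)
	}
	e.ID = binary.BigEndian.Uint32(b[0:4])
	e.Delta = int64(binary.BigEndian.Uint64(b[4:12]))
	return nil
}

// roundTrip encodes in, reads the payload back from a reader and decodes it.
func roundTrip(in *counterEntry) (*counterEntry, error) {
	payload := in.EncodePayload(nil)
	raw, err := in.ReadPayload(bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	out := new(counterEntry)
	if err := out.DecodePayload(raw); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(&counterEntry{ID: 7, Delta: -3})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *out) // prints {ID:7 Delta:-3}
}
```

A fixed-size payload keeps ReadPayload trivial. An entry with variable-length data would need to encode its own length so that ReadPayload knows how many bytes to consume.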
Each WAL.Write(…) call creates a binary encoding of the passed wal.Entry, which we call the entry's payload. This payload is written to disk together with some metadata such as the entry type, a CRC checksum and an offset number.
The full binary layout looks like the following:
// Every Entry is written using the following binary layout (big endian format):
//
// ┌─────────────┬───────────┬──────────┬─────────┐
// │ Offset (4B) │ Type (1B) │ CRC (4B) │ Payload │
// └─────────────┴───────────┴──────────┴─────────┘
//
// - Offset = 32-bit WAL entry number for each record in order to implement a low-water mark
// - Type = Type of WAL entry
// - CRC = 32-bit CRC checksum computed over the payload
// - Payload = The actual WAL entry payload data
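The layout above can be sketched with the standard library as follows. This is an illustration, not the package's own encoder: `encodeRecord` and `decodeRecord` are hypothetical names, and the CRC polynomial (IEEE) is an assumption because the layout description does not name one. Since the header carries no length field, the decoder assumes the buffer holds exactly one record.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// headerSize is the size of Offset (4B) + Type (1B) + CRC (4B).
const headerSize = 4 + 1 + 4

var errChecksum = errors.New("payload checksum mismatch")

// encodeRecord lays out one record as Offset | Type | CRC | Payload in big endian.
// The IEEE polynomial is an assumption made for this sketch.
func encodeRecord(offset uint32, typ uint8, payload []byte) []byte {
	buf := make([]byte, headerSize+len(payload))
	binary.BigEndian.PutUint32(buf[0:4], offset)
	buf[4] = typ
	binary.BigEndian.PutUint32(buf[5:9], crc32.ChecksumIEEE(payload))
	copy(buf[headerSize:], payload)
	return buf
}

// decodeRecord parses a buffer that holds exactly one record and verifies
// the checksum of its payload.
func decodeRecord(buf []byte) (offset uint32, typ uint8, payload []byte, err error) {
	if len(buf) < headerSize {
		return 0, 0, nil, errors.New("record is shorter than its header")
	}
	offset = binary.BigEndian.Uint32(buf[0:4])
	typ = buf[4]
	sum := binary.BigEndian.Uint32(buf[5:9])
	payload = buf[headerSize:]
	if crc32.ChecksumIEEE(payload) != sum {
		return 0, 0, nil, errChecksum
	}
	return offset, typ, payload, nil
}

func main() {
	rec := encodeRecord(1, 2, []byte("hello"))
	fmt.Println(len(rec)) // prints 14: 9 header bytes plus 5 payload bytes

	offset, typ, payload, err := decodeRecord(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println(offset, typ, string(payload)) // prints "1 2 hello"
}
```

The checksum lets a reader detect a torn or corrupted record, for example one that was only partially written before a crash.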
This data is appended to a file, and the WAL makes sure that it is actually written to non-volatile storage (fsynced) rather than just being stored in a memory-based write cache that would be lost if power failed.
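In Go, the standard library exposes this through `(*os.File).Sync`. The following sketch shows the pattern in isolation; `writeDurably` and `demoAppend` are hypothetical helpers and the file name is made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeDurably appends data to f and returns only after the operating system
// reports that the data has been flushed to the storage device.
func writeDurably(f *os.File, data []byte) error {
	if _, err := f.Write(data); err != nil {
		return err
	}
	// Sync maps to fsync(2) on POSIX systems. Without it, the data may only
	// live in the OS page cache when Write returns.
	return f.Sync()
}

// demoAppend writes two records to a temporary file and returns its content.
func demoAppend() (string, error) {
	dir, err := os.MkdirTemp("", "fsync-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "records.log")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return "", err
	}
	for _, rec := range []string{"first\n", "second\n"} {
		if err := writeDurably(f, []byte(rec)); err != nil {
			f.Close()
			return "", err
		}
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	content, err := os.ReadFile(path)
	return string(content), err
}

func main() {
	content, err := demoAppend()
	if err != nil {
		panic(err)
	}
	fmt.Print(content)
}
```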
When the WAL file reaches a configurable maximum size, it is closed and the WAL starts to append its records to a new and empty file. These files are called WAL segments. Typically, the WAL is split into multiple segments to enable other processes to take care of cleaning old segments, implement WAL segment backups and more. When the WAL is started, it will resume operation at the end of the last open segment file.
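Segment rotation can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's implementation: `segmentWriter` and `demoSegments` are hypothetical, the file name pattern is invented, and the sketch neither syncs nor resumes from an existing segment on startup.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// segmentWriter is a hypothetical append-only writer that starts a new file
// once the next record would push the current one past maxSize bytes.
type segmentWriter struct {
	dir     string
	maxSize int64
	seq     int      // number of the current segment
	size    int64    // bytes written to the current segment
	file    *os.File // current segment, nil before the first write
}

// rotate closes the current segment (if any) and opens the next one.
func (w *segmentWriter) rotate() error {
	if w.file != nil {
		if err := w.file.Close(); err != nil {
			return err
		}
	}
	w.seq++
	// The file name pattern is an assumption made for this sketch.
	name := filepath.Join(w.dir, fmt.Sprintf("segment-%05d.wal", w.seq))
	f, err := os.OpenFile(name, os.O_CREATE|os.O_EXCL|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	w.file, w.size = f, 0
	return nil
}

// Write appends one record, rotating first if it would not fit into the
// current non-empty segment.
func (w *segmentWriter) Write(record []byte) error {
	if w.file == nil || (w.size > 0 && w.size+int64(len(record)) > w.maxSize) {
		if err := w.rotate(); err != nil {
			return err
		}
	}
	n, err := w.file.Write(record)
	w.size += int64(n)
	return err
}

// Close closes the current segment file, if one is open.
func (w *segmentWriter) Close() error {
	if w.file == nil {
		return nil
	}
	return w.file.Close()
}

// demoSegments writes count records of recordSize bytes into a temporary
// directory and returns the names of the segment files that were created.
func demoSegments(maxSize int64, recordSize, count int) ([]string, error) {
	dir, err := os.MkdirTemp("", "segments-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	w := &segmentWriter{dir: dir, maxSize: maxSize}
	for i := 0; i < count; i++ {
		if err := w.Write(make([]byte, recordSize)); err != nil {
			w.Close()
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}

	entries, err := os.ReadDir(dir) // sorted by file name
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	// Five 4-byte records with a 10-byte limit: two records fit per segment.
	names, err := demoSegments(10, 4, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints [segment-00001.wal segment-00002.wal segment-00003.wal]
}
```

Zero-padded sequence numbers keep the lexical order of file names equal to the write order, which makes it easy for a cleanup or backup process to find the oldest segments.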
$ go get github.com/fgrosse/wal
Please read CONTRIBUTING.md for details on our code of conduct and on the process for submitting pull requests to this repository.
All significant (e.g. breaking) changes are documented in the CHANGELOG.md.
After the v1.0 release we plan to use SemVer for versioning. For the versions available, see the releases page.
- Friedrich Große - Initial work - fgrosse
See also the list of contributors who participated in this project.
This project is licensed under the BSD-3-Clause License - see the LICENSE file for details.