Configurable and lightweight backup utility with deduplication and encryption.
Replicat requires Python 3.8 (or newer) and runs on Linux, macOS, or Windows.
Supported backup destinations:
- local path
- Backblaze B2
- Amazon S3
- any S3-compatible service
You can implement and use your own adapter for pretty much any backup destination without changing the source code of Replicat.
Install Replicat with pip:
pip install replicat
For various reasons, I wasn't entirely happy with any of the similar projects that I've tried.
Highlights/goals of Replicat:
- efficient, concise, easily auditable implementation
- high customisability
- few external dependencies
- well-documented behaviour
- unified repository layout
- API that exists
This project borrows a few ideas from those other projects, but not enough to be considered a copycat.
You can use Replicat to back up files from your machine to a repository located on a supported backend, such as a local directory or cloud storage (like Backblaze B2). Files are transferred and stored in an optionally encrypted and chunked form, and references to chunks are stored in snapshots, along with file names and metadata.
To restore files from a snapshot, Replicat will download referenced chunks from the backend and use them to assemble the original files locally.
Replicat supports two types of repositories: encrypted (the default) and unencrypted. You may want to disable encryption if you trust your backend provider and network, for example. Duplicate chunks are reused between snapshots to save on bandwidth and storage costs.
See Cryptography for a more in-depth look into this, or Functional flow overview for the extremely cool and colorful diagrams that I worked really hard on.
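The chunk-and-reference idea described above can be illustrated with a small sketch. This is not Replicat's implementation: fixed-size chunking, SHA-256 digests as chunk references, and the in-memory `store` dictionary are all assumptions made for the example, and encryption is left out entirely.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


def take_snapshot(files: Dict[str, bytes], store: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Store unseen chunks and return a snapshot mapping file names to chunk references."""
    snapshot = {}
    for name, data in files.items():
        references = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already in the store is not stored again
            store.setdefault(digest, chunk)
            references.append(digest)
        snapshot[name] = references
    return snapshot


def restore(snapshot: Dict[str, List[str]], store: Dict[str, bytes]) -> Dict[str, bytes]:
    """Reassemble the original files from the chunks referenced by the snapshot."""
    return {
        name: b"".join(store[reference] for reference in references)
        for name, references in snapshot.items()
    }


store: Dict[str, bytes] = {}
files = {"a.txt": b"aaaabbbbcccc", "b.txt": b"bbbbccccdddd"}
snapshot = take_snapshot(files, store)
restored = restore(snapshot, store)
```

The two files contain six chunks in total, but two of them repeat, so the store only holds four.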
The installer will create the replicat command (a shortcut for python -m replicat).
There are several available subcommands:
- init - initialises the repository using the provided settings
- snapshot - creates a new snapshot in the repository
- list-snapshots/ls - lists snapshots
- list-files/lf - lists files across snapshots
- restore - restores files from snapshots
- add-key - creates a new key for the encrypted repository
- delete - deletes snapshots by their names
- clean - performs garbage collection
- upload-objects - uploads objects to the backend (a low-level command)
- download-objects - downloads objects from the backend (a low-level command)
- list-objects - lists objects at the backend (a low-level command)
- delete-objects - deletes objects from the backend (a low-level command)
⚠️ WARNING: do not upload to and delete from the repository at the same time using the same key or shared keys. For example, it's not safe to run snapshot and delete or clean concurrently, except when using independent keys.
There are several command line arguments that are common to all subcommands:
- -r/--repository - used to specify the type and location of the repository backend (backup destination). The format is <backend>:<connection string>, where <backend> is the short name of a Replicat-compatible backend and <connection string> is open to interpretation by the adapter for the selected backend. For example, b2:bucket-name for the B2 backend, or local:some/local/path for the local backend (or just some/local/path, since the <backend>: part can be omitted for local destinations). If the backend requires additional arguments, they will appear in the --help output
- -q/--hide-progress - suppresses progress indication for commands that support it
- -c/--concurrent - the number of concurrent connections to the backend
- --cache-directory - specifies the directory to use for cache. --no-cache disables cache completely
- -v/--verbose - increases the logging verbosity. The default verbosity is warning, -v means info, -vv means debug
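To make the <backend>:<connection string> format concrete, here is a minimal sketch of how such a value could be split. The function name `parse_repository` is hypothetical and this is not Replicat's own parser. In particular, the sketch treats everything before the first colon as the backend name, so it would misread a Windows path with a drive letter.

```python
from typing import Tuple


def parse_repository(value: str) -> Tuple[str, str]:
    """Split "<backend>:<connection string>" into its two parts.

    Values without a colon are treated as local destinations, mirroring the
    rule that the "<backend>:" part can be omitted for local paths.
    """
    backend, separator, connection = value.partition(":")
    if not separator:
        return "local", value
    return backend, connection


print(parse_repository("b2:bucket-name"))
print(parse_repository("local:some/local/path"))
print(parse_repository("some/local/path"))
```

The connection string is returned untouched, since its meaning is up to the adapter for the selected backend.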
Encrypted repositories additionally require a key for every operation:
- -K/--key-file - the path to the key file
If the repository is encrypted and the key is password-protected, a matching password is also required:
- -P/--password-file - path to the file with the password (preferred)
- -p/--password - the password in plaintext
If you often use many of these arguments, and their values mostly stay the same between invocations, you may find it easier to put them in a configuration file instead:
- --profile - load settings from this profile in the configuration file
- --config - path to the configuration file (check --help for the default config location)
- --ignore-config - ignore the configuration file
Names of configuration file options mostly match the long names of command line arguments (e.g., hide-progress = true matches --hide-progress, repository = "s3:bucket" matches -r s3:bucket), but you can always refer to the Configuration file section for the full reference.
The repository (-r, --repository) can also be provided as the REPLICAT_REPOSITORY environment variable.
The password can be provided as the REPLICAT_PASSWORD environment variable.
If the backend needs additional parameters (account name, client secret, some boolean flag, or literally anything else), you'll also be able to set them via command line arguments, as environment variables, or in the configuration file. Refer to the Backends section to learn more.
Note that values from the CLI always take precedence over options from the configuration file. Specifically, to build the final configuration, Replicat considers global defaults, the configuration file (either the default one or the one supplied via --config), environment variables, and command line arguments, in that order, with later sources overriding earlier ones.
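The precedence order can be sketched with a `ChainMap`. This only illustrates the layering idea, not how Replicat builds its configuration internally. The function name and all of the sample values (including the "global default" of 5 for concurrent) are made up for the example.

```python
from collections import ChainMap


def build_configuration(defaults: dict, config_file: dict, environment: dict, cli: dict) -> dict:
    """Merge the four sources so that later ones override earlier ones."""
    # ChainMap returns the first match, so the highest-priority source goes first
    return dict(ChainMap(cli, environment, config_file, defaults))


settings = build_configuration(
    defaults={"concurrent": 5, "hide-progress": False},  # hypothetical defaults
    config_file={"concurrent": 10, "repository": "s3:bucket"},
    environment={"repository": "local:some/local/path"},
    cli={"concurrent": 15},
)
print(settings)
```

Here concurrent comes from the command line, repository from the environment, and hide-progress falls through to the defaults.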
As mentioned in the Command line interface section, options that you can put in the configuration file mostly match CLI arguments, with a few exceptions.
Option name | Type | Supported values | Notes |
---|---|---|---|
repository | string | <backend>:<connection string> | |
concurrent | integer | Integers greater than 0 | |
hide-progress | boolean | true, false | |
cache-directory | path | Relative or absolute path | |
no-cache | boolean | true, false | |
password | string | Password as a string | Cannot be used together with password-file |
password-file | path | Relative or absolute path | Cannot be used together with password |
key | string | JSON as a string | Cannot be used together with key-file |
key-file | path | Relative or absolute path | Cannot be used together with key |
log-level | string | debug, info, warning, error, critical, fatal | CLI option -v increases logging verbosity starting from warning, while this option lets you set lower logging verbosity, such as error |
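The difference between -v and log-level can be sketched with Python's standard logging levels. Both function names are hypothetical, and clamping anything beyond -vv to debug is an assumption made for this sketch rather than documented Replicat behaviour.

```python
import logging

ALLOWED_LEVELS = {"debug", "info", "warning", "error", "critical", "fatal"}


def level_from_verbosity(count: int) -> int:
    """Map the number of -v flags to a logging level, starting from warning."""
    levels = [logging.WARNING, logging.INFO, logging.DEBUG]
    # Assumption: extra -v flags beyond -vv keep the level at debug
    return levels[min(count, len(levels) - 1)]


def level_from_name(name: str) -> int:
    """Translate a log-level option value into a logging constant."""
    if name not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported log level: {name!r}")
    return getattr(logging, name.upper())


print(level_from_verbosity(0), level_from_verbosity(1), level_from_verbosity(2))
print(level_from_name("error"))
```

With -v alone there is no way to go quieter than warning, which is why a value like error can only be expressed through the log-level option.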
If the backend requires additional parameters (account id, access key, numeric port, or literally anything else), Replicat lets you provide them via the configuration file. For example, if you see a backend-specific argument --some-backend-option in the --help output, the equivalent configuration file option will be called some-backend-option.
Here's an example configuration file (it uses TOML syntax):
concurrent = 10
# Relative paths work
cache-directory = "~/.cache/directory/for/replicat"
[debugging]
log-level = "info"
hide-progress = true
[my-local-repo]
repository = "some/local/path"
password = "<secret>"
key = """
{
"kdf": { ... },
"kdf_params": { "!b": "..." },
"private": { "!b": "..." }
}
"""
concurrent = 15
no-cache = true
[some-s3-repo]
repository = "s3:bucket-name"
key-id = "..."
access-key = "..."
region = "..."
Options that you specify at the top of the configuration file are defaults and they will be inherited by all of the profiles. In the example above there are three profiles (not including the default one):
- debugging
- my-local-repo
- some-s3-repo
You can tell Replicat which profile to use via the --profile CLI argument (e.g. --profile my-local-repo).
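Profile inheritance can be sketched as a simple dictionary merge. The sketch assumes the configuration file has already been parsed into a dictionary (for instance with `tomllib` on Python 3.11 or newer), and the `resolve_profile` helper is hypothetical, not part of Replicat.

```python
from typing import Any, Dict, Optional

# A parsed version of (part of) the example configuration file
parsed: Dict[str, Any] = {
    "concurrent": 10,
    "cache-directory": "~/.cache/directory/for/replicat",
    "debugging": {"log-level": "info", "hide-progress": True},
    "my-local-repo": {"repository": "some/local/path", "concurrent": 15, "no-cache": True},
}


def resolve_profile(config: Dict[str, Any], profile: Optional[str] = None) -> Dict[str, Any]:
    """Return top-level defaults, overridden by the options of the chosen profile."""
    # Top-level tables are profiles, everything else is a default
    defaults = {key: value for key, value in config.items() if not isinstance(value, dict)}
    if profile is None:
        return defaults
    if not isinstance(config.get(profile), dict):
        raise KeyError(f"unknown profile: {profile}")
    return {**defaults, **config[profile]}


print(resolve_profile(parsed, "my-local-repo"))
print(resolve_profile(parsed, "debugging"))
```

The my-local-repo profile overrides concurrent with its own value, while debugging inherits the top-level one.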
Notice that some-s3-repo includes options that were not listed in the table. key-id, access-key, and region are the backend-specific options for the S3 backend (see Backends).
Run replicat commands with -r <backend>:<connection string> and additional arguments that are specific to the selected backend. Those arguments may have defaults and may also be provided via environment variables or profiles. Use replicat <command> -r <backend>:<connection string> --help to see them.
For the local backend, the format is -r local:some/local/path, or simply -r some/local/path.
For the Backblaze B2 backend, the format is -r b2:bucket-id or -r b2:bucket-name. This backend uses the B2 native API and requires
- key ID (--key-id argument, or B2_KEY_ID environment variable, or key-id option in a profile)
- application key (--application-key argument, or B2_APPLICATION_KEY environment variable, or application-key option in a profile)
Sign in to your Backblaze B2 account to generate them. Note that you can use the master application key or a normal (non-master) application key that can be restricted to a single bucket. Refer to the official B2 docs for more information.
For the Amazon S3 backend, the format is -r s3:bucket-name. Requires
- AWS key ID (--key-id argument, or S3_KEY_ID environment variable, or key-id option in a profile)
- AWS access key (--access-key argument, or S3_ACCESS_KEY environment variable, or access-key option in a profile)
- region (--region argument, or S3_REGION environment variable, or region option in a profile)
For S3-compatible services, the format is -r s3c:bucket-name. Requires
- key ID (--key-id argument, or S3C_KEY_ID environment variable, or key-id option in a profile)
- access key (--access-key argument, or S3C_ACCESS_KEY environment variable, or access-key option in a profile)
- host (--host argument, or S3C_HOST environment variable, or host option in a profile)
- region (--region argument, or S3C_REGION environment variable, or region option in a profile)
Host must not include the scheme. The default scheme is https, but it can be changed via the --scheme argument (or, equivalently, the S3C_SCHEME environment variable or scheme option in a profile).
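The host and scheme rule can be sketched as follows. The `build_endpoint` helper and the host s3.example.com are hypothetical. How the adapter actually combines the two values is not described here, so treat this as an illustration of the rule rather than of the implementation.

```python
def build_endpoint(host: str, scheme: str = "https") -> str:
    """Combine a scheme-less host with a scheme, defaulting to https."""
    if "://" in host:
        raise ValueError("host must not include the scheme")
    return f"{scheme}://{host}"


print(build_endpoint("s3.example.com"))
print(build_endpoint("s3.example.com", scheme="http"))
```

Keeping the scheme separate means the same host value works whether or not the service is reached over TLS.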
You can use the S3-compatible backend to connect to B2, S3, and many other cloud storage providers that offer an S3-compatible API.
replicat.backends is a Python namespace package, making it possible to add custom backends without changing the replicat source code. See this guide for a complete walkthrough.
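If namespace packages are new to you, the following sketch shows the general mechanism using a throwaway package called `demo_backends` (a hypothetical name, unrelated to Replicat). Two separate directories each contribute one module to the same package, and neither contains an `__init__.py`. It says nothing about what a Replicat adapter must look like. The guide covers that.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_backends"  # hypothetical stand-in for a namespace package


def make_portion(root: str, module: str) -> None:
    """Create one portion of the namespace package (note: no __init__.py)."""
    package_dir = Path(root) / PACKAGE
    package_dir.mkdir()
    (package_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")


def discover() -> list:
    """Import the namespace package and list the modules found in all portions."""
    importlib.invalidate_caches()
    package = importlib.import_module(PACKAGE)
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__))


with tempfile.TemporaryDirectory() as shipped, tempfile.TemporaryDirectory() as custom:
    make_portion(shipped, "local")   # pretend this module ships with the tool
    make_portion(custom, "mycloud")  # pretend this one is installed separately
    sys.path[:0] = [shipped, custom]
    try:
        found = discover()
    finally:
        # Undo the changes to the import system made for this demonstration
        sys.path[:] = [entry for entry in sys.path if entry not in (shipped, custom)]
        sys.modules.pop(PACKAGE, None)

print(found)
```

Both modules show up under the same package even though they live in different directories, which is what lets a separately installed adapter sit next to the built-in ones.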
If you've created a Replicat-compatible adapter for a backend that Replicat doesn't already support and your implementation doesn't depend on additional third-party libraries (or at least they are not too heavy and can be moved to extras), consider submitting a PR to include it in this repository.
Replicat's default parameters and selection of cryptographic primitives should work well for most users, but they do allow for some customisation if you know what you are doing. Refer to Encryption > Settings for more information.
If you believe you've found a security issue with Replicat, please report it to [email protected] (or DM me on Twitter or Telegram).