Merged
266 changes: 247 additions & 19 deletions docs/docs/users/guides/advanced/archival_snapshots.md
@@ -5,35 +5,263 @@ sidebar_position: 2

# Archival Snapshots

ChainSafe hosts two kinds of snapshots: hourly snapshots (guaranteed to be no
more than a few hours old) and archival snapshots (similar to the regular
snapshots, but with less duplicate data). Archival snapshots come as 'lite'
snapshots, which include the entire block header history back to genesis, and
'diff' snapshots, which only contain the new data since the previous diff
snapshot.
Forest supports building partial or full archival nodes using **lite** and
**diff** snapshots. This guide explains the snapshot types, how they relate to
each other, and how to use them for common workflows.

## Snapshot types

ChainSafe publishes two kinds of archival snapshots:

Archival snapshots are publicly available here:
| Type | Naming pattern | Frequency | Contents |
| -------- | ------------------------------------------------------------------- | ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
| **Lite** | `forest_snapshot_<network>_<date>_height_<EPOCH>.forest.car.zst` | Every 30,000 epochs (~10 days) | Complete state trees from `EPOCH - 900` to `EPOCH`, plus the full block header history back to genesis. |
| **Diff** | `forest_diff_<network>_<date>_height_<BASE>+<RANGE>.forest.car.zst` | Every 3,000 epochs (~1 day) | Only the new IPLD key-value pairs added between `BASE` and `BASE + RANGE`. Does **not** contain a complete state tree on its own. |
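
The diff naming pattern encodes both the base epoch and the range, so the covered epochs can be read straight from a file name. The following is a minimal sketch using plain shell parameter expansion; `parse_diff_name` is a hypothetical helper for illustration, not part of Forest, and it assumes the file name follows the pattern in the table above.

```shell
# Hypothetical helper: extract <BASE> and <RANGE> from a diff snapshot file name.
# Assumes the name follows forest_diff_<network>_<date>_height_<BASE>+<RANGE>.forest.car.zst
parse_diff_name() {
  name=$1
  range=${name##*_height_}         # strip everything up to and including "_height_"
  range=${range%.forest.car.zst}   # strip the extension, leaving "<BASE>+<RANGE>"
  base=${range%+*}                 # part before the "+"
  len=${range#*+}                  # part after the "+"
  echo "base=$base range=$len end=$(( base + len ))"
}

parse_diff_name "forest_diff_calibnet_2026-02-22_height_3480000+3000.forest.car.zst"
```

The `end` value is the epoch up to which this diff extends the state, provided its base lite snapshot and all earlier diffs are also present.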

Archival snapshots are publicly available at:

- Mainnet lite: https://forest-archive.chainsafe.dev/list/mainnet/lite
- Mainnet diff: https://forest-archive.chainsafe.dev/list/mainnet/diff
- Calibnet lite: https://forest-archive.chainsafe.dev/list/calibnet/lite
- Calibnet diff: https://forest-archive.chainsafe.dev/list/calibnet/diff

## Merging snapshots
## How lite and diff snapshots work together

:::warning

A diff snapshot is useless on its own. It **must** be combined with its
matching base lite snapshot (and all intermediate diffs) to form a complete
state tree. Using a diff without the correct base will lead to incomplete state
trees and validation errors such as:

```
failed to lookup actor f410f...
```

or

```
failed to read init actor address map
```

:::

### The golden rule

A **complete state tree** at epoch `E` requires:

1. The **lite snapshot** whose epoch is at or just before `E`, and
2. **All consecutive diff snapshots** that bridge from that lite epoch up to
`E`.

### Visual example

Consider calibnet snapshots with lite snapshots every 30,000 epochs and diffs
every 3,000 epochs:

```
Lite @3,480,000 ─┬─ Diff @3,480,000+3,000 ─── Diff @3,483,000+3,000 ─── ... ─── Diff @3,507,000+3,000
                 └─ provides complete state trees from @3,479,100 to @3,480,000
                    each diff extends the complete state trees forward by 3,000 epochs

Lite @3,510,000 ─┬─ Diff @3,510,000+3,000 ─── ...
                 └─ new base: complete state trees from @3,509,100 to @3,510,000
```

Since 'diff' snapshots only contain the new data since the previous diff
snapshot, they need to be merged with the previous 'lite' snapshot to form a
complete snapshot. This can be done with the `forest-tool archive merge`
command.
There are 10 diff snapshots between consecutive lite snapshots (30,000 / 3,000
= 10).

## Figuring out which snapshots you need

Given a target epoch `E` and a network, follow these steps:

1. **Find the base lite snapshot.** Take the largest lite epoch that is `≤ E`.
   Lite epochs are multiples of 30,000.

   ```
   base_epoch = floor(E / 30,000) × 30,000
   ```

2. **List the required diffs.** Starting from `base_epoch`, collect every diff
   snapshot until you reach or pass `E`:

   ```
   diff @base_epoch + 3,000
   diff @(base_epoch + 3,000) + 3,000
   ...
   diff @(last_epoch_before_E) + 3,000
   ```

### Worked example

To get complete state for **calibnet epoch 3,506,992**:

| # | Snapshot | Purpose |
| --- | ------------------------------------------------------------- | ---------------------------------------- |
| 1 | `forest_snapshot_calibnet_..._height_3480000.forest.car.zst` | Base lite (complete state @3,480,000) |
| 2 | `forest_diff_calibnet_..._height_3480000+3000.forest.car.zst` | State changes through epoch 3,483,000 |
| 3 | `forest_diff_calibnet_..._height_3483000+3000.forest.car.zst` | ... through 3,486,000 |
| 4 | `forest_diff_calibnet_..._height_3486000+3000.forest.car.zst` | ... through 3,489,000 |
| 5 | `forest_diff_calibnet_..._height_3489000+3000.forest.car.zst` | ... through 3,492,000 |
| 6 | `forest_diff_calibnet_..._height_3492000+3000.forest.car.zst` | ... through 3,495,000 |
| 7 | `forest_diff_calibnet_..._height_3495000+3000.forest.car.zst` | ... through 3,498,000 |
| 8 | `forest_diff_calibnet_..._height_3498000+3000.forest.car.zst` | ... through 3,501,000 |
| 9 | `forest_diff_calibnet_..._height_3501000+3000.forest.car.zst` | ... through 3,504,000 |
| 10 | `forest_diff_calibnet_..._height_3504000+3000.forest.car.zst` | ... through 3,507,000 (covers 3,506,992) |
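
The table above can be reproduced with a few lines of shell arithmetic. This is a sketch under the intervals stated in this guide (lite every 30,000 epochs, diff every 3,000); `required_snapshots` is a hypothetical helper, and it prints only the `height_...` part of each name because the date component cannot be derived from the epoch.

```shell
# Hypothetical helper: list the lite snapshot and the diffs needed for a target epoch.
# Assumes lite snapshots every 30,000 epochs and diff snapshots every 3,000 epochs.
required_snapshots() {
  target=$1
  lite_interval=30000
  diff_interval=3000
  # Integer division rounds down, giving the largest lite epoch <= target
  base=$(( target / lite_interval * lite_interval ))
  echo "lite height_${base}"
  start=$base
  # Keep adding diffs until the covered range reaches the target epoch
  while [ "$start" -lt "$target" ]; do
    echo "diff height_${start}+${diff_interval}"
    start=$(( start + diff_interval ))
  done
}

required_snapshots 3506992
```

For epoch 3,506,992 this prints one lite entry followed by nine diff entries, matching rows 1 to 10 of the table. The actual file names still have to be looked up in the archive listings, since they include the publication date.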

## Setting up a partial archival node

A partial archival node stores historical data from a chosen starting epoch up
to the present. This is the most common setup for operators who need historical
chain data without syncing from genesis.

### Step 1: Download the snapshots

Download the base lite snapshot and all diff snapshots up to the present. Order
does not matter.

```shell
forest-tool archive merge --output-file <output-file> <lite-snapshot> <diff-snapshots>
# Example: calibnet, starting from epoch 3,480,000
# Download the base lite snapshot
aria2c -x5 https://forest-archive.chainsafe.dev/archive/forest/calibnet/lite/forest_snapshot_calibnet_2026-02-22_height_3480000.forest.car.zst

# Download all diff snapshots from 3,480,000 onward
aria2c -x5 https://forest-archive.chainsafe.dev/archive/forest/calibnet/diff/forest_diff_calibnet_2026-02-22_height_3480000+3000.forest.car.zst
aria2c -x5 https://forest-archive.chainsafe.dev/archive/forest/calibnet/diff/forest_diff_calibnet_2026-02-23_height_3483000+3000.forest.car.zst
# ... continue for all diffs up to the present
```

As an example, to get a snapshot that covers epoch 30_000 to epoch 36_000, you
merge `forest_snapshot_mainnet_2020-09-04_height_30000.forest.car.zst` with
`forest_diff_mainnet_2020-09-04_height_30000+3000.forest.car.zst` and
`forest_diff_mainnet_2020-09-05_height_33000+3000.forest.car.zst`.
### Step 2: Import snapshots into Forest

Import all snapshot files into the node's CAR database. You can import them in
any order.

```shell
# Initialize the node (creates the database) and stop it
forest --chain calibnet --encrypt-keystore=false --halt-after-import

# Symlink or copy snapshot files into the car_db directory
# (the car_db directory is inside the Forest data directory)
ln -s /path/to/downloaded/snapshots/*.forest.car.zst ~/.local/share/forest/calibnet/car_db/
```

Alternatively, import a recent standard snapshot (for the latest state) and
then add the archival snapshots:

```shell
# Start with a recent standard snapshot
forest --chain calibnet --encrypt-keystore=false --halt-after-import

# Add archival snapshot files to the car_db directory
ln -s /path/to/archival/snapshots/*.forest.car.zst ~/.local/share/forest/calibnet/car_db/
```

### Step 3: Compute states and verify

Start the node and compute states from the lite snapshot's head epoch:

```shell
# Start the node
forest --chain calibnet --encrypt-keystore=false

# Compute states from the base lite epoch forward
# This re-executes all messages and populates the state cache
forest-cli state compute --epoch <LITE_EPOCH> -n <NUMBER_OF_EPOCHS>
```

For example, to compute 200 epochs starting from epoch 3,480,000:

```shell
forest-cli state compute --epoch 3480000 -n 200
```

### Step 4: Validate (optional)

You can validate specific epochs using `forest-dev`:

```shell
forest-dev state validate --chain calibnet --epoch 3506992
```

:::info

Validation can look back up to **2000 epochs**, but each lite snapshot only
contains 900 epochs of state trees. If you are validating an epoch close to the
lite snapshot's head (e.g., within the first few epochs), you may need to also
import the **previous** lite snapshot and its diffs to provide enough state
history. In the worst case, this means downloading the previous segment
(1 lite + 10 diffs).

For epochs well past the lite snapshot's head (more than 2000 epochs ahead),
this is not an issue because the diffs will have extended the state history
sufficiently.

:::
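
As a rough pre-check, the distance between the target epoch and its base lite epoch indicates whether the previous segment may be needed. The sketch below uses the 2000-epoch lookback figure quoted above as a deliberately conservative cutoff; `needs_previous_segment` is a hypothetical helper, and the exact threshold at which validation starts to succeed is an assumption, not something Forest reports.

```shell
# Hypothetical helper: flag target epochs that sit close to their base lite epoch.
# Assumes lite snapshots every 30,000 epochs and uses the 2000-epoch lookback
# from the note above as a conservative cutoff.
needs_previous_segment() {
  target=$1
  base=$(( target / 30000 * 30000 ))
  if [ "$base" -gt 0 ] && [ $(( target - base )) -le 2000 ]; then
    echo "epoch $target: consider also importing the segment starting at $(( base - 30000 ))"
  else
    echo "epoch $target: segment starting at $base should suffice"
  fi
}

needs_previous_segment 3480100
needs_previous_segment 3506992
```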

### Backfilling after downtime

If your archival node was offline and missed some epochs, download the missing
diff snapshots (and any new lite snapshots if a new 30,000-epoch boundary was
crossed) and add them to the `car_db` directory. Then restart the node — it
will pick up the new data automatically.
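
Before restarting, it can help to confirm that the diff files on disk form an unbroken chain. The following is a minimal sketch that relies only on the diff naming pattern; `check_diff_chain` is a hypothetical helper, it does not inspect file contents, and it does not check that the matching base lite snapshot is present.

```shell
# Hypothetical helper: report gaps between consecutive diff snapshots in a directory.
# Relies only on the file naming pattern ..._height_<BASE>+<RANGE>.forest.car.zst
check_diff_chain() {
  dir=$1
  for f in "$dir"/forest_diff_*_height_*.forest.car.zst; do
    [ -e "$f" ] || continue            # skip the unexpanded pattern when nothing matches
    range=${f##*_height_}
    range=${range%.forest.car.zst}
    echo "${range%+*} ${range#*+}"     # emit "<BASE> <RANGE>"
  done | sort -n | {
    expected=""
    while read -r base len; do
      # A diff that starts later than the previous one ended means files are missing
      if [ -n "$expected" ] && [ "$base" -gt "$expected" ]; then
        echo "gap: no diff covers $expected to $base"
      fi
      expected=$(( base + len ))
    done
  }
}

# Example usage (path taken from the import step above):
# check_diff_chain ~/.local/share/forest/calibnet/car_db
```

No output means the diffs that are present are contiguous by name.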

## Retrieving historical data without an archival node

If you only need data at a specific epoch and do not want to run a full
archival node, you can:

1. Download only the base lite snapshot and the diffs up to your target epoch
(see [Figuring out which snapshots you need](#figuring-out-which-snapshots-you-need)).
2. Import them and compute the state:

```shell
# Initialize and stop
forest --chain calibnet --encrypt-keystore=false --halt-after-import

# Add the snapshots
ln -s /path/to/snapshots/*.forest.car.zst ~/.local/share/forest/calibnet/car_db/

# Start the node
forest --chain calibnet --encrypt-keystore=false

# Compute state at the target epoch range
forest-cli state compute --epoch <LITE_EPOCH> -n <EPOCHS_TO_COMPUTE>

# Validate
forest-dev state validate --chain calibnet --epoch <TARGET_EPOCH>

# Now you can query historical state via RPC
forest-cli state compute --epoch <TARGET_EPOCH>
```

## Common pitfalls

| Problem | Cause | Fix |
| ---------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| `failed to lookup actor f410f...` | Incomplete state tree — missing base lite or intermediate diffs | Ensure the correct base lite snapshot and **all** diffs between it and your target epoch are imported |
| `failed to read init actor address map` | Same as above — state tree is partially loaded from a diff without its base | Import the matching base lite snapshot |
| `Parent state root did not match computed state` | State was computed from an incomplete state tree | Re-import with the correct base lite and all diffs, then re-compute |
| `forest-cli state compute` fails but `forest-dev state validate` works | `state compute` requires a running daemon with complete state; `validate` works directly on the database | Import the base lite snapshot covering the epoch and compute from there |
| Validation passes with a standard snapshot but fails with diffs | Diffs were imported without their matching base lite | Always pair diffs with their base lite snapshot |

## Merging snapshots into a single file

If you prefer a single snapshot file instead of keeping multiple files in the
CAR database, you can merge them:

```shell
forest-tool archive merge \
  --output-file merged.forest.car.zst \
  forest_snapshot_..._height_30000.forest.car.zst \
  forest_diff_..._height_30000+3000.forest.car.zst \
  forest_diff_..._height_33000+3000.forest.car.zst
```

The output file will contain the combined data and can be used as a standalone
snapshot.

## Generating archival snapshots

@@ -43,8 +271,8 @@ commands require a large snapshot file as input.

To generate archival snapshots manually, use these settings:

- one lite snapshot every 30_000 epochs,
- one diff snapshot every 3_000 epochs,
- one lite snapshot every 30,000 epochs,
- one diff snapshot every 3,000 epochs,
- a depth of 900 epochs for the diff snapshots,
- a depth of 900 epochs for the lite snapshots.

6 changes: 4 additions & 2 deletions docs/docs/users/knowledge_base/snapshot_service.md
@@ -16,8 +16,10 @@ The snapshots are compressed with the `zstd` algorithm. Both Forest and Lotus ca

Archival snapshots are available free of charge. Note that they are not actively generated and are provided on a best-effort basis. Two types of archival snapshots are available:

- **Lite snapshots**: historical snapshots containing the last 2000 tipsets. They are available at 30,000 epoch intervals. Lite snapshots are useful for bootstrapping a node with historical data. They must be complemented with _diff_ snapshots for a complete historical chain.
- **Diff snapshots**: incomplete snapshots containing the new key-value pairs since the last diff snapshot.
- **Lite snapshots**: historical snapshots containing complete state trees for the 900 most recent epochs, plus the full block header history back to genesis. They are published at 30,000-epoch intervals.
- **Diff snapshots**: incremental snapshots containing only the new IPLD key-value pairs since the last diff snapshot. Published every 3,000 epochs. A diff snapshot **must** be used together with its matching base lite snapshot — it does not contain a complete state tree on its own.

For detailed usage instructions, including how to set up a partial archival node and how to figure out which snapshots you need, see the [Archival Snapshots guide](../guides/advanced/archival_snapshots.md).

# Snapshot generation details
