refactor: use file backed CID hashset to reduce memory usage in chain export #6939
hanabi1224 merged 9 commits into main
Conversation
Walkthrough
Generic "seen" tracking for IPLD/chain streaming was introduced: a file-backed CID set (`FileBackedCidHashSet`) can now stand in for the in-memory `CidHashSet` behind a common set-like interface.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant RPC
    participant Chain
    participant IpldStream
    participant FileBackedSet
    participant ParityDB
    rect rgba(200,200,255,0.5)
    Client->>RPC: export request
    RPC->>Chain: export_from_head(ExportOptions{ seen: FileBackedCidHashSet::new(temp_dir) })
    Chain->>IpldStream: stream_chain(..., seen)
    end
    rect rgba(200,255,200,0.5)
    IpldStream->>FileBackedSet: insert(cid)
    FileBackedSet->>ParityDB: get/check key
    ParityDB-->>FileBackedSet: present? / not present
    alt not present
        FileBackedSet->>ParityDB: write Set (empty bytes)
        FileBackedSet-->>IpldStream: inserted = true
    else present
        FileBackedSet-->>IpldStream: inserted = false
    end
    IpldStream-->>Chain: yield block(s)
    Chain-->>RPC: stream CAR / response
    RPC-->>Client: deliver export
    end
```
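The seam the diagram exercises, `insert(cid)` reporting whether a CID was new, is what lets the in-memory and file-backed sets swap freely. A minimal sketch of that seam, assuming the `CidHashSetLike` shape discussed later in this thread (a fallible `insert` returning `anyhow::Result<bool>`); the merged trait may differ:

```rust
use anyhow::Result;
use cid::Cid;
use std::collections::HashSet;

/// Dedup backend for streaming: `insert` returns Ok(true) iff the CID was new.
pub trait CidHashSetLike {
    fn insert(&mut self, cid: &Cid) -> Result<bool>;
}

/// In-memory backend: fastest, but RAM grows with the number of unique CIDs.
impl CidHashSetLike for HashSet<Cid> {
    fn insert(&mut self, cid: &Cid) -> Result<bool> {
        Ok(HashSet::insert(self, *cid))
    }
}
```

The file-backed implementor in this PR (parity-db plus an LRU front) plugs into the same seam, trading the roughly 20% slower export reported below for bounded memory.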
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (4)
src/cid_collections/hash_set.rs (2)
101-144: Consider returning `Result` instead of panicking in `Default`. The `panic!` on lines 140-142 could crash the process if temp directory creation fails (e.g., disk full, permission issues). Per coding guidelines, production code should avoid `unwrap()` and panics. Consider either:
- Implementing a fallible constructor `new() -> anyhow::Result<Self>` and using that at call sites
- Or documenting this panic clearly if it's an intentional fail-fast for unrecoverable conditions
Given the author's preference for fail-fast approaches (per learnings), this may be intentional, but the panic message should be more descriptive about potential causes.
💡 Alternative: fallible constructor
```rust
impl FileBackedCidHashSet {
    pub fn new() -> anyhow::Result<Self> {
        const MAX_ATTEMPTS: usize = 10;
        for attempt in 0..MAX_ATTEMPTS {
            if let Some(dir) = tempfile::tempdir_in(".")
                .ok()
                .or_else(|| tempfile::tempdir().ok())
            {
                // ... existing setup ...
                if let Ok(db) = parity_db::Db::open_or_create(&options) {
                    return Ok(Self { db, _dir: dir, lru: ... });
                }
            }
        }
        anyhow::bail!(
            "failed to create parity db temp directory after {MAX_ATTEMPTS} attempts. \
             Possible causes: disk full, permission denied, or too many open files"
        );
    }
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cid_collections/hash_set.rs` around lines 101 - 144, The Default impl for FileBackedCidHashSet currently panics after MAX_ATTEMPTS when creating a temp parity DB; instead implement a fallible constructor FileBackedCidHashSet::new() -> anyhow::Result<Self> that contains the loop (using parity_db::Db::open_or_create and the same options) and returns Ok(Self { db, _dir: dir, lru: ... }) on success or anyhow::bail! with a descriptive message listing likely causes (disk full, permission denied, etc.) on failure, then update impl Default for FileBackedCidHashSet to call FileBackedCidHashSet::new().expect("descriptive message") if you must keep panic in Default, or remove Default usages and propagate the Result at call sites.
159-168: Silent error handling may mask issues. Lines 159 and 163-165 silently ignore errors:
- `self.db.get(col, &key).ok().flatten()` - DB read errors treated as "not found"
- `_ = self.db.commit_changes_bytes(...)` - DB write errors silently discarded

For a file-backed set, write failures could lead to duplicate CIDs being reported as "newly inserted" on subsequent calls, potentially causing incorrect behavior during chain export.
Consider at minimum logging these errors at trace/debug level for diagnosability.
💡 Suggested improvement for error visibility
```diff
-        if self.db.get(col, &key).ok().flatten().is_some() {
+        let exists = match self.db.get(col, &key) {
+            Ok(Some(_)) => true,
+            Ok(None) => false,
+            Err(e) => {
+                tracing::trace!("FileBackedCidHashSet get error (treating as miss): {e}");
+                false
+            }
+        };
+        if exists {
             self.lru.insert(small, ());
             false
         } else {
-            _ = self
+            if let Err(e) = self
                 .db
-                .commit_changes_bytes([(col, parity_db::Operation::Set(key, EMPTY_VALUE.clone()))]);
+                .commit_changes_bytes([(col, parity_db::Operation::Set(key, EMPTY_VALUE.clone()))])
+            {
+                tracing::trace!("FileBackedCidHashSet commit error: {e}");
+            }
             self.lru.insert(small, ());
             true
         }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cid_collections/hash_set.rs` around lines 159 - 168, The DB read and write errors are currently ignored in the contains/insert path (calls to self.db.get(...) and self.db.commit_changes_bytes(...)), which can hide failures; change the logic to inspect the Result from self.db.get(col, &key) and from self.db.commit_changes_bytes(...) and log any Err at debug/trace level (e.g., using tracing::debug! or log::debug!) including the column (col) and key details, and for the read error treat it as an error case (not silently "not found") and for the write error log the failure and ensure the method's return reflects the failed persistence (do not unconditionally return true on commit error). Ensure you update places referencing EMPTY_VALUE, small, and lru insertion: only insert to lru after a successful read or successful commit, and include the error logs near the calls to self.db.get and self.db.commit_changes_bytes to aid diagnosability.
src/tool/subcommands/archive_cmd.rs (1)
700-700: Consider using `FileBackedCidHashSet` for large snapshot merges. The merge operation uses the in-memory `CidHashSet`. For merging large snapshots (potentially the entire chain from genesis), this could consume significant memory. Was the in-memory choice intentional for merge performance, or should this also use `FileBackedCidHashSet` for consistency with the memory-reduction goals?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tool/subcommands/archive_cmd.rs` at line 700, The code currently constructs the stream graph using an in-memory CidHashSet via stream_graph::<_, _, _, CidHashSet>(&store, heaviest_tipset.chain(&store), 0) which can OOM for large merges; replace CidHashSet with FileBackedCidHashSet and instantiate a FileBackedCidHashSet (e.g., using a temp file or configured backing path) so stream_graph::<_, _, _, FileBackedCidHashSet>(...) uses the file-backed implementation; update any constructors or trait-bound calls around stream_graph, and ensure FileBackedCidHashSet is created and passed where the previous in-memory set was assumed (for example where blocks is defined) so large snapshot merges use disk-backed storage.
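If the merge path is switched as the prompt describes, the edit is essentially a type-parameter swap; a hedged before/after sketch (the `stream_graph` call shape is taken from the prompt above, and how the set gets constructed depends on the final signature):

```rust
// Before: in-memory seen-set, RAM grows with the number of unique CIDs.
let blocks = stream_graph::<_, _, _, CidHashSet>(&store, heaviest_tipset.chain(&store), 0);

// After (sketch): disk-backed seen-set bounds memory for genesis-to-head merges.
let blocks = stream_graph::<_, _, _, FileBackedCidHashSet>(&store, heaviest_tipset.chain(&store), 0);
```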
src/ipld/util.rs (1)
424-430: Consider adding a default type parameter for consistency. `ChainStream` defines `S = CidHashSet` as a default type parameter, but `IpldStream` does not. While the single existing usage of `IpldStream` in the codebase (src/dev/subcommands/export_state_tree_cmd.rs:112) already explicitly provides both type parameters, aligning the pattern with `ChainStream` would improve consistency.
♻️ Suggested change
```diff
 pin_project! {
-    pub struct IpldStream<DB, S> {
+    pub struct IpldStream<DB, S = CidHashSet> {
         db: DB,
         cid_vec: Vec<Cid>,
         seen: S,
     }
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ipld/util.rs` around lines 424 - 430, Add a default type parameter for IpldStream so its second generic defaults to CidHashSet for consistency with ChainStream; update the type declaration pub struct IpldStream<DB, S = CidHashSet> and ensure any uses that relied on the old signature (e.g., IpldStream in export_state_tree_cmd.rs) continue to compile without changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/chain/mod.rs`:
- Around line 35-42: The ExportOptions struct lost its Clone and Debug derives
because its field seen: FileBackedCidHashSet contains
non-cloneable/non-debuggable types (parity_db::Db and tempfile::TempDir), so
restore correct behavior by documenting this as a breaking change rather than
reintroducing derives; update the crate's CHANGELOG and public API notes to
state that ExportOptions no longer implements Clone/Debug (mention ExportOptions
and FileBackedCidHashSet and the underlying parity_db::Db and tempfile::TempDir
types), and if external consumers require clone/debug, provide guidance or an
alternative API (e.g., a lightweight Clone/Debug DTO or factory functions)
rather than attempting to derive these traits on ExportOptions itself.
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by `!**/*.lock`
📒 Files selected for processing (9)
Cargo.toml, src/chain/mod.rs, src/cid_collections/hash_set.rs, src/cid_collections/mod.rs, src/dev/subcommands/export_state_tree_cmd.rs, src/ipld/util.rs, src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/benchmark_cmd.rs, src/tool/subcommands/snapshot_cmd.rs
What's the memory footprint reduction?
@LesnyRumcajs 7-8 GiB on mainnet and 1.8-2 GiB on calibnet
Worth it!
LesnyRumcajs left a comment
Let's add a changelog entry for this; folks would like to know Forest potentially dropped its memory requirements by half (at the cost of slightly longer export times)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/rpc/methods/chain.rs (1)
384-407: ⚠️ Potential issue | 🟠 Major
Guarantee export-state cleanup across early `?` returns. The export lock and status are set before this new `FileBackedCidHashSet::new(..)?` call. If it fails, `handle` returns before `end_export()` and before `CHAIN_EXPORT_LOCK` is cleared, so every later export request is rejected until process restart.
Suggested fix
```diff
     start_export();
-
-    let head = ctx.chain_store().load_required_tipset_or_heaviest(&tsk)?;
-    let start_ts =
-        ctx.chain_index()
-            .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)?;
-
-    let options = ExportOptions {
-        skip_checksum,
-        include_receipts,
-        include_events,
-        include_tipset_keys,
-        seen: FileBackedCidHashSet::new(ctx.temp_dir.as_path())?,
-    };
-    let writer = if dry_run {
-        tokio_util::either::Either::Left(VoidAsyncWriter)
-    } else {
-        tokio_util::either::Either::Right(tokio::fs::File::create(&output_path).await?)
-    };
-    let result = match version {
+    let result: anyhow::Result<ApiExportResult> = async {
+        let head = ctx.chain_store().load_required_tipset_or_heaviest(&tsk)?;
+        let start_ts =
+            ctx.chain_index()
+                .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)?;
+
+        let options = ExportOptions {
+            skip_checksum,
+            include_receipts,
+            include_events,
+            include_tipset_keys,
+            seen: FileBackedCidHashSet::new(ctx.temp_dir.as_path())?,
+        };
+        let writer = if dry_run {
+            tokio_util::either::Either::Left(VoidAsyncWriter)
+        } else {
+            tokio_util::either::Either::Right(tokio::fs::File::create(&output_path).await?)
+        };
+
+        match version {
             FilecoinSnapshotVersion::V1 => {
                 // ...
             }
             FilecoinSnapshotVersion::V2 => {
                 // ...
             }
-    };
+        }
+    }.await;

     end_export();
     let mut guard = CHAIN_EXPORT_LOCK.lock().await;
     *guard = None;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rpc/methods/chain.rs` around lines 384 - 407, The code sets CHAIN_EXPORT_LOCK and calls start_export() before constructing FileBackedCidHashSet::new(...)? which can early-return and leave the lock/status set; to fix, ensure export-state cleanup on all exit paths by either creating the FileBackedCidHashSet (FileBackedCidHashSet::new(ctx.temp_dir.as_path())) before acquiring CHAIN_EXPORT_LOCK/start_export(), or wrap the locked section with a scope guard/finalizer that always calls end_export() and clears CHAIN_EXPORT_LOCK (using the same CancellationToken and token.clone()) so that any early `?` return does not leave the export state stuck; update the code around CHAIN_EXPORT_LOCK, start_export, end_export, and ExportOptions to implement one of these approaches.
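For the scope-guard variant the prompt mentions, one possible shape is sketched below. The `scopeguard` crate is an assumption on my part (the suggested diff above uses an async block instead), and only the synchronous `end_export()` can run in `Drop`; clearing the async `CHAIN_EXPORT_LOCK` still needs explicit handling on the normal path.

```rust
// Sketch only: end_export() now runs on every exit path, including early
// `?` returns during the fallible setup below.
start_export();
let _cleanup = scopeguard::guard((), |_| end_export());

let seen = FileBackedCidHashSet::new(ctx.temp_dir.as_path())?; // safe to fail now
// ... rest of the export path ...
```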
src/tool/subcommands/archive_cmd.rs (1)
587-603: ⚠️ Potential issue | 🟠 Major
Don't back the seen-set in the current working directory. `FileBackedCidHashSet::new(".")?` ignores the user-selected export destination. A large export can now fail because the cwd is read-only or out of space even when `output_path` is valid, and it can spill multi-GiB temp files into an unrelated directory.
Suggested fix

```diff
-    let seen = if let Some(diff) = diff {
+    let output_path = build_output_path(network.to_string(), genesis.timestamp, epoch, output_path);
+    let seen_root = output_path
+        .parent()
+        .unwrap_or_else(|| std::path::Path::new("."));
+
+    let seen = if let Some(diff) = diff {
         let diff_ts: Tipset = index
             .tipset_by_height(diff, ts.shallow_clone(), ResolveNullTipset::TakeOlder)
             .context("diff epoch must be smaller than target epoch")?;
         let diff_ts: &Tipset = &diff_ts;
         let diff_limit = diff_depth.map(|depth| diff_ts.epoch() - depth).unwrap_or(0);
         let mut stream = stream_chain(
             store.clone(),
             diff_ts.clone().chain_owned(store.clone()),
             diff_limit,
-            FileBackedCidHashSet::new(".")?,
+            FileBackedCidHashSet::new(seen_root)?,
         );
         while stream.try_next().await?.is_some() {}
         stream.into_seen()
     } else {
-        FileBackedCidHashSet::new(".")?
+        FileBackedCidHashSet::new(seen_root)?
     };
-
-    let output_path = build_output_path(network.to_string(), genesis.timestamp, epoch, output_path);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tool/subcommands/archive_cmd.rs` around lines 587 - 603, The seen-set is being created with FileBackedCidHashSet::new(".") which uses the current working directory; change both instances where FileBackedCidHashSet::new(".")? is called (inside the diff branch and the else branch) to create the file-backed set in the user-selected export destination instead (e.g. FileBackedCidHashSet::new(output_path) or a temp file/dir under output_path), ensuring the export destination variable (output_path/output_dir) is in scope; keep the same use with stream_chain(..., FileBackedCidHashSet::new(...)? ) and when returning the fallback so the seen-set is always backed at the chosen output location rather than ".".
🧹 Nitpick comments (3)
src/ipld/util.rs (1)
201-246: Update the public docs for the new `seen` parameter. `stream_chain`/`stream_graph` now require callers to provide the dedup store, but the doc comment still describes only the old three-argument API. Please document what `seen` is expected to do and who owns its storage/lifecycle.
As per coding guidelines: "Document public functions and structs with doc comments".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ipld/util.rs` around lines 201 - 246, Update the public doc comments for stream_chain and stream_graph to document the new seen parameter: explain that seen is a deduplication store of CIDs (implements CidHashSetLike) used to avoid revisiting already-yielded nodes during DFS, specify that the caller provides and owns the seen storage (ownership/lifecycle managed by caller and may be reused across calls for incremental traversal), state expected behavior when seen contains CIDs (they will be skipped) and when it's empty, and note any thread-safety or mutability expectations (e.g., that seen will be mutated by the stream). Also update the top-level function docs to mention seen in the argument list and its purpose so downstream callers know to construct and pass an appropriate CidHashSetLike instance.
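One possible wording for the requested doc comment; illustrative only. The signature is abbreviated from the call sites visible in this PR, not copied from the code:

```rust
/// Streams the chain graph depth-first, skipping any CID already in `seen`.
///
/// `seen` is the deduplication backend (any `CidHashSetLike`): CIDs it
/// already contains are not yielded again, and the stream mutates it as it
/// walks. The caller constructs and owns `seen`, and can recover it
/// afterwards (e.g. via `into_seen()`) to reuse across calls.
pub fn stream_chain<DB, T, S>(db: DB, tipset_iter: T, stateroot_limit: i64, seen: S) -> ChainStream<DB, T, S>
```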
src/db/gc/snapshot.rs (1)
101-103: Add path context to the temp-dir setup. This introduces a new filesystem dependency during GC initialization, but a bare `create_dir_all` error won't say which path failed.
As per coding guidelines: "Use `anyhow::Result<T>` for most operations and add context with `.context()` when errors occur".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db/gc/snapshot.rs` around lines 101 - 103, The call to std::fs::create_dir_all(&chain_tmp_root) lacks contextual error information; wrap the filesystem operation with anyhow context so failures include the path (use .with_context or .context()) e.g. add context referencing chain_tmp_root and chain_data_path; update the create_dir_all call near the chain_tmp_root variable initialization so any error message names the tmp dir path, and keep db_root(&chain_data_path)? unchanged.
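A minimal, self-contained sketch of the suggested context; the helper name and message text are illustrative:

```rust
use anyhow::{Context, Result};
use std::path::Path;

fn ensure_chain_tmp_root(chain_tmp_root: &Path) -> Result<()> {
    // Failures now name the offending path instead of a bare io::Error.
    std::fs::create_dir_all(chain_tmp_root)
        .with_context(|| format!("failed to create GC temp dir {}", chain_tmp_root.display()))
}
```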
src/chain/mod.rs (1)
35-41: Document the new `seen` backend contract on `ExportOptions`. `ExportOptions<S>` now carries the main behavioral choice for export deduplication, but the struct itself has no rustdoc explaining what `seen` is expected to do or when callers should prefer `CidHashSet` versus `FileBackedCidHashSet`. A short doc comment here would make the generic API much easier to use correctly.
As per coding guidelines, `**/*.rs`: Document public functions and structs with doc comments.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/chain/mod.rs` around lines 35 - 41, Add a Rust doc comment to the public ExportOptions<S> struct documenting that the seen field is the export deduplication backend: it must behave like a set of CIDs with contains/insert semantics used to decide whether a CID has already been exported; document when to use the in-memory CidHashSet (ephemeral, faster) versus FileBackedCidHashSet (persistent across process restarts), and note any expected trait/behavioral requirements (e.g., that callers provide a type S implementing set-like operations used by the export logic). Explain typical ownership/lifetime expectations (caller supplies and retains the backend) so callers know how to choose and supply seen.
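A hedged sketch of such a doc comment. The non-`seen` fields are taken from the suggested fix earlier in this thread; the wording and the bound-free generic are illustrative:

```rust
/// Options controlling a chain export.
pub struct ExportOptions<S> {
    /// Deduplication backend for CIDs already written to the exported CAR.
    ///
    /// `CidHashSet` keeps the set in memory (fastest, RAM grows with unique
    /// CIDs); `FileBackedCidHashSet` spills it to disk to bound memory on
    /// mainnet-sized exports. The caller constructs and owns the backend.
    pub seen: S,
    // skip_checksum, include_receipts, include_events, include_tipset_keys ...
}
```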
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cid_collections/hash_set.rs`:
- Around line 175-176: The QuickCheck helper test_cid_hashset is calling
Vec::dedup(), which only removes consecutive duplicates and allows inputs like
[a, b, a] to remain duplicated; fix by ensuring cids contains unique entries
before testing—either sort the vector then dedup (e.g., call sort() then
dedup()) or rebuild uniqueness via a HashSet and collect back to a Vec; apply
the same change to the other similar helper referenced around lines 187-188 so
both insertion loops operate on fully de-duplicated CID lists.
- Around line 13-17: Change the CidHashSetLike::insert signature to return
anyhow::Result<bool> and update the FileBackedCidHashSet implementation to
surface I/O errors instead of swallowing them: propagate errors from Db::get and
from commit_changes_bytes (do not ignore with "_ ="), wrapping them with context
(e.g., "reading seen-set" / "writing seen-set"). Update all call sites that
previously expected bool (the conditionals in functions referenced in
src/ipld/util.rs, the filter in src/utils/db/car_util.rs, and the four tests) to
handle anyhow::Result<bool> (propagate the error with ? or map_err with context,
or adjust tests to expect Result). Ensure unique symbols updated are
CidHashSetLike::insert and FileBackedCidHashSet::insert plus any internal calls
to Db::get and commit_changes_bytes.
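The `Vec::dedup` pitfall flagged in the first hash_set.rs item above is easy to demonstrate; a runnable illustration with integers standing in for CIDs:

```rust
fn main() {
    let mut v = vec![1, 2, 1];
    v.dedup();
    assert_eq!(v, vec![1, 2, 1]); // dedup only drops *consecutive* duplicates

    v.sort();
    v.dedup();
    assert_eq!(v, vec![1, 2]); // sort + dedup yields truly unique values
}
```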
In `@src/dev/subcommands/export_state_tree_cmd.rs`:
- Line 112: The current call creates a CWD-dependent seen-set with
FileBackedCidHashSet::new(".") inside IpldStream::new, which can fail or leak
files; change this to create a deterministic/writable path (e.g. use
tempfile::tempdir() or std::env::temp_dir() or a configurable output directory)
and pass that path to FileBackedCidHashSet::new instead of "." (update the code
around the IpldStream::new invocation and the FileBackedCidHashSet::new call to
construct and use the temp/config path and propagate any errors accordingly).
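For the export_state_tree item, a minimal sketch of the tempdir route. It assumes `FileBackedCidHashSet::new` accepts a directory path (as at the other call sites in this PR) and that `IpldStream::new` takes the seen-set last; both are assumptions:

```rust
// `tempfile::tempdir()` yields a unique, writable directory that is removed
// when `tmp` drops, so keep it alive for the lifetime of `seen`.
let tmp = tempfile::tempdir()?;
let seen = FileBackedCidHashSet::new(tmp.path())?;
let stream = IpldStream::new(store, roots, seen); // argument shape assumed
```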
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by `!**/*.lock`
📒 Files selected for processing (22)
CHANGELOG.md, Cargo.toml, docs/docs/users/guides/gc.md, src/chain/mod.rs, src/chain/tests.rs, src/cid_collections/hash_set.rs, src/cid_collections/mod.rs, src/daemon/context.rs, src/daemon/mod.rs, src/db/gc/snapshot.rs, src/dev/subcommands/export_state_tree_cmd.rs, src/ipld/util.rs, src/rpc/methods/chain.rs, src/rpc/methods/sync.rs, src/rpc/mod.rs, src/tool/offline_server/server.rs, src/tool/subcommands/api_cmd/generate_test_snapshot.rs, src/tool/subcommands/api_cmd/test_snapshot.rs, src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/benchmark_cmd.rs, src/tool/subcommands/snapshot_cmd.rs, src/utils/db/car_stream/tests.rs
✅ Files skipped from review due to trivial changes (3)
- docs/docs/users/guides/gc.md
- src/tool/subcommands/benchmark_cmd.rs
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (1)
- Cargo.toml
LesnyRumcajs left a comment
LGTM, anything from 🐰 to address?
🧹 Nitpick comments (1)
src/cid_collections/hash_set.rs (1)
157-159: Consider propagating `db.get()` errors for consistency. Read errors are swallowed with `.ok()` while write errors on line 164 are now propagated. Under disk corruption or I/O failure, this treats read failures as cache misses, potentially causing duplicate processing (though idempotent and not incorrect). For symmetric error handling:
♻️ Suggested change
```diff
-        if self.db.get(col, &key).ok().flatten().is_some() {
+        if self
+            .db
+            .get(col, &key)
+            .context("failed to read CID presence from parity-db")?
+            .is_some()
+        {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cid_collections/hash_set.rs` around lines 157 - 159, The read path currently swallows errors by calling self.db.get(col, &key).ok().flatten().is_some(), which makes read failures appear as cache misses while writes propagate errors; change this to propagate read errors consistently: call self.db.get(col, &key) and handle the Result so that any Err from the DB is returned (propagated) instead of mapped to None, then only when Ok(Some(_)) update self.lru.insert(small, ()) and return Ok(false); use the same error type/propagation style used by the surrounding function so reads and writes behave symmetrically.
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by `!**/*.lock`
📒 Files selected for processing (3)
CHANGELOG.md, src/cid_collections/hash_set.rs, src/ipld/util.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
♻️ Duplicate comments (2)
src/cid_collections/hash_set.rs (1)
161-168: ⚠️ Potential issue | 🟠 Major
Stop swallowing parity-db read failures. Line 161 still converts `Db::get` errors into "not present", so a corrupt or full backing store can silently reprocess CIDs and degrade export correctness. Since `insert` is already fallible, propagate the read error and add context on both DB operations. As per coding guidelines, `**/*.rs`: Use `anyhow::Result` for most operations and add context with `.context()` when errors occur.
Suggested fix
```diff
-        if self.db.get(col, &key).ok().flatten().is_some() {
+        if self
+            .db
+            .get(col, &key)
+            .context("failed to read CID presence from parity-db")?
+            .is_some()
+        {
             self.lru.insert(small, ());
             Ok(false)
         } else {
-            self.db.commit_changes_bytes([(
-                col,
-                parity_db::Operation::Set(key, EMPTY_VALUE.clone()),
-            )])?;
+            self.db
+                .commit_changes_bytes([(
+                    col,
+                    parity_db::Operation::Set(key, EMPTY_VALUE.clone()),
+                )])
+                .context("failed to persist CID presence into parity-db")?;
             self.lru.insert(small, ());
             Ok(true)
         }
```

```bash
#!/bin/bash
# Verify that `FileBackedCidHashSet::insert` still suppresses parity-db read errors
# and lacks context on the write path.
sed -n '148,172p' src/cid_collections/hash_set.rs
rg -n 'db\.get|commit_changes_bytes|context\(' src/cid_collections/hash_set.rs
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cid_collections/hash_set.rs` around lines 161 - 168, In FileBackedCidHashSet::insert, stop converting parity_db::Db::get errors into "not present": change the function to return anyhow::Result and propagate the error from self.db.get(...) instead of .ok().flatten(), using .context("reading CID presence from parity-db") to add context; likewise wrap the self.db.commit_changes_bytes(...) call with .context("writing empty value to parity-db for CID") so both read and write failures surface with helpful messages while keeping the existing LRU insert and Ok(false)/Ok(true) behavior.
src/tool/subcommands/archive_cmd.rs (1)
587-603: ⚠️ Potential issue | 🟠 Major
Avoid routing export dedupe state through the OS temp dir. These branches hard-code `FileBackedCidHashSet::new_in_temp_dir()`. On hosts where `std::env::temp_dir()` is tmpfs, the "file-backed" seen-set moves back into RAM and undercuts this PR's main memory win; it also makes export success depend on temp-dir policy instead of Forest's configured storage root.

```bash
#!/bin/bash
# Inspect the call site and the default temp-root implementation.
sed -n '587,603p' src/tool/subcommands/archive_cmd.rs
sed -n '143,145p' src/cid_collections/hash_set.rs

# Run this on a representative host to confirm what temp dir Forest will use.
python - <<'PY'
import tempfile, pathlib
print("temp_dir =", pathlib.Path(tempfile.gettempdir()).resolve())
PY

# If `/tmp` or the resolved temp dir is tmpfs/memory-backed, the seen-set is not
# really offloaded from RAM.
mount | grep -E ' on /tmp | on /var/tmp ' || true
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tool/subcommands/archive_cmd.rs` around lines 587 - 603, The code currently uses FileBackedCidHashSet::new_in_temp_dir() for the dedupe "seen" set in archive_cmd.rs (the seen variable and stream_chain creation), which forces reliance on OS temp directory; change this to create the file-backed set inside Forest's configured storage root instead. Replace calls to FileBackedCidHashSet::new_in_temp_dir() with a constructor that accepts a Path (e.g., FileBackedCidHashSet::new_at(...) or new_in_dir(...)) and pass a path derived from the configured storage root (use the archive command's config/Store/root path or similar configuration value) with a unique per-run filename (like "archive-seen-<pid|uuid>") and ensure any cleanup/rotation semantics are preserved; update both branches (the diff branch and the else branch) so the same storage-root-backed path is used.
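A sketch of the call-site change, assuming a `Path`-accepting constructor like the `FileBackedCidHashSet::new(dir)` used elsewhere in this review; `config.storage_root` is a hypothetical stand-in for Forest's configured storage root:

```rust
// Back the seen-set under the configured storage root rather than the OS
// temp dir, with a unique per-run subdirectory (cleaned up when dropped).
let seen_root = config.storage_root.join("archive-seen"); // hypothetical config field
std::fs::create_dir_all(&seen_root)?;
let seen_dir = tempfile::tempdir_in(&seen_root)?;
let seen = FileBackedCidHashSet::new(seen_dir.path())?;
```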
📒 Files selected for processing (3)
src/cid_collections/hash_set.rs, src/dev/subcommands/export_state_tree_cmd.rs, src/tool/subcommands/archive_cmd.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/dev/subcommands/export_state_tree_cmd.rs
Summary of changes
To reduce the memory requirement for chain export and garbage collection.
Performance impact: the time cost of chain export is ~20% larger on my box.
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
Refactor
Documentation
Chores
Tests