5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -30,7 +30,10 @@
### Added

- [#6057](https://github.com/ChainSafe/forest/issues/6057) Added `--no-progress-timeout` to `forest-cli f3 ready` subcommand to exit when F3 is stuck for the given timeout.
- [#6000](https://github.com/ChainSafe/forest/pull/6000) Add support for the `Filecoin.StateDecodeParams` API methods to enable decoding actors method params.

- [#6000](https://github.com/ChainSafe/forest/pull/6000) Added support for the `Filecoin.StateDecodeParams` API methods to enable decoding actors method params.

- [#6079](https://github.com/ChainSafe/forest/pull/6079) Added prometheus metrics `network_version`, `network_version_revision` and `actor_version`.

- [#6068](https://github.com/ChainSafe/forest/issues/6068) Added `--index-backfill-epochs` to `forest-tool api serve`.

30 changes: 30 additions & 0 deletions docs/docs/users/reference/metrics.md
@@ -17,6 +17,9 @@ title: Metrics
| `full_peers` | Gauge | Count | Number of healthy peers recognized by the node |
| `bad_peers` | Gauge | Count | Number of bad peers recognized by the node |
| `expected_network_height` | Gauge | Count | The expected network height based on the current time and the genesis block time |
| `network_version` | Gauge | Count | Network version of the current chain head |
| `network_version_revision` | Gauge | Count | Network version revision of the current chain head |
| `actor_version` | Gauge | Count | Actor version of the current chain head |
| `forest_db_size` | Gauge | Bytes | Size of Forest database in bytes |
| `bitswap_message_count` | Counter | Count | Number of `bitswap` messages. Indexed by `type` |
| `bitswap_container_capacities` | Gauge | Count | Capacity for each `bitswap` container. Indexed by `type` |
@@ -288,6 +291,33 @@ expected_network_height 2519530
```
</details>

<details>
<summary>Example `network_version` output</summary>
```
# HELP network_version Network version of the current chain head
# TYPE network_version gauge
network_version 27
```
</details>

<details>
<summary>Example `network_version_revision` output</summary>
```
# HELP network_version_revision Network version revision of the current chain head
# TYPE network_version_revision gauge
network_version_revision 0
```
</details>

<details>
<summary>Example `actor_version` output</summary>
```
# HELP actor_version Actor version of the current chain head
# TYPE actor_version gauge
actor_version 17
```
</details>
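
The gauges documented above are plain unlabeled samples, so they are easy to read back in a smoke test or script. The following is a minimal sketch and not part of this PR: `gauge_value` is a hypothetical helper name, and it assumes the scrape body contains lines of the form `name value` as in the examples above (labeled samples such as `name{type="x"} 1` are not handled).

```rust
/// Return the value of an unlabeled gauge from a Prometheus text scrape body.
/// Hypothetical helper for illustration; labeled samples are out of scope.
fn gauge_value(body: &str, name: &str) -> Option<i64> {
    body.lines()
        .map(str::trim)
        // Skip blank lines and `# HELP` / `# TYPE` comment lines.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .find_map(|line| {
            // Split the sample into metric name and value at the first whitespace.
            let (metric, value) = line.split_once(char::is_whitespace)?;
            if metric == name {
                value.trim().parse().ok()
            } else {
                None
            }
        })
}
```

Matching on the full metric name (rather than a prefix) keeps `network_version` from accidentally picking up the `network_version_revision` sample.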

<details>
<summary>Example `build_info` output</summary>
```
6 changes: 1 addition & 5 deletions src/chain/store/chain_store.rs
@@ -6,6 +6,7 @@ use super::{
index::{ChainIndex, ResolveNullTipset},
tipset_tracker::TipsetTracker,
};
use crate::db::{EthMappingsStore, EthMappingsStoreExt, IndicesStore, IndicesStoreExt};
use crate::interpreter::{BlockMessages, VMTrace};
use crate::libp2p_bitswap::{BitswapStoreRead, BitswapStoreReadWrite};
use crate::message::{ChainMessage, Message as MessageTrait, SignedMessage};
@@ -22,10 +23,6 @@ use crate::{
blocks::{CachingBlockHeader, Tipset, TipsetKey, TxMeta},
db::HeaviestTipsetKeyProvider,
};
use crate::{
chain_sync::metrics,
db::{EthMappingsStore, EthMappingsStoreExt, IndicesStore, IndicesStoreExt},
};
use crate::{fil_cns, utils::cache::SizeTrackingLruCache};
use ahash::{HashMap, HashMapExt, HashSet};
use anyhow::Context as _;
@@ -146,7 +143,6 @@ where

/// Sets heaviest tipset
pub fn set_heaviest_tipset(&self, ts: Arc<Tipset>) -> Result<(), Error> {
metrics::HEAD_EPOCH.set(ts.epoch());
self.heaviest_tipset_key_provider
.set_heaviest_tipset_key(ts.key())?;
if self.publisher.send(HeadChange::Apply(ts)).is_err() {
11 changes: 1 addition & 10 deletions src/chain_sync/metrics.rs
@@ -3,7 +3,7 @@

use prometheus_client::{
encoding::{EncodeLabelKey, EncodeLabelSet, EncodeLabelValue, LabelSetEncoder},
metrics::{counter::Counter, family::Family, gauge::Gauge, histogram::Histogram},
metrics::{counter::Counter, family::Family, histogram::Histogram},
};
Comment on lines +6 to 7
@coderabbitai coderabbitai Bot Sep 16, 2025

⚠️ Potential issue

🧩 Analysis chain

LGTM: import cleanup matches HEAD_EPOCH removal

No stray Gauge usage here. Please confirm no remaining references to HEAD_EPOCH across the repo.


🏁 Script executed:

#!/bin/bash
# Expect: no results
rg -nP '\bHEAD_EPOCH\b'

Length of output: 188


HEAD_EPOCH references remain — remove or justify
Found in scripts/tests/calibnet_eth_mapping_check.sh:19–20 (HEAD_EPOCH=$(...) and EPOCH=$((HEAD_EPOCH - 1))). Remove or document why this reference must remain.

🤖 Prompt for AI Agents
In src/chain_sync/metrics.rs around lines 6 to 7, the review flags lingering
references to HEAD_EPOCH in scripts/tests/calibnet_eth_mapping_check.sh (lines
19–20); either remove those HEAD_EPOCH references from the script or add a clear
justification. Fix by deleting the HEAD_EPOCH calculation and adjust downstream
logic to use a documented alternative (e.g., explicit EPOCH input, a
configurable env var, or a call to a stable API), or add a comment above the
script lines explaining why HEAD_EPOCH is required, its expected source, and add
validation/fallback logic to avoid silent failures.

Reply (Contributor, Author):

`HEAD_EPOCH` in the script is extracted from the `forest-cli info show` output, which is unrelated to the metrics.

Reply (Contributor):

@hanabi1224 You're absolutely right - thank you for the clarification! The HEAD_EPOCH in the shell script scripts/tests/calibnet_eth_mapping_check.sh is indeed a shell variable extracted from forest-cli info show output, which is completely unrelated to the Rust metrics HEAD_EPOCH constant that was removed from src/chain_sync/metrics.rs.

My analysis incorrectly flagged this as a potential issue when it's actually a legitimate usage in a different context. The script usage is fine and should remain as-is.


✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#6079
File: src/chain_sync/metrics.rs:6-7
Timestamp: 2025-09-16T12:55:26.912Z
Learning: HEAD_EPOCH references in shell scripts (like scripts/tests/calibnet_eth_mapping_check.sh) that extract data from `forest-cli info show` are unrelated to Rust metrics constants with the same name and should not be flagged when metrics cleanup is performed.



🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5969
File: src/tool/subcommands/snapshot_cmd.rs:412-412
Timestamp: 2025-08-25T13:35:24.230Z
Learning: In src/tool/subcommands/snapshot_cmd.rs, the +1 in `last_epoch = ts.epoch() - epochs as i64 + 1` fixes an off-by-1 bug where specifying --check-stateroots=N would validate N+1 epochs instead of N epochs, causing out-of-bounds errors when the snapshot contains only N recent state roots.

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

use std::sync::LazyLock;

@@ -44,15 +44,6 @@ pub static INVALID_TIPSET_TOTAL: LazyLock<Counter> = LazyLock::new(|| {
);
metric
});
pub static HEAD_EPOCH: LazyLock<Gauge> = LazyLock::new(|| {
let metric = Gauge::default();
crate::metrics::default_registry().register(
"head_epoch",
"Latest epoch synchronized to the node",
metric.clone(),
);
metric
});

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct Libp2pMessageKindLabel(&'static str);
49 changes: 45 additions & 4 deletions src/daemon/mod.rs
@@ -15,8 +15,10 @@ use crate::cli_shared::{
chain_path,
cli::{CliOpts, Config},
};
use crate::daemon::context::{AppContext, DbType};
use crate::daemon::db_util::import_chain_as_forest_car;
use crate::daemon::{
context::{AppContext, DbType},
db_util::import_chain_as_forest_car,
};
use crate::db::gc::SnapshotGarbageCollector;
use crate::db::ttl::EthMappingCollector;
use crate::libp2p::{Libp2pService, PeerManager};
@@ -26,6 +28,7 @@ use crate::rpc::RPCState;
use crate::rpc::eth::filter::EthEventHandler;
use crate::rpc::start_rpc;
use crate::shim::clock::ChainEpoch;
use crate::shim::state_tree::StateTree;
use crate::shim::version::NetworkVersion;
use crate::utils;
use crate::utils::{proofs_api::ensure_proof_params_downloaded, version::FOREST_VERSION_STRING};
@@ -202,10 +205,47 @@ async fn maybe_start_metrics_service(
);
let db_directory = crate::db::db_engine::db_root(&chain_path(config))?;
let db = ctx.db.writer().clone();
services.spawn(async {
crate::metrics::init_prometheus(prometheus_listener, db_directory, db)

let get_chain_head_height = Arc::new({
    // Use `Weak` to avoid deadlocking GC.
    let chain_store = Arc::downgrade(ctx.state_manager.chain_store());
    move || {
        chain_store
            .upgrade()
            .map(|cs| cs.heaviest_tipset().epoch())
            .unwrap_or_default()
    }
});
let get_chain_head_actor_version = Arc::new({
    // Use `Weak` to avoid deadlocking GC.
    let chain_store = Arc::downgrade(ctx.state_manager.chain_store());
    move || {
        if let Some(cs) = chain_store.upgrade()
            && let Ok(state) =
                StateTree::new_from_root(cs.db.clone(), cs.heaviest_tipset().parent_state())
            && let Ok(bundle_meta) = state.get_actor_bundle_metadata()
            && let Ok(actor_version) = bundle_meta.actor_major_version()
        {
            return actor_version;
        }
        0
    }
});
services.spawn({
    let chain_config = ctx.chain_config().clone();
    let get_chain_head_height = get_chain_head_height.clone();
    async {
        crate::metrics::init_prometheus(
            prometheus_listener,
            db_directory,
            db,
            chain_config,
            get_chain_head_height,
            get_chain_head_actor_version,
        )
        .await
        .context("Failed to initiate prometheus server")
    }
});

crate::metrics::register_collector(Box::new(
@@ -215,6 +255,7 @@ async fn maybe_start_metrics_service(
.chain_store()
.genesis_block_header()
.timestamp,
get_chain_head_height,
),
));
}
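
The getter closures in this hunk capture a `Weak` reference to the chain store rather than an `Arc`, so the metrics service does not keep the store alive and falls back to a default value once the store is gone. The pattern can be sketched in isolation as follows. This is an illustration under stated assumptions, not Forest code: `ChainStore` here is a hypothetical stand-in with a single field, and `head_height_getter` is an invented helper name.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the real chain store; only the head epoch matters here.
struct ChainStore {
    head_epoch: i64,
}

/// Build a shareable getter that does not keep the store alive.
/// The closure holds a `Weak` handle and upgrades it on every call.
fn head_height_getter(store: &Arc<ChainStore>) -> Arc<impl Fn() -> i64 + Send + Sync + 'static> {
    let weak: Weak<ChainStore> = Arc::downgrade(store);
    Arc::new(move || {
        weak.upgrade()
            .map(|cs| cs.head_epoch)
            // If the store has been dropped, report the default (0).
            .unwrap_or_default()
    })
}
```

Because the getter is wrapped in an `Arc`, it can be cloned cheaply and handed to several consumers, which mirrors how `get_chain_head_height` is passed both to `init_prometheus` and to the collector registered afterwards.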
12 changes: 11 additions & 1 deletion src/metrics/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@

pub mod db;

use crate::db::DBStatistics;
use crate::{db::DBStatistics, networks::ChainConfig, shim::clock::ChainEpoch};
use axum::{Router, http::StatusCode, response::IntoResponse, routing::get};
use parking_lot::{RwLock, RwLockWriteGuard};
use prometheus_client::{
@@ -86,6 +86,9 @@ pub async fn init_prometheus<DB>(
prometheus_listener: TcpListener,
db_directory: PathBuf,
db: Arc<DB>,
chain_config: Arc<ChainConfig>,
get_chain_head_height: Arc<impl Fn() -> ChainEpoch + Send + Sync + 'static>,
get_chain_head_actor_version: Arc<impl Fn() -> u64 + Send + Sync + 'static>,
) -> anyhow::Result<()>
where
DB: DBStatistics + Send + Sync + 'static,
@@ -101,6 +104,13 @@ where
crate::utils::version::ForestVersionCollector::new(),
));
register_collector(Box::new(crate::metrics::db::DBCollector::new(db_directory)));
register_collector(Box::new(
crate::networks::metrics::NetworkVersionCollector::new(
chain_config,
get_chain_head_height,
get_chain_head_actor_version,
),
));

// Create and configure HTTP server
let app = Router::new()
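
The body of `NetworkVersionCollector` is not shown in this diff, but its constructor arguments suggest that values are computed at scrape time from the chain configuration and the two getter closures, instead of being pushed into a gauge on every head change as the removed `HEAD_EPOCH` was. A rough sketch of that idea follows. Everything in it is an assumption made for illustration: `UpgradeSchedule`, `VersionCollector`, and `samples` are hypothetical names, the rule that a version activates at `epoch >= height` is assumed, and the real collector implements the `prometheus_client` collector interface rather than returning a `Vec`.

```rust
use std::sync::Arc;

/// Hypothetical upgrade schedule: (activation epoch, network version), sorted by epoch.
struct UpgradeSchedule(Vec<(i64, u32)>);

impl UpgradeSchedule {
    /// Latest version whose activation epoch is at or before `epoch` (assumed rule).
    fn version_at(&self, epoch: i64) -> u32 {
        self.0
            .iter()
            .rev()
            .find(|(height, _)| epoch >= *height)
            .map(|(_, version)| *version)
            .unwrap_or(0)
    }
}

/// Hypothetical pull-based collector: nothing is stored, values are derived per scrape.
struct VersionCollector<H, A> {
    schedule: Arc<UpgradeSchedule>,
    head_height: Arc<H>,
    actor_version: Arc<A>,
}

impl<H, A> VersionCollector<H, A>
where
    H: Fn() -> i64,
    A: Fn() -> u64,
{
    /// Compute the current samples by calling the getters.
    fn samples(&self) -> Vec<(&'static str, i64)> {
        let epoch = (*self.head_height)();
        vec![
            ("network_version", i64::from(self.schedule.version_at(epoch))),
            ("actor_version", (*self.actor_version)() as i64),
        ]
    }
}
```

Deriving the values on demand means the chain store no longer needs to know about metrics, which matches the removal of the `metrics::HEAD_EPOCH.set(...)` call from `set_heaviest_tipset` in this PR.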