A Caddy dynamic upstream module that monitors blockchain node health across multiple protocols (Cosmos, EVM, Beacon) and removes unhealthy nodes from the load balancer pool in real time. It provides blockchain-aware failover so that traffic only reaches nodes that are responsive and in sync.
Note
This is not an official repository of the Caddy Web Server organization.
- Cosmos SDK chains - RPC (`/status`) and REST API (`/cosmos/base/tendermint/v1beta1/syncing`) health checks
- EVM chains - JSON-RPC (`eth_blockNumber`) validation
- Beacon (Ethereum consensus) - REST (`/eth/v1/node/syncing`, `/eth/v1/beacon/headers/head`) validation
- Flexible endpoints - Support for separated RPC/REST services or combined nodes
- Block height comparison - Within pools and against external references
- Sync status monitoring - Detects `catching_up` state for Cosmos nodes and `is_syncing` for Beacon nodes
- Real-time validation - Immediate unhealthy node removal from pools
- External references - Validate against trusted providers
- Concurrent checks - Parallel health validation with configurable limits
- Circuit breaker pattern - Prevents overwhelming unhealthy nodes
- Graceful degradation - Minimum healthy node enforcement
- Retry logic - Exponential backoff with jitter
- TTL-based caching - Optimized performance with configurable duration
- Prometheus metrics - Comprehensive monitoring with node-level granularity
- Health endpoint - Real-time status API with detailed diagnostics
- Structured logging - Configurable log levels with request tracing
- Performance tracking - Response times and error rates
Build Caddy with this plugin using xcaddy:

```bash
xcaddy build --with github.com/chalabi2/caddy-blockchain-health
```

Migration Note: This module replaces traditional HTTP health checks with blockchain-aware monitoring. The directive is `dynamic blockchain_health` within reverse proxy configurations.

Or add to your xcaddy.json:

```json
{
  "dependencies": [
    {
      "module": "github.com/chalabi2/caddy-blockchain-health",
      "version": "latest"
    }
  ]
}
```

Basic Caddyfile configuration using environment variables:
```caddy
{
    admin localhost:2019
}

blockchain-api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Explicit configuration (recommended)
            rpc_servers {$COSMOS_SERVERS}
            node_type "cosmos"       # ← Protocol type (health checker)
            chain_type "cosmos-hub"  # ← Chain identifier (grouping)

            # Production settings
            min_healthy_nodes 2
            circuit_breaker_threshold 0.8
            metrics_enabled true
        }
    }

    # Health endpoint
    handle /health {
        reverse_proxy localhost:8080
    }
}
```
Ethereum configuration:

```caddy
ethereum-api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Explicit configuration (recommended)
            evm_servers {$ETH_SERVERS}
            node_type "evm"        # ← Protocol type (health checker)
            chain_type "ethereum"  # ← Chain identifier (grouping)

            # Production settings
            min_healthy_nodes 1
            metrics_enabled true
        }
    }
}
```

Set your environment variables:

```bash
export COSMOS_SERVERS="http://cosmos-1:26657 http://cosmos-2:26657 http://cosmos-3:26657"
export ETH_SERVERS="http://eth-1:8545 http://eth-2:8545"
```

### Beacon (Ethereum Consensus) Example

```caddy
beacon.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            servers {$BEACON_SERVERS}
            node_type "beacon"
            chain_type "ethereum-beacon"

            # Recommended production tolerance
            block_height_threshold 32
            check_interval "10s"
            timeout "10s"
            min_healthy_nodes 1
            metrics_enabled true
        }
    }
}
```

Environment variables:

```bash
export BEACON_SERVERS="http://beacon-1:3500 http://beacon-2:3500"
```

Note: Complete example configurations are available in the `example_configs/` directory.
The plugin supports three main usage patterns:
Pattern 1 (Multi-Chain):

```caddy
blockchain-api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Auto-discovery from environment variables
            auto_discover_from_env "COSMOS"

            # Additional EVM servers
            evm_servers {$ETH_SERVERS}

            # Comprehensive health monitoring
            check_interval "10s"
            timeout "3s"
            retry_attempts 3
            block_height_threshold 3

            # Production resilience
            min_healthy_nodes 2
            circuit_breaker_threshold 0.8
            cache_duration "30s"
            max_concurrent_checks 10

            # Full monitoring
            metrics_enabled true
        }
    }
}
```

Environment variables:

```bash
export COSMOS_RPC_SERVERS="http://cosmos-us-east-1:26657 http://cosmos-eu-west-1:26657"
export COSMOS_API_SERVERS="http://cosmos-us-east-1:1317 http://cosmos-eu-west-1:1317"
export COSMOS_WS_SERVERS="ws://cosmos-us-east-1:26657/websocket ws://cosmos-eu-west-1:26657/websocket"
export ETH_SERVERS="http://ethereum-1:8545 http://ethereum-2:8545"
```

Pattern 2 (Separated Services):

```caddy
# Cosmos RPC endpoint
cosmos-rpc.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            rpc_servers {$COSMOS_RPC_SERVERS}
            chain_type "cosmos"
            service_type "rpc"
            check_interval "15s"
            min_healthy_nodes 1
            metrics_enabled true
        }
    }
}

# Cosmos REST API endpoint
cosmos-api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            api_servers {$COSMOS_API_SERVERS}
            chain_type "cosmos"
            service_type "api"
            check_interval "15s"
            min_healthy_nodes 1
            metrics_enabled true
        }
    }
}
```

Environment variables:

```bash
export COSMOS_RPC_SERVERS="http://cosmos-node-1:26657 http://cosmos-node-2:26657"
export COSMOS_API_SERVERS="http://cosmos-node-1:1317 http://cosmos-node-2:1317"
```

Pattern 3 (Development):

```caddy
dev-blockchain.localhost {
    reverse_proxy {
        dynamic blockchain_health {
            # Generic server list with auto-detection
            servers {$DEV_SERVERS}

            # Relaxed settings for development
            check_interval "5s"
            timeout "2s"
            block_height_threshold 10
            circuit_breaker_threshold 0.9
            fallback_behavior "disable_health_checks"
            log_level "debug"

            # No minimum nodes required in development
            min_healthy_nodes 0
        }
    }
}
```

Environment variables:

```bash
export DEV_SERVERS="http://localhost:26657 http://localhost:1317 http://localhost:8545"
```

- Pattern 1 (Multi-Chain): Full health validation - Checks all configured endpoints with comprehensive monitoring.
- Pattern 2 (Separated Services): Service-specific validation - Only checks the specific service type (RPC or REST) without redundant checks.
- Pattern 3 (Development): Relaxed validation - Lenient settings suitable for local development and testing.

Recommendation: Use Pattern 1 for production APIs requiring maximum reliability, Pattern 2 for microservice architectures, and Pattern 3 for development environments.
The plugin now supports simplified environment variable-based configuration:
| Option | Description | Example |
|---|---|---|
| `servers` | Generic space-separated server list with auto-detection | `{$BLOCKCHAIN_SERVERS}` |
| `rpc_servers` | Cosmos RPC servers (port 26657) | `{$COSMOS_RPC_SERVERS}` |
| `api_servers` | Cosmos REST API servers (port 1317) | `{$COSMOS_API_SERVERS}` |
| `websocket_servers` | Cosmos WebSocket servers | `{$COSMOS_WS_SERVERS}` |
| `evm_servers` | EVM JSON-RPC servers (port 8545) | `{$ETH_SERVERS}` |
| `evm_ws_servers` | EVM WebSocket servers (port 8546) | `{$ETH_WS_SERVERS}` |
| `chain_preset` | Predefined chain configuration (`cosmos-hub`, `ethereum`, `althea`) | `"cosmos-hub"` |
| `auto_discover_from_env` | Auto-discover from environment variables with prefix | `"COSMOS"` |
| `chain_type` | Specific blockchain identifier for grouping (`ethereum`, `base`, `akash`, etc.) | `"cosmos"` |
| `node_type` | Protocol type for health checker selection (`cosmos`, `evm`, `beacon`) | Auto-detected |
| `legacy_mode` | Backward compatibility mode | `true` |

Per-node options (explicit `node` blocks):

| Option | Description | Default | Required |
|---|---|---|---|
| `name` | Unique identifier for the node | - | yes |
| `url` | Primary endpoint URL (RPC for Cosmos, JSON-RPC for EVM) | - | yes |
| `api_url` | Optional REST API URL for Cosmos nodes | - | no |
| `websocket_url` | Optional WebSocket URL for real-time connections | - | no |
| `type` | Node type (`cosmos` or `evm`) | - | yes |
| `weight` | Load balancing weight | `100` | no |
| `metadata` | Optional key-value metadata | `{}` | no |
The plugin intelligently handles Cosmos SDK chains with separate RPC and REST endpoints:
Scenario 1: Combined Node (Single Service)
```caddy
cosmos-combined.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Auto-discovery will find both RPC and API servers
            auto_discover_from_env "COSMOS"

            # Or specify both explicitly
            rpc_servers {$COSMOS_RPC_SERVERS}
            api_servers {$COSMOS_API_SERVERS}
        }
    }
}
```

Environment variables:

```bash
export COSMOS_RPC_SERVERS="http://cosmos-node:26657"
export COSMOS_API_SERVERS="http://cosmos-node:1317"
```

- Health checks RPC first, REST as fallback - Tries RPC (`/status`), then REST (`/cosmos/base/tendermint/v1beta1/syncing`) if RPC fails
- Fallback redundancy - Node stays available if either service responds
- Recommended for full-node infrastructure
Scenario 2: Separated Services (Microservice Architecture)
```caddy
# Cosmos RPC load balancer
cosmos-rpc.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            rpc_servers {$COSMOS_RPC_SERVERS}  # Only RPC
            chain_type "cosmos"
            service_type "rpc"
        }
    }
}

# Cosmos REST API load balancer
cosmos-api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            api_servers {$COSMOS_API_SERVERS}  # Only REST
            chain_type "cosmos"
            service_type "api"
        }
    }
}
```

Environment variables:

```bash
export COSMOS_RPC_SERVERS="http://cosmos-rpc-1:26657 http://cosmos-rpc-2:26657"
export COSMOS_API_SERVERS="http://cosmos-api-1:1317 http://cosmos-api-2:1317"
```

- Health checks appropriate endpoint - RPC or REST based on URL pattern
- No redundant checks - Each service validates its specific protocol
- Recommended for microservice deployments
Auto-Detection Logic:
- Port 26657 or `/status` path → RPC health check
- Port 1317 or `/cosmos/` path → REST API health check
- Both `url` and `api_url` specified → Checks both endpoints
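
The detection rules above can be sketched in a few lines of Python. This is only an illustration of the stated port and path heuristics, not the plugin's actual code; the function names `detect_check` and `checks_for_node` are hypothetical.

```python
from typing import List, Optional
from urllib.parse import urlparse


def detect_check(url: str) -> str:
    """Classify a Cosmos endpoint URL as "rpc", "rest", or "unknown"."""
    parsed = urlparse(url)
    # Port 26657 or a /status path indicates the RPC service.
    if parsed.port == 26657 or parsed.path.startswith("/status"):
        return "rpc"
    # Port 1317 or a /cosmos/ path indicates the REST API.
    if parsed.port == 1317 or parsed.path.startswith("/cosmos/"):
        return "rest"
    return "unknown"


def checks_for_node(url: str, api_url: Optional[str] = None) -> List[str]:
    """When both url and api_url are given, both endpoints are checked."""
    checks = [detect_check(url)]
    if api_url:
        checks.append(detect_check(api_url))
    return checks


print(checks_for_node("http://cosmos-node:26657", "http://cosmos-node:1317"))
```

A URL that matches neither rule is reported as `"unknown"` here; how the plugin treats such endpoints is not specified in this document.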
The plugin provides comprehensive WebSocket support for real-time blockchain connections:
Cosmos WebSocket Configuration:
```caddy
cosmos-websocket.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Auto-discovery generates WebSocket URLs automatically
            auto_discover_from_env "COSMOS"

            # Or specify explicit WebSocket servers
            websocket_servers {$COSMOS_WS_SERVERS}
            chain_type "cosmos"
            service_type "websocket"
        }
    }
}
```

Environment variables:

```bash
export COSMOS_RPC_SERVERS="http://cosmos-node:26657"
export COSMOS_API_SERVERS="http://cosmos-node:1317"
export COSMOS_WS_SERVERS="ws://cosmos-node:26657/websocket"
```

- Tendermint WebSocket subscriptions - Tests `tm.event = 'NewBlock'` subscriptions
- Real-time event streaming - Validates connectivity for live event monitoring
- Auto scheme conversion - Converts `http`/`https` to `ws`/`wss` automatically
EVM WebSocket Configuration:
```caddy
ethereum-websocket.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # IMPORTANT: Both HTTP and WebSocket servers must be specified
            # for proper health checking and correlation
            evm_servers {$ETH_SERVERS}        # HTTP endpoints for health checks
            evm_ws_servers {$ETH_WS_SERVERS}  # WebSocket endpoints for proxy
            chain_type "evm"
        }
    }
}
```

Environment variables (servers correlated by index/hostname):

```bash
# HTTP JSON-RPC endpoints (used for health checks)
export ETH_SERVERS="http://geth-node1:8545 http://geth-node2:8545"
# WebSocket endpoints (used for proxy, correlated with HTTP endpoints)
export ETH_WS_SERVERS="ws://geth-node1:8546 ws://geth-node2:8546"
```

Custom Ports Example:

```bash
# Your production setup with custom ports
export BASE_SERVERS="http://95.216.38.96:13245 http://8.40.118.101:13245"
export BASE_WS_SERVERS="ws://95.216.38.96:13246 ws://8.40.118.101:13246"
```

- Intelligent correlation - Automatically correlates WebSocket and HTTP endpoints by hostname or index
- HTTP health checks - Uses correlated HTTP endpoints for `eth_blockNumber` validation
- WebSocket proxy - Routes WebSocket traffic to healthy WebSocket endpoints
- Block height validation - Full blockchain health checking via HTTP while proxying to WebSocket
- Custom ports supported - No assumptions about standard ports (8545/8546)
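
To make "correlated by hostname or index" concrete, here is a minimal Python sketch. It assumes hostname matching is tried first and list position is the fallback; the plugin's real precedence is not documented here, and `correlate` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def correlate(http_urls, ws_urls):
    """Map each WebSocket URL to the HTTP URL used for its health check."""
    by_host = {urlparse(u).hostname: u for u in http_urls}
    pairs = {}
    for index, ws_url in enumerate(ws_urls):
        host = urlparse(ws_url).hostname
        if host in by_host:
            pairs[ws_url] = by_host[host]       # same hostname, any port
        elif index < len(http_urls):
            pairs[ws_url] = http_urls[index]    # fall back to list position
        else:
            pairs[ws_url] = None                # nothing to correlate with
    return pairs


eth_servers = "http://geth-node1:8545 http://geth-node2:8545".split()
eth_ws_servers = "ws://geth-node1:8546 ws://geth-node2:8546".split()
pairs = correlate(eth_servers, eth_ws_servers)
for ws_url, http_url in pairs.items():
    print(ws_url, "->", http_url)
```

Because only the hostname is compared, the sketch makes no assumption about ports, which matches the custom-port example above. A WebSocket URL with no HTTP counterpart maps to `None`, which is the mismatched-count situation to avoid.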
WebSocket Health Checking:
- Correlation-based - WebSocket nodes use correlated HTTP endpoints for health validation
- Full blockchain validation - Block height, sync status, and external reference checking
- Non-blocking WebSocket tests - Optional WebSocket connectivity verification (informational)
- Timeout protection - 3-second read timeout prevents hanging connections
- Protocol-specific tests - Uses appropriate subscription methods per blockchain type
EVM nodes use JSON-RPC protocol and don't have separate RPC/REST endpoints like Cosmos:
Standard EVM Configuration:
```caddy
ethereum-primary.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Ethereum preset with auto-configuration
            chain_preset "ethereum"
            evm_servers {$ETH_SERVERS}

            # Auto-generates WebSocket URLs and external references
            min_healthy_nodes 1
            metrics_enabled true
        }
    }
}
```

Environment variables:

```bash
export ETH_SERVERS="http://ethereum-node:8545"
```

- Single endpoint - All requests use JSON-RPC over HTTP
- Health check via `eth_blockNumber` - Validates node responsiveness and current block
- No separate API URL needed - EVM protocol is unified
EVM Service Types (by function, not protocol):
```caddy
ethereum-mixed.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Mixed node types with different capabilities
            evm_servers {$ETH_ARCHIVE_SERVERS} {$ETH_FULL_SERVERS} {$ETH_LIGHT_SERVERS}
            chain_type "evm"

            # Standard EVM health checking
            check_interval "12s"
            block_height_threshold 2
        }
    }
}
```

Environment variables:

```bash
export ETH_ARCHIVE_SERVERS="http://archive-node:8545"
export ETH_FULL_SERVERS="http://full-node:8545"
export ETH_LIGHT_SERVERS="http://light-node:8545"
```

Key Differences from Cosmos:

| Aspect | Cosmos SDK | EVM Chains |
|---|---|---|
| Protocol | RPC (26657) + REST (1317) | JSON-RPC (8545) |
| Health Check | `/status` + `/cosmos/base/tendermint/v1beta1/syncing` | `eth_blockNumber` |
| Endpoints | Separate RPC/REST URLs possible | Single JSON-RPC endpoint |
| Sync Status | `catching_up` boolean | Block height comparison |
| Differentiation | Service type (RPC vs REST) | Node type (archive/full/light) |
Problem Solved: Previously, all EVM chains (Ethereum, Base, Arbitrum, etc.) were compared against each other, causing nodes to be incorrectly marked as unhealthy due to vastly different block heights across chains.
Solution: The system now uses two separate configuration fields:
- `node_type`: Determines which health checker to use (`cosmos` or `evm`)
- `chain_type`: Groups nodes for block height validation (e.g., `ethereum`, `base`, `akash`)
Before (Cross-Chain Comparison):
All EVM nodes grouped together:
- Ethereum node: 36,282,000 blocks
- Base node: 23,485,000 blocks
- Arbitrum node: 7,829,000 blocks
→ Base and Arbitrum marked unhealthy (millions of blocks "behind" Ethereum)
After (Chain-Specific Isolation):
Each chain has its own validation pool:
- Ethereum pool: [ethereum nodes only] → compared among themselves
- Base pool: [base nodes only] → compared among themselves
- Arbitrum pool: [arbitrum nodes only] → compared among themselves
→ All nodes healthy within their respective chains
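
The difference between the two behaviours can be reproduced with a short Python sketch. It is an illustration of the grouping idea only, using the block heights from the example above; `unhealthy_by_chain` and the node names are hypothetical.

```python
from collections import defaultdict


def unhealthy_by_chain(nodes, threshold):
    """Return names of nodes lagging their own chain's leader.

    nodes: iterable of (name, chain_type, block_height) tuples.
    A node lags when it is more than `threshold` blocks behind the
    highest node that shares its chain_type.
    """
    pools = defaultdict(list)
    for name, chain_type, height in nodes:
        pools[chain_type].append((name, height))

    lagging = set()
    for members in pools.values():
        leader = max(height for _, height in members)
        lagging |= {name for name, height in members if leader - height > threshold}
    return lagging


nodes = [
    ("eth-1", "ethereum", 36_282_000),
    ("eth-2", "ethereum", 36_281_999),
    ("base-1", "base", 23_485_000),
    ("arb-1", "arbitrum", 7_829_000),
]

# Grouped by chain_type: every node is compared only with its own chain.
isolated = unhealthy_by_chain(nodes, threshold=5)

# Forcing one shared group reproduces the old cross-chain false positives.
merged = unhealthy_by_chain([(n, "evm", h) for n, _, h in nodes], threshold=5)

print(sorted(isolated), sorted(merged))
```

With per-chain pools no node is flagged, while the single shared group flags the Base and Arbitrum nodes even though they are current on their own chains.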
Configuration Examples:
```caddy
# Ethereum nodes - isolated validation pool
ethereum.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$ETH_SERVERS}
            node_type "evm"        # ← Protocol type (health checker selection)
            chain_type "ethereum"  # ← Chain identifier (grouping)
            block_height_threshold 3
        }
    }
}

# Base nodes - separate validation pool
base.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$BASE_SERVERS}
            node_type "evm"    # ← Same protocol as Ethereum
            chain_type "base"  # ← Different chain (separate group)
            block_height_threshold 5
        }
    }
}

# Arbitrum nodes - separate validation pool
arbitrum.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$ARBITRUM_SERVERS}
            node_type "evm"        # ← Same protocol as Ethereum/Base
            chain_type "arbitrum"  # ← Different chain (separate group)
            block_height_threshold 10
        }
    }
}
```

Cosmos Chains Work Similarly:

```caddy
# Akash nodes - isolated validation
akash.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            rpc_servers {$AKASH_RPC_SERVERS}
            node_type "cosmos"  # ← Protocol type (health checker selection)
            chain_type "akash"  # ← Chain identifier (grouping)
        }
    }
}

# Osmosis nodes - separate validation
osmosis.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            rpc_servers {$OSMOSIS_RPC_SERVERS}
            node_type "cosmos"    # ← Same protocol as Akash
            chain_type "osmosis"  # ← Different chain (separate group)
        }
    }
}
```

Key Benefits:
- No cross-chain interference - Base nodes won't be marked unhealthy because Ethereum has higher block numbers
- Accurate health validation - Each chain validates against its own network state
- Proper failover - Only truly lagging nodes within the same chain are removed
- Multi-chain support - Run multiple blockchain APIs with confidence
- Explicit configuration - No hardcoded chain names, users specify both protocol and chain
- Backward compatibility - Existing configurations continue to work
For maximum clarity and control, explicitly specify both node_type and chain_type:
```caddy
# Ethereum configuration
ethereum.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$ETH_SERVERS}
            node_type "evm"        # ← Protocol (determines health checker)
            chain_type "ethereum"  # ← Chain ID (determines grouping)
        }
    }
}

# Base configuration (same protocol, different chain)
base.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$BASE_SERVERS}
            node_type "evm"    # ← Same protocol as Ethereum
            chain_type "base"  # ← Different chain (separate validation)
        }
    }
}

# Custom EVM chain
my-chain.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$CUSTOM_SERVERS}
            node_type "evm"           # ← EVM protocol
            chain_type "my-l2-chain"  # ← Custom chain name
        }
    }
}
```

Existing configurations without explicit `node_type` continue to work:

```caddy
# Auto-detects node_type based on chain_type
legacy.api.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$ETH_SERVERS}
            chain_type "ethereum"  # ← Auto-detects node_type as "evm"
        }
    }
}
```

Auto-Detection Rules:

- Known Cosmos chains: `cosmos`, `cosmos-hub`, `akash`, `osmosis`, `juno`, etc. → `node_type "cosmos"`
- Known EVM chains: `ethereum`, `base`, `arbitrum`, `polygon`, etc. → `node_type "evm"`
- Unknown chains: Falls back to URL-based detection
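
As a rough illustration of these rules, the lookup might resemble the Python sketch below. The chain sets contain only the names listed above (the real lists are longer, per the "etc."), and the port-based fallback is an assumption built from the default ports mentioned in this document; `detect_node_type` is a hypothetical name.

```python
from urllib.parse import urlparse

# Only the chain names listed in this README; the plugin knows more.
KNOWN_COSMOS = {"cosmos", "cosmos-hub", "akash", "osmosis", "juno"}
KNOWN_EVM = {"ethereum", "base", "arbitrum", "polygon"}


def detect_node_type(chain_type, url):
    """Return "cosmos", "evm", or None when nothing matches."""
    if chain_type in KNOWN_COSMOS:
        return "cosmos"
    if chain_type in KNOWN_EVM:
        return "evm"
    # Assumed URL fallback: default ports for each protocol.
    port = urlparse(url).port
    if port in (26657, 1317):
        return "cosmos"
    if port in (8545, 8546):
        return "evm"
    return None


print(detect_node_type("my-l2-chain", "http://custom-node:8545"))
```

Setting `node_type` explicitly avoids depending on any such fallback, which matters most for custom chains on non-standard ports.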
The plugin performs internal pool validation and external reference monitoring:
Compares nodes within the same pool and removes lagging nodes from the load balancer:
```caddy
dynamic blockchain_health {
    # These nodes will be compared against each other
    evm_servers {$ETH_POOL_SERVERS}
    chain_type "evm"

    # If any node is more than 5 blocks behind the highest in the pool
    block_height_threshold 5
}
```

Environment variables:

```bash
export ETH_POOL_SERVERS="http://eth-1.internal:8545 http://eth-2.internal:8545 http://eth-3.internal:8545"
```

Logic: If eth-node-1 is at block 18,500,000 and eth-node-2 is at 18,499,994, then eth-node-2 is removed from the load balancer (6 blocks behind > threshold of 5).
Monitors your nodes against trusted external sources for observability (does not affect load balancing):
EVM External References:
```caddy
dynamic blockchain_health {
    # Your nodes
    evm_servers {$YOUR_ETH_SERVERS}
    chain_type "evm"

    # Automatically configured external references when using preset
    chain_preset "ethereum"

    # Or manually configure external references
    external_reference evm {
        name "infura_mainnet"
        url "https://mainnet.infura.io/v3/YOUR_PROJECT_ID"
        enabled true
    }
    external_reference evm {
        name "alchemy_backup"
        url "https://eth-mainnet.alchemyapi.io/v2/YOUR_API_KEY"
        enabled true
    }
    external_reference evm {
        name "public_ethereum"
        url "https://ethereum-rpc.publicnode.com"
        enabled true
    }

    # If your nodes are more than 10 blocks behind external references
    external_reference_threshold 10
}
```

Environment variables:

```bash
export YOUR_ETH_SERVERS="http://your-node:8545"
```

Multi-Chain EVM Examples:
```caddy
# Polygon network
polygon.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$POLYGON_SERVERS}
            chain_type "evm"
            external_reference evm {
                name "polygon_alchemy"
                url "https://polygon-mainnet.g.alchemy.com/v2/YOUR_API_KEY"
                enabled true
            }
            external_reference evm {
                name "polygon_public"
                url "https://polygon-rpc.com"
                enabled true
            }
        }
    }
}

# Binance Smart Chain
bsc.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$BSC_SERVERS}
            chain_type "evm"
            external_reference evm {
                name "bsc_public"
                url "https://bsc-dataseed.binance.org"
                enabled true
            }
            external_reference evm {
                name "bsc_backup"
                url "https://bsc-dataseed1.defibit.io"
                enabled true
            }
        }
    }
}

# Arbitrum network
arbitrum.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            evm_servers {$ARBITRUM_SERVERS}
            chain_type "evm"
            external_reference evm {
                name "arbitrum_alchemy"
                url "https://arb-mainnet.g.alchemy.com/v2/YOUR_API_KEY"
                enabled true
            }
        }
    }
}
```

Environment variables:

```bash
export POLYGON_SERVERS="http://polygon-node:8545"
export BSC_SERVERS="http://bsc-node:8545"
export ARBITRUM_SERVERS="http://arbitrum-node:8545"
```

Cosmos External References:
```caddy
cosmos-external.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Cosmos Hub preset automatically includes external references
            chain_preset "cosmos-hub"
            servers {$COSMOS_SERVERS}

            # Or add custom external references
            external_reference cosmos {
                name "cosmos_polkachu"
                url "https://cosmos-rpc.polkachu.com"
                enabled true
            }
        }
    }
}
```

Environment variables:

```bash
export COSMOS_SERVERS="http://cosmos-node:26657"
```

- Internal Check: Compare all pool nodes → Find highest block height in pool
- Remove Internal Laggards: Nodes > `block_height_threshold` behind pool leader = removed from load balancer
- External Monitoring: Query external references → Get external block heights
- Flag External Laggards: Nodes > `external_reference_threshold` behind external references = flagged in monitoring only
- Final Load Balancing: Only nodes passing internal validation receive traffic
Pool State:
- eth-node-1: Block 18,500,000 (highest in pool)
- eth-node-2: Block 18,499,996 (4 behind, healthy)
- eth-node-3: Block 18,499,990 (10 behind, unhealthy - exceeds threshold 5)
External References:
- infura_mainnet: Block 18,500,002
- alchemy_backup: Block 18,500,001
- Highest external: 18,500,002
External Monitoring (informational):
- eth-node-1: 2 blocks behind external (flagged as up-to-date)
- eth-node-2: 6 blocks behind external (flagged as up-to-date)
- eth-node-3: 12 blocks behind external (flagged as lagging in monitoring)
Final Result: Only eth-node-1 and eth-node-2 receive traffic (based on internal validation only)
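
The worked example can be checked with a small Python sketch. It is a simplified model of the two thresholds described in this section, not the plugin's implementation; the function name `evaluate` and the report keys are hypothetical.

```python
def evaluate(pool, external, block_height_threshold=5, external_reference_threshold=10):
    """Apply internal validation and external monitoring to a pool.

    pool and external map a name to its latest block height.
    Only the internal comparison decides whether a node receives traffic;
    the external comparison is informational.
    """
    pool_leader = max(pool.values())
    highest_external = max(external.values())
    report = {}
    for name, height in pool.items():
        behind_pool = pool_leader - height
        behind_external = highest_external - height
        report[name] = {
            "behind_pool": behind_pool,
            "behind_external": behind_external,
            "receives_traffic": behind_pool <= block_height_threshold,
            "flagged_lagging": behind_external > external_reference_threshold,
        }
    return report


pool = {
    "eth-node-1": 18_500_000,
    "eth-node-2": 18_499_996,
    "eth-node-3": 18_499_990,
}
external = {"infura_mainnet": 18_500_002, "alchemy_backup": 18_500_001}

report = evaluate(pool, external)
serving = [name for name, result in report.items() if result["receives_traffic"]]
print(serving)
```

Running it yields the same outcome as the example: `eth-node-1` and `eth-node-2` serve traffic, and `eth-node-3` (10 behind the pool, 12 behind the external references) is both removed and flagged.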
| Option | Description | Default | Required |
|---|---|---|---|
| `check_interval` | How often to check node health | `15s` | no |
| `timeout` | Request timeout for health checks | `5s` | no |
| `retry_attempts` | Number of retry attempts for failed checks | `3` | no |
| `retry_delay` | Delay between retry attempts | `1s` | no |

| Option | Description | Default | Required |
|---|---|---|---|
| `block_height_threshold` | Maximum blocks behind pool leader | `5` | no |
| `external_reference_threshold` | Maximum blocks behind external reference | `10` | no |
Syntax: external_reference <type> { ... }
| Option | Description | Default | Required |
|---|---|---|---|
| `<type>` | Reference type (`cosmos` or `evm`), specified as the block argument | - | yes |
| `name` | Reference identifier | - | yes |
| `url` | External endpoint URL | - | yes |
| `enabled` | Enable this reference | `true` | no |
Example:
```caddy
external_reference cosmos {
    name "cosmos_public"
    url "https://cosmos-rpc.publicnode.com"
    enabled true
}
```

| Option | Description | Default | Required |
|---|---|---|---|
| `cache_duration` | How long to cache health results | `30s` | no |
| `max_concurrent_checks` | Maximum concurrent health checks | `10` | no |

| Option | Description | Default | Required |
|---|---|---|---|
| `min_healthy_nodes` | Minimum healthy nodes required | `1` | no |
| `grace_period` | How long to keep unhealthy nodes | `60s` | no |
| `circuit_breaker_threshold` | Failure ratio to open circuit breaker | `0.8` | no |

| Option | Description | Default | Required |
|---|---|---|---|
| `metrics_enabled` | Enable Prometheus metrics | `false` | no |
| `log_level` | Logging level (`debug`, `info`, `warn`, `error`) | `info` | no |
| `health_endpoint` | HTTP endpoint for health status | `/health` | no |
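
The `circuit_breaker_threshold` option is described above as a failure ratio. One plausible reading is sketched below in Python: the breaker opens for a node once the share of failed checks in a recent window reaches the threshold. The window size, the `>=` comparison, and the class name `CircuitBreaker` are assumptions for illustration; the plugin's actual bookkeeping may differ.

```python
from collections import deque


class CircuitBreaker:
    """Track recent check outcomes for one node (hypothetical model)."""

    def __init__(self, threshold=0.8, window=10):
        self.threshold = threshold
        # Keep only the most recent `window` outcomes; True means failure.
        self.results = deque(maxlen=window)

    def record(self, failed):
        self.results.append(bool(failed))

    @property
    def failure_ratio(self):
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)

    @property
    def is_open(self):
        # Open circuit: stop sending health checks to this node for now.
        return self.failure_ratio >= self.threshold


breaker = CircuitBreaker(threshold=0.8, window=10)
for failed in [True] * 8 + [False] * 2:
    breaker.record(failed)
print(breaker.failure_ratio, breaker.is_open)
```

A higher threshold such as the `0.9` used in the development pattern tolerates more failures before a node is left alone.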
The plugin performs protocol-specific health checks:
Cosmos (`/status`, abridged to the fields used for health checks):

```json
{
  "jsonrpc": "2.0",
  "id": -1,
  "result": {
    "sync_info": {
      "latest_block_height": "12345678",
      "catching_up": false
    }
  }
}
```

EVM (`eth_blockNumber`):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": "0xbc614e"
}
```

Critical: The plugin validates sync status for Cosmos (`catching_up: false`) and block height for both protocols to ensure nodes are current and healthy.
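
For reference, extracting the relevant values from both response shapes takes only a few lines of Python. This is a sketch of the parsing step only; the helper names are hypothetical, and real responses carry more fields than shown.

```python
import json


def cosmos_sync_state(body):
    """Return (block_height, catching_up) from a Cosmos /status body."""
    data = json.loads(body)
    # RPC responses wrap the payload in "result"; accept a bare object too.
    sync_info = data.get("result", data)["sync_info"]
    # The height is encoded as a decimal string.
    return int(sync_info["latest_block_height"]), bool(sync_info["catching_up"])


def evm_block_height(body):
    """Return the block height from an eth_blockNumber response body."""
    # The result is a 0x-prefixed hexadecimal string.
    return int(json.loads(body)["result"], 16)


cosmos_body = '{"result": {"sync_info": {"latest_block_height": "12345678", "catching_up": false}}}'
evm_body = '{"jsonrpc": "2.0", "id": 1, "result": "0xbc614e"}'

print(cosmos_sync_state(cosmos_body))
print(evm_block_height(evm_body))
```

Both sample bodies encode the same height: `0xbc614e` is 12345678 in decimal.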
The module exposes a comprehensive health endpoint:
```bash
curl http://blockchain-api.example.com/health
```

Response:

```json
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "nodes": {
    "total": 4,
    "healthy": 3,
    "unhealthy": 1
  },
  "external_references": {
    "cosmos_mainnet": {
      "reachable": true,
      "block_height": 12345678
    },
    "ethereum_infura": {
      "reachable": true,
      "block_height": 18500000
    }
  },
  "cache": {
    "total_entries": 4,
    "valid_entries": 3,
    "expired_entries": 1,
    "cache_duration": "30s"
  },
  "last_check": "2024-01-15T10:29:45Z"
}
```

When `metrics_enabled` is true, the module exposes the following metrics:

- `caddy_blockchain_health_checks_total`: Total number of health checks
- `caddy_blockchain_health_healthy_nodes`: Number of healthy nodes
- `caddy_blockchain_health_unhealthy_nodes`: Number of unhealthy nodes
- `caddy_blockchain_health_check_duration_seconds`: Health check duration
- `caddy_blockchain_health_block_height`: Current block height per node
- `caddy_blockchain_health_errors_total`: Error count by node and type
This plugin implements a health-first architecture for optimal blockchain infrastructure management:
- Extract node configuration from Caddyfile/JSON
- Concurrent health checks with protocol-specific validation
- Circuit breaker evaluation per node with failure thresholds
- Block height validation within pools and against external references
- Cache results with TTL to optimize performance
- Dynamic upstream selection based on health status
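
The caching step is what keeps per-request overhead low: upstream selection reads a cached verdict instead of waiting on a live check. A minimal TTL cache in Python is sketched below under that assumption; the class name `HealthCache` and its methods are hypothetical and do not describe the plugin's internal data structures.

```python
import time


class HealthCache:
    """Cache per-node health verdicts for a fixed time-to-live."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._entries = {}          # node name -> (healthy, expires_at)

    def put(self, node, healthy):
        self._entries[node] = (healthy, self.clock() + self.ttl)

    def get(self, node):
        """Return the cached verdict, or None if missing or expired."""
        entry = self._entries.get(node)
        if entry is None:
            return None
        healthy, expires_at = entry
        if self.clock() >= expires_at:
            return None             # stale: caller should re-check the node
        return healthy


cache = HealthCache(ttl_seconds=30.0)
cache.put("cosmos-1", True)
print(cache.get("cosmos-1"), cache.get("cosmos-2"))
```

A shorter `cache_duration` gives fresher results at the cost of more health-check traffic, which is the trade-off the configuration option exposes.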
This design ensures:
- Fast rejection of unhealthy nodes (~0.1-1ms)
- Protocol awareness - blockchain-specific health validation
- High availability - intelligent failover with minimum node enforcement
- Performance - cached results with configurable refresh
- Latency: ~0.1-1ms per request (with caching)
- Memory: Minimal overhead with connection pooling
- Health check operations: Concurrent with configurable limits
- Throughput: Tested at >10,000 RPS with negligible impact
- Cache efficiency: Configurable TTL balances freshness vs performance
```bash
# Clone the repository and set up the development environment
git clone https://github.com/chalabi2/caddy-blockchain-health
cd caddy-blockchain-health
make dev-setup

# Run all tests
make test-all
# Run with coverage
make test-coverage
# Run benchmarks
make benchmark
# Run integration tests (requires Docker)
make test-integration

# Build custom Caddy binary
make xcaddy-build
# Start example configuration
make example-start
# Test with example configs
make example-validate

# Run performance tests with real load
make perf-test
```

If you're currently using basic HTTP health checks for blockchain nodes:
Before (traditional HTTP health checks):

```caddy
api.example.com {
    reverse_proxy {
        to http://node1:26657 http://node2:26657
        health_uri /health
        health_interval 30s
    }
}
```

After (environment-variable configuration):

```caddy
api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            # Explicit configuration (recommended)
            rpc_servers {$COSMOS_SERVERS}
            node_type "cosmos"       # ← Protocol type (health checker)
            chain_type "cosmos-hub"  # ← Chain identifier (grouping)

            # Enhanced settings
            check_interval "15s"
            block_height_threshold 5
            metrics_enabled true
        }
    }
}
```

Environment variables:

```bash
export COSMOS_SERVERS="http://node1:26657 http://node2:26657"
```

Alternative (explicit per-node configuration):

```caddy
api.example.com {
    reverse_proxy {
        dynamic blockchain_health {
            node node1 {
                url "http://node1:26657"
                type "cosmos"
                chain_type "cosmos-hub"  # ← Can specify chain_type per node
                weight 100
            }
            node node2 {
                url "http://node2:26657"
                type "cosmos"
                chain_type "cosmos-hub"  # ← Same chain for grouping
                weight 100
            }
            check_interval "15s"
            block_height_threshold 5
            metrics_enabled true
        }
    }
}
```

Benefits of Explicit Configuration Approach:
- Clear separation of concerns - Protocol type vs chain identifier
- No hardcoded chain names - Support any blockchain without code changes
- Simplified configuration - No manual node definitions needed
- Auto-discovery - Automatic detection of service types
- Custom chain support - Define your own L2s and testnets
- Environment integration - Better CI/CD and deployment workflows
- Backward compatibility - Legacy mode for existing configurations
Benefits of Blockchain-Aware Health Checks:
- Protocol-specific validation (sync status, block height)
- Intelligent failover based on blockchain health
- External reference validation against trusted sources
- Circuit breaker protection for unhealthy nodes
- Comprehensive monitoring with Prometheus metrics
If you're experiencing WebSocket connection failures (e.g., "Received unexpected status code (200 OK)"):
Problem: WebSocket upgrade fails because health checks are being performed on WebSocket URLs instead of HTTP URLs.
Solution: Ensure both evm_servers and evm_ws_servers are configured for proper correlation:
```caddy
handle @websocket {
    reverse_proxy {
        dynamic blockchain_health {
            # CORRECT: Specify both HTTP and WebSocket servers
            evm_servers {$BASE_SERVERS}        # HTTP for health checks
            evm_ws_servers {$BASE_WS_SERVERS}  # WebSocket for proxy
            chain_type "evm"
        }
    }
}
```

```bash
# CORRECT: Correlated by hostname/index
export BASE_SERVERS="http://node1:8545 http://node2:8545"
export BASE_WS_SERVERS="ws://node1:8546 ws://node2:8546"
```

Common Mistakes:

```caddy
# WRONG: Only WebSocket servers specified
dynamic blockchain_health {
    evm_ws_servers {$BASE_WS_SERVERS}  # Missing HTTP servers for health checks
    chain_type "evm"
}

# WRONG: Old service_type approach (deprecated)
dynamic blockchain_health {
    evm_ws_servers {$BASE_WS_SERVERS}
    service_type "evm_websocket"  # No longer needed - causes issues
    chain_type "evm"
}

# WRONG: Mismatched server counts
# BASE_SERVERS="http://node1:8545 http://node2:8545"  # 2 servers
# BASE_WS_SERVERS="ws://node1:8546"                   # 1 server - can't correlate
```

Verification:

Test your WebSocket connection:

```bash
# Should work after fix
websocat wss://your-domain.com/base -H "Authorization: Bearer YOUR_JWT"
```

Check correlation in logs:

```bash
# Enable debug logging to see correlation
./caddy run --config Caddyfile --adapter caddyfile
# Look for: "WebSocket node health check successful via HTTP"
```

Health endpoint verification:

```bash
curl http://your-domain.com/health
# Should show both HTTP and WebSocket nodes as healthy
```

- Caddy: v2.7.0 or higher
- Go: 1.21 or higher
- Protocols: Cosmos SDK, Ethereum/EVM JSON-RPC
MIT License - see LICENSE file.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Add tests for new functionality (`make test`)
- Ensure all tests pass (`make test-all`)
- Submit a pull request
When reporting bugs, please include:
- Caddy version (`./caddy version`)
- Plugin version and build info
- Configuration (Caddyfile or JSON)
- Blockchain node types and versions
- Steps to reproduce
- Expected vs actual behavior
- Relevant logs with debug level enabled
Example:
```bash
# Enable debug logging
make xcaddy-build
./caddy run --config example_configs/Caddyfile
```