feat: peer scoring #20047

**Merged** — mrzeszutko merged 1 commit into `next` from `feature/peer-scoring` on Feb 9, 2026
Conversation

mrzeszutko (Contributor) commented Jan 29, 2026

Gossipsub Peer Scoring

Summary

This PR implements comprehensive gossipsub peer scoring improvements for the Aztec P2P network:

  • Balanced P1/P2/P3 configuration following Lodestar's approach (P3 > P1+P2 for mesh pruning)
  • Dynamic per-topic scoring parameters based on expected message rates
  • Tightened gossipsub score thresholds aligned with application-level scoring
  • Documented application score weight for gossipsub integration
  • Reviewed and documented application-level penalties
  • Network outage analysis showing non-contributing peers are pruned but not disconnected

Motivation

Previously, all gossipsub topics used identical hardcoded scoring parameters, which didn't account for the vastly different message frequencies across topics:

  • Transactions: Unpredictable rate
  • Block proposals: N-1 per slot (where N = blocks per slot in MBPS mode)
  • Checkpoint proposals: 1 per slot
  • Checkpoint attestations: ~48 per slot (committee size)

Additionally, the gossipsub thresholds were borrowed from Lighthouse (Ethereum beacon chain) and were too lax for our scoring system. A banned peer (app score -100) only contributed -1000 to gossipsub, far above the -4000 gossipThreshold, so banned peers still received gossip.

Changes

New Shared Module: @aztec/stdlib/timetable

Created a shared timetable constants module that both p2p and sequencer-client import from:

  • CHECKPOINT_INITIALIZATION_TIME (1s)
  • CHECKPOINT_ASSEMBLE_TIME (1s)
  • DEFAULT_P2P_PROPAGATION_TIME (2s)
  • DEFAULT_L1_PUBLISHING_TIME (12s)
  • MIN_EXECUTION_TIME (2s)
  • calculateMaxBlocksPerSlot() - shared calculation for blocks per slot

Added targetCommitteeSize to L1RollupConstants

The committee size is needed to calculate expected attestation rates. Added to:

  • L1RollupConstants type and schema
  • EpochCache.create() to fetch from rollup contract
  • EpochCacheInterface.getL1Constants() method

New Topic Scoring Module: @aztec/p2p/services/gossipsub/topic_score_params.ts

Implements dynamic scoring parameter calculation with balanced P1/P2/P3 configuration following Lodestar's approach:

| Parameter | Max Score | Configuration |
|-----------|-----------|---------------|
| **P1: timeInMesh** | +8 per topic | Slot-based, caps at 1 hour |
| **P2: firstMessageDeliveries** | +25 per topic | Convergence-based, fast decay |
| **P3: meshMessageDeliveries** | -34 per topic | Must exceed P1+P2 for pruning |
| **P3b: meshFailurePenalty** | -34 per topic | Sticky penalty after pruning |
| **P4: invalidMessageDeliveries** | -20 per message | Attack detection |

| Topic | Expected/Slot | Decay Window | P1/P2/P3 |
|-------|---------------|--------------|----------|
| `tx` | Unpredictable | N/A | **Disabled** (only P4) |
| `block_proposal` | N-1 (MBPS) | 3 slots | Enabled |
| `checkpoint_proposal` | 1 | 5 slots | Enabled |
| `checkpoint_attestation` | ~48 | 2 slots | Enabled |

Key features:

  • Score balance for mesh pruning: P3 max (-34) exceeds P1+P2 max (+33), ensuring non-contributors get pruned
  • No free positive scores: tx topic has P1/P2 disabled to prevent offsetting penalties from other topics
  • P3b total: -102 across 3 topics (well above -500 gossipThreshold, so network issues don't cause disconnection)
  • Multi-slot decay windows: Low-frequency topics decay over more slots to accumulate meaningful counter values
  • Conservative thresholds: Set at 30% of convergence to avoid penalizing honest peers
  • 5-second delivery window: Balanced for TypeScript runtime (between Go implementations at 2s and Lodestar at 12s); accounts for JavaScript I/O latency while limiting replay attacks
  • 5× activation multiplier: Extra grace period during network bootstrap (activation timer starts at mesh join, not first message)

Tightened Gossipsub Thresholds

Updated scoring.ts with thresholds aligned to application-level scoring:

| Threshold | Old Value | New Value | Alignment |
|-----------|-----------|-----------|-----------|
| gossipThreshold | -4000 | -500 | Matches Disconnect state (-50 × 10) |
| publishThreshold | -8000 | -1000 | Matches Ban state (-100 × 10) |
| graylistThreshold | -16000 | -2000 | For severe attacks (ban + topic penalties) |

The 1:2:4 ratio follows Lodestar's approach and gossipsub spec recommendations.

Application Score Weight

Verified appSpecificWeight = 10 creates perfect alignment:

  • Disconnect (-50) × 10 = -500 = gossipThreshold
  • Ban (-100) × 10 = -1000 = publishThreshold

Added documentation in libp2p_service.ts explaining this alignment.

Application Penalties

The existing penalties are well-designed and unchanged:

| Severity | Points | Errors to Disconnect | Errors to Ban |
|----------|--------|----------------------|---------------|
| HighToleranceError | 2 | 25 | 50 |
| MidToleranceError | 10 | 5 | 10 |
| LowToleranceError | 50 | 1 | 2 |
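The "errors to disconnect/ban" columns follow directly from the Disconnect (-50) and Ban (-100) cutoffs. A quick arithmetic check (this sketch ignores score decay between errors, which in practice tolerates slightly more errors; the constant and function names are illustrative, not exports from `peer_scoring.ts`):

```typescript
// Illustrative check of the penalty table; names are hypothetical.
const DISCONNECT_SCORE = -50;
const BAN_SCORE = -100;
const PENALTY_POINTS = { HighToleranceError: 2, MidToleranceError: 10, LowToleranceError: 50 };

// How many errors of a given severity reach a cutoff (ignoring decay).
const errorsUntil = (cutoff: number, points: number): number => Math.ceil(-cutoff / points);

for (const [name, points] of Object.entries(PENALTY_POINTS)) {
  console.log(name, errorsUntil(DISCONNECT_SCORE, points), errorsUntil(BAN_SCORE, points));
}
// HighToleranceError 25 50
// MidToleranceError 5 10
// LowToleranceError 1 2
```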

Added documentation in peer_scoring.ts explaining the alignment with gossipsub thresholds.

How the Systems Work Together

Score Flow

```
Total Gossipsub Score = TopicScore + (AppScore × 10) + IPColocationPenalty
```
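As a sanity check, the composition above can be sketched directly. This is a hypothetical standalone function (gossipsub computes the score internally; `totalScore` is not an export of this PR):

```typescript
// Illustrative sketch of the score composition; not a real export.
const APP_SPECIFIC_WEIGHT = 10; // documented in libp2p_service.ts

function totalScore(topicScore: number, appScore: number, ipColocationPenalty: number): number {
  return topicScore + appScore * APP_SPECIFIC_WEIGHT + ipColocationPenalty;
}

// A banned peer (app score -100) with a neutral topic score now sits
// exactly at the new publishThreshold:
console.log(totalScore(0, -100, 0)); // -1000
```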

Peer State Alignment

| App Score State | App Score | Gossipsub Contribution | Effect |
|-----------------|-----------|------------------------|--------|
| Healthy | 0 to -49 | 0 to -490 | Full participation |
| Disconnect | -50 | -500 | Stops receiving gossip |
| Ban | -100 | -1000 | Cannot publish |
| Attack | -100 + P4 | -2000+ | Graylisted |

Topic Score Contribution

Topic scores are balanced for mesh pruning while allowing recovery from network issues:

| Parameter | Per Topic | Total (3 topics) | Notes |
|-----------|-----------|------------------|-------|
| P1 (timeInMesh) | +8 max | +24 | Caps at 1 hour, resets on mesh leave |
| P2 (firstMessageDeliveries) | +25 max | +75 | Fast decay, negligible after mesh leave |
| P3 (under-delivery) | -34 max | -102 | Must exceed P1+P2 (+33) for pruning |
| P4 (invalid messages) | -20 each | Unlimited | Can spike to -2000+ during attacks |

Key insight: P3 max (-34) > P1+P2 max (+33), so non-contributors are always pruned regardless of how long they've been in mesh.

After pruning: P3b = -102 total, which is well above gossipThreshold (-500), so network issues don't cause disconnection.

Example Scenarios

  1. Honest peer: Score ~0, full participation
  2. Validation failures: Gets LowToleranceError → app score -50 → stops receiving gossip
  3. Banned peer: App score -100 → cannot publish messages
  4. Active attack: Banned + 10 invalid messages → -3000+ → graylisted

Technical Details

Decay Calculation

Counters decay to ~1% over the decay window:

```
heartbeatsPerSlot = slotDurationMs / heartbeatIntervalMs
heartbeatsInWindow = heartbeatsPerSlot * decayWindowSlots
decay = 0.01^(1 / heartbeatsInWindow)
```
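A minimal TypeScript sketch of this calculation. The function name mirrors the formula rather than the actual exports of `topic_score_params.ts`, and the example durations (36s slot, 1s gossipsub heartbeat, 5-slot window) are illustrative assumptions:

```typescript
// Sketch of the decay formula; names and example durations are assumptions.
function computeDecay(slotDurationMs: number, heartbeatIntervalMs: number, decayWindowSlots: number): number {
  const heartbeatsPerSlot = slotDurationMs / heartbeatIntervalMs;
  const heartbeatsInWindow = heartbeatsPerSlot * decayWindowSlots;
  // After heartbeatsInWindow heartbeats, a counter retains 1% of its value.
  return Math.pow(0.01, 1 / heartbeatsInWindow);
}

// e.g. a 5-slot window: decay = 0.01^(1/180) ≈ 0.9747 per heartbeat
console.log(computeDecay(36_000, 1_000, 5));
```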

Convergence and Threshold

Steady-state counter value and conservative threshold:

```
messagesPerHeartbeat = expectedPerSlot * (heartbeatMs / slotDurationMs)
convergence = messagesPerHeartbeat / (1 - decay)
threshold = convergence * 0.3  // 30% conservative factor
```
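Sketched in TypeScript, with the `checkpoint_proposal` topic (1 message per slot, 5-slot decay window) as the worked example. Function name and the 36s-slot / 1s-heartbeat figures are illustrative assumptions, not the module's actual API:

```typescript
// Sketch of the convergence/threshold formulas; names and durations assumed.
function computeConvergence(expectedPerSlot: number, heartbeatMs: number, slotDurationMs: number, decay: number): number {
  const messagesPerHeartbeat = expectedPerSlot * (heartbeatMs / slotDurationMs);
  // Geometric series limit: x = msgs + decay * x  =>  x = msgs / (1 - decay)
  return messagesPerHeartbeat / (1 - decay);
}

const decay = Math.pow(0.01, 1 / 180);                        // 5-slot window at 1s heartbeats
const convergence = computeConvergence(1, 1_000, 36_000, decay); // checkpoint_proposal: 1/slot
const threshold = convergence * 0.3;                          // 30% conservative factor
console.log(convergence, threshold); // ≈ 1.10, ≈ 0.33
```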

Blocks Per Slot

Calculated from timetable constants (same formula used by sequencer):

```
timeAvailable = slotDuration - initOffset - blockDuration - finalizationTime
blocksPerSlot = floor(timeAvailable / blockDuration)
```
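A direct transcription of the formula above. The parameter values in the example are illustrative only, not the network's actual timetable:

```typescript
// Transcription of the blocks-per-slot formula; example values are assumptions.
function blocksPerSlot(slotMs: number, blockMs: number, initOffsetMs: number, finalizationMs: number): number {
  const timeAvailable = slotMs - initOffsetMs - blockMs - finalizationMs;
  return Math.floor(timeAvailable / blockMs);
}

// e.g. 36s slot, 4s blocks, 2s init offset, 14s finalization time:
console.log(blocksPerSlot(36_000, 4_000, 2_000, 14_000)); // 4
```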

Files Changed

New Files

  • yarn-project/stdlib/src/timetable/index.ts - Shared timetable constants
  • yarn-project/stdlib/src/config/sequencer-config.ts - Shared sequencer config mappings (e.g., blockDurationMs)
  • yarn-project/p2p/src/services/gossipsub/topic_score_params.ts - Topic scoring logic
  • yarn-project/p2p/src/services/gossipsub/topic_score_params.test.ts - Unit tests for scoring params
  • yarn-project/p2p/src/services/gossipsub/index.ts - Module exports
  • yarn-project/p2p/src/services/gossipsub/README.md - Documentation

Modified Files

  • yarn-project/stdlib/src/epoch-helpers/index.ts - Added targetCommitteeSize
  • yarn-project/stdlib/package.json - Added timetable export
  • yarn-project/epoch-cache/src/epoch_cache.ts - Fetch committee size, add getL1Constants()
  • yarn-project/p2p/src/config.ts - Added blockDurationMs to P2P config via Pick<SequencerConfig, 'blockDurationMs'> (uses shared mapping from @aztec/stdlib/config)
  • yarn-project/p2p/src/services/libp2p/libp2p_service.ts - Use dynamic topic params, pass blockDurationMs from config, added appSpecificWeight documentation
  • yarn-project/p2p/src/services/gossipsub/scoring.ts - Updated thresholds with documentation
  • yarn-project/p2p/src/services/peer-manager/peer_scoring.ts - Added alignment documentation
  • yarn-project/sequencer-client/src/config.ts - Import timetable constants and shared sequencer config mappings from stdlib
  • yarn-project/sequencer-client/src/sequencer/timetable.ts - Import from stdlib
  • yarn-project/archiver/src/factory.ts - Include targetCommitteeSize
  • Test files updated with targetCommitteeSize and getL1Constants mocks

Testing

  • All existing tests pass
  • Comprehensive unit tests for topic_score_params.ts (46 tests) verify:
    • calculateBlocksPerSlot - single block mode and MBPS mode
    • getDecayWindowSlots - frequency-based decay window selection
    • computeDecay - mathematical correctness (decays to ~1% over window)
    • computeConvergence - geometric series formula
    • computeThreshold - conservative threshold calculation
    • getExpectedMessagesPerSlot - per-topic expected rates
    • TopicScoreParamsFactory - shared value computation, per-topic params
    • Mathematical properties - decay, convergence, penalty calculations
    • Realistic network scenarios - checkpoint_proposal and checkpoint_attestation configs
    • P1/P2/P3 score balance - verifies max scores, non-contributor pruning, P3b limits

Documentation

Added comprehensive README at yarn-project/p2p/src/services/gossipsub/README.md covering:

  • Gossipsub scoring overview
  • P1-P4 parameters explained with Lodestar-style normalization
  • P1 slot-based configuration (caps at 1 hour)
  • P2 convergence-based configuration (fast decay)
  • P3 weight formula ensuring max penalty = -34 per topic
  • Score balance: P3 (-34) > P1+P2 (+33) for mesh pruning
  • Decay mechanics and multi-slot windows
  • Threshold calculations
  • Per-topic configuration rationale (tx topic has P1/P2/P3 disabled)
  • Tuning guidelines
  • Global score thresholds and their alignment with application scoring
  • Non-contributing peers analysis (why they're not disconnected, mesh pruning behavior)
  • Network outage analysis (what happens during connectivity loss, recovery timeline)
  • Application-level penalties (what triggers each severity level)
  • Score calculation examples (6 detailed scenarios from honest peer to attack recovery)

Fixes A-265

@mrzeszutko mrzeszutko changed the title Peer scoring feat: peer scoring Jan 29, 2026
@mrzeszutko mrzeszutko force-pushed the feature/peer-scoring branch 2 times, most recently from bfe83fc to 4a23ba3 Compare January 29, 2026 14:10
| `tx` | Unpredictable | N/A | P3/P3b disabled |
| `block_proposal` | N-1 | 3 slots | N = blocks per slot (MBPS mode) |
| `checkpoint_proposal` | 1 | 5 slots | One per slot |
| `checkpoint_attestation` | C (~48) | 2 slots | C = committee size |
Collaborator commented:
Could this expectation be too high? I'm just thinking if a percentage of validators are non-responsive then we would penalize honest peers through no fault of their own.

mrzeszutko (Author) replied:
These are just the numbers expressing the ideal scenario - more on penalization for under-delivery can be found here: https://github.com/AztecProtocol/aztec-packages/blob/feature/peer-scoring/yarn-project/p2p/src/services/gossipsub/README.md#how-p3-handles-under-delivery

mrzeszutko (Author) replied:
And the main impact for under-delivering peers is that they will be pruned from the mesh.

mrzeszutko (Author) replied:
And one more remark: to actually get penalized for under-delivery, the score for the topic needs to be below the threshold: https://github.com/AztecProtocol/aztec-packages/blob/feature/peer-scoring/yarn-project/p2p/src/services/gossipsub/README.md#threshold-calculation
Currently that is 30% of the expected score, calculated over 5 slots.

```typescript
      'Whether to run in fisherman mode: validates all proposals and attestations but does not broadcast attestations or participate in consensus.',
    ...booleanConfigHelper(false),
  },
  blockDurationMs: {
```
Collaborator commented:

I don't think this should be duplicated here. Can this env var mapping be moved from SequencerClientConfig to SequencerConfig (in stdlib) and then Pick<> into this config?

mrzeszutko (Author) replied:
Fixed

@mrzeszutko mrzeszutko force-pushed the feature/peer-scoring branch 3 times, most recently from fc701ef to 6ef0c21 Compare January 30, 2026 16:30
@mralj mralj force-pushed the feature/peer-scoring branch 2 times, most recently from 3f01630 to ebb9b04 Compare January 30, 2026 23:48
@mrzeszutko mrzeszutko force-pushed the feature/peer-scoring branch 2 times, most recently from 19ac2c7 to f7c6e8e Compare February 2, 2026 14:22
alexghr (Contributor) left a comment:
Looks good to me. I'll let @PhilWindle do the final approval.

```diff
@@ -100,6 +100,7 @@ export async function createArchiver(
   slotDuration,
   ethereumSlotDuration,
   proofSubmissionEpochs: Number(proofSubmissionEpochs),
+  targetCommitteeSize: config.aztecTargetCommitteeSize,
```
Contributor commented:
I think this should be read from the rollup. See above code block

```typescript
const [l1StartBlock, l1GenesisTime, proofSubmissionEpochs, genesisArchiveRoot, slashingProposerAddress] =
  await Promise.all([
    rollup.getL1StartBlock(),
    rollup.getL1GenesisTime(),
    rollup.getProofSubmissionEpochs(),
    rollup.getGenesisArchiveTreeRoot(),
    rollup.getSlashingProposerAddress(),
  ] as const);
```

mrzeszutko (Author) replied:
good catch! fixed

@mrzeszutko mrzeszutko force-pushed the feature/peer-scoring branch 2 times, most recently from a3b04c4 to 1041ec6 Compare February 4, 2026 08:44
@mrzeszutko mrzeszutko requested a review from PhilWindle February 5, 2026 08:12
@PhilWindle PhilWindle added this pull request to the merge queue Feb 9, 2026
github-merge-queue bot pushed a commit that referenced this pull request Feb 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 9, 2026
@mrzeszutko mrzeszutko force-pushed the feature/peer-scoring branch from 1041ec6 to 44b9068 Compare February 9, 2026 15:08
AztecBot (Collaborator) commented Feb 9, 2026

Flakey Tests

🤖 says: This CI run detected 2 tests that failed, but were tolerated due to a .test_patterns.yml entry.

- FLAKED (http://ci.aztec-labs.com/2cb9549eb0736128): yarn-project/scripts/run_test.sh p2p/src/client/test/p2p_client.integration_message_propagation.test.ts (22s) (code: 1) group:e2e-p2p-epoch-flakes
- FLAKED (http://ci.aztec-labs.com/cacd06a417d52518): yarn-project/scripts/run_test.sh ethereum/src/test/tx_delayer.test.ts (103s) (code: 1)

@mrzeszutko mrzeszutko added this pull request to the merge queue Feb 9, 2026
Merged via the queue into next with commit 084035e Feb 9, 2026
19 checks passed
@mrzeszutko mrzeszutko deleted the feature/peer-scoring branch February 9, 2026 16:03
ludamad added a commit that referenced this pull request Feb 23, 2026
Slide 19 (§4 insights · PR correlation): two-column layout showing which
PRs caused each weekly flake spike and which fixes produced each recovery:

Spikes:
- W02 (2,647 flakes): Santiago refactors #19532/#19509/#19564 exposed
  timing races across p2p/epoch simultaneously
- W04 (935 flakes): PhilWindle #19982 added cross-chain mbps tests
  without pre-deflaking — valid_epoch_pruned_slash 0→346 events
- W06 (850 flakes): three high-risk PRs merged same day (#20047 peer
  scoring, #20241 max checkpoints→32, #20257 hash constants)

Fixes:
- W03 recovery: Santiago #19914 — checkpointed chain tip for PXE
  (root fix; PXE was using latest not checkpointed block)
- W05 recovery: Santiago #20088 slasher multi-block fix + #20140
  discv5 deflake + GCP step-down (−6 testbed namespaces)
- W07 improvement: Santiago #20351 mbps fix (p2p_client 311→0),
  #20462 remove hardcoded 10s timeout, ludamad #20613 CI parallelism

Also: correct three factual errors spotted during full review —
- Summary: next P50 is growing (+10% in 3 weeks), not stable
- Flake trend W07 note: e2e-p2p-epoch-flakes dropped 373×, not just
  "251 flakes lowest since December"
- Gaps slide: replaced stale "ci_phases broken" card with GCP egress
  costs gap (bc→awk fix is deployed; egress attribution is the gap now)