
Conversation

@julienrbrt
Member

@julienrbrt julienrbrt commented Dec 9, 2025

Overview

ref: #2906

Add a grace period for missed forced-inclusion txs. The grace period starts at 1 epoch and can grow depending on chain congestion (block size). It becomes elastic when the chain is congested, giving the sequencer time to process all txs before being considered malicious.

@github-actions
Contributor

github-actions bot commented Dec 9, 2025

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build | Format | Lint | Breaking | Updated (UTC)
✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Dec 11, 2025, 1:09 PM

@claude
Contributor

claude bot commented Dec 9, 2025

Claude finished @julienrbrt's task in 3m 57s


Comprehensive Code Review: Grace Period for Forced Inclusion

I've performed a thorough review of this PR implementing a configurable grace period for forced inclusion transactions. Here's my detailed feedback:


🎯 Summary

This PR adds a sophisticated grace period mechanism with dynamic adjustment based on chain congestion. The implementation is well-designed and addresses a real operational need. However, there are several important issues to address, particularly around test coverage and the dynamic adjustment implementation.


🔴 Critical Issues

1. Insufficient Test Coverage for Grace Period Logic ⚠️

Location: block/internal/syncing/syncer_forced_inclusion_test.go

Gemini's review correctly identified a critical gap: None of the existing tests actually validate the grace period functionality with non-zero values. The test TestVerifyForcedInclusionTxs_SmoothingExceedsEpoch was added but it doesn't test the grace period behavior—it only validates smoothing within an epoch.

Missing test scenarios:

// MISSING: Test with grace period = 1
// - Forced tx from epoch ending at 100
// - Grace boundary = 100 + (1 * epochSize)
// - Verify tx can be included at DAHeight = 101..graceBoundary (should pass)
// - Verify tx NOT included at DAHeight > graceBoundary (should fail as malicious)

// MISSING: Test dynamic grace period adjustment
// - Verify high fullness increases grace period
// - Verify low fullness decreases grace period
// - Verify grace period stays within min/max bounds

Recommendation: Add comprehensive tests specifically for grace period enforcement:

  • Test transactions within grace period (should succeed)
  • Test transactions past grace boundary (should fail as malicious)
  • Test edge case: currentDAHeight == graceBoundary (should succeed per line 888)
  • Test edge case: currentDAHeight == graceBoundary + 1 (should fail)
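As a concrete starting point, the boundary logic these tests should cover can be exercised in isolation. This is a minimal, self-contained sketch with illustrative names — `graceBoundary` and `isMalicious` are stand-ins, not the PR's actual helpers:

```go
package main

// graceBoundary mirrors the calculation described in the PR:
// boundary = epoch end + (grace period in epochs * epoch size).
func graceBoundary(epochEnd, gracePeriod, epochSize uint64) uint64 {
	return epochEnd + gracePeriod*epochSize
}

// isMalicious mirrors the strict check: a pending forced tx is flagged only
// when the current DA height moves strictly past the grace boundary.
func isMalicious(currentDAHeight, epochEnd, gracePeriod, epochSize uint64) bool {
	return currentDAHeight > graceBoundary(epochEnd, gracePeriod, epochSize)
}

func main() {
	// Epoch ends at 100, grace period = 1 epoch of size 10, so boundary = 110.
	const epochEnd, grace, epochSize = 100, 1, 10

	if isMalicious(105, epochEnd, grace, epochSize) {
		panic("height 105 should be inside the grace period")
	}
	if isMalicious(110, epochEnd, grace, epochSize) {
		panic("height == boundary should still pass (strict >)")
	}
	if !isMalicious(111, epochEnd, grace, epochSize) {
		panic("height 111 should be flagged as malicious")
	}
}
```

Table-driven tests over these four cases (inside, at boundary, one past, far past) would close the gap Gemini flagged.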

2. Potential Integer Overflow in Grace Boundary Calculation ⚠️

Location: block/internal/syncing/syncer.go:886

graceBoundary := pending.EpochEnd + (effectiveGracePeriod * s.genesis.DAEpochForcedInclusion)

If effectiveGracePeriod or DAEpochForcedInclusion are very large, this could overflow uint64. While unlikely in practice, the calculation should use checked arithmetic or document the assumption.

Recommendation:

// Add overflow checks or document maximum values. Guard the multiplication
// first; the subsequent addition can also overflow, so saturate in both cases.
var graceBoundary uint64
if effectiveGracePeriod > 0 && s.genesis.DAEpochForcedInclusion > math.MaxUint64/effectiveGracePeriod {
    s.logger.Error().Msg("grace period calculation would overflow")
    graceBoundary = math.MaxUint64
} else if offset := effectiveGracePeriod * s.genesis.DAEpochForcedInclusion; pending.EpochEnd > math.MaxUint64-offset {
    graceBoundary = math.MaxUint64
} else {
    graceBoundary = pending.EpochEnd + offset
}

🟡 Architecture & Design Concerns

3. Dynamic Grace Period May Be Too Complex 🤔

Location: block/internal/syncing/syncer.go:760-809

The dynamic grace period adjustment adds significant complexity:

  • EMA tracking of block fullness
  • Multiplier adjustment based on congestion
  • Rounding and clamping logic

Concerns:

  1. Attack Surface: A malicious sequencer could artificially inflate block fullness to extend grace periods and delay detection
  2. Unpredictability: Dynamic adjustment makes it harder to reason about when censorship will be detected
  3. Configuration Burden: Multiple tuning parameters (min/max multipliers, threshold, adjustment rate) that operators must understand

From ADR-019:

"Grace period is dynamically adjusted based on block fullness to balance censorship detection with operational reliability"

But the ADR doesn't discuss the security implications of this dynamic behavior.

Recommendations:

  1. Document Security Trade-offs: Add section to ADR discussing attack vectors
  2. Consider Simpler Alternative: Use a fixed grace period initially, add dynamic adjustment in a follow-up PR if needed
  3. Add Metrics: Expose grace_period_multiplier and block_fullness_ema as metrics for observability
  4. Rate Limiting: Consider capping how quickly the multiplier can change to prevent sudden spikes

4. Block Fullness Calculation is Heuristic and May Be Inaccurate

Location: block/internal/syncing/syncer.go:741-758

const maxDataSize = seqcommon.AbsoluteMaxBlobSize // 2MB

func (s *Syncer) calculateBlockFullness(data *types.Data) float64 {
    dataSize := uint64(0)
    for _, tx := range data.Txs {
        dataSize += uint64(len(tx))
    }
    return float64(dataSize) / float64(maxDataSize)
}

Issues:

  1. Uses constant AbsoluteMaxBlobSize (2MB) which may not represent actual block capacity
  2. Only considers data size, not transaction count or gas limits
  3. Comment says "This is a heuristic - actual limits may vary by execution layer"

This means the grace period adjustment may not accurately reflect chain congestion.

Recommendations:

  1. Use actual block gas limit or execution layer capacity if available
  2. Document the heuristic nature prominently
  3. Consider making this pluggable so different chains can provide accurate fullness metrics

🟢 Code Quality & Go Best Practices

5. Excellent Use of Atomic Pointers for Concurrent Access

Location: block/internal/syncing/syncer.go:112-114

gracePeriodMultiplier *atomic.Pointer[float64]
blockFullnessEMA      *atomic.Pointer[float64]

Using atomic.Pointer[float64] is the correct approach for lock-free reads/writes. Well done!


6. Configuration Design is Clean

Location: block/internal/syncing/syncer.go:33-72

The internal configuration struct forcedInclusionGracePeriodConfig properly encapsulates tuning parameters. Good separation of concerns.

Minor suggestion: Consider exposing some parameters via config file for operator tuning:

// In genesis.Genesis or config.Config
type ForcedInclusionConfig struct {
    BasePeriod uint64 `json:"base_period"` // Expose for operator tuning
}

🔵 Documentation & Testing

7. ADR Updates are Comprehensive

Location: docs/adr/adr-019-forced-inclusion-mechanism.md

The ADR updates are thorough and well-structured:

  • Clear explanation of grace period concept
  • Multiple worked examples
  • Configuration recommendations

Minor improvements:

  1. Add sequence diagram showing grace period verification flow
  2. Document performance implications of pending tx tracking
  3. Add migration guide for existing networks

8. Good Test Coverage for Dynamic Adjustment

Location: block/internal/syncing/syncer_forced_inclusion_test.go:27-342

Excellent unit tests for the dynamic adjustment logic:

  • TestCalculateBlockFullness_* - Tests fullness calculation
  • TestUpdateDynamicGracePeriod_* - Tests adjustment algorithm
  • TestGetEffectiveGracePeriod_* - Tests period calculation
  • Integration tests for high/low congestion scenarios

9. Missing Edge Case: Zero Grace Period

The code doesn't explicitly test or document behavior when basePeriod = 0. From line 886:

graceBoundary := pending.EpochEnd + (effectiveGracePeriod * s.genesis.DAEpochForcedInclusion)

With basePeriod = 0, this becomes graceBoundary = EpochEnd, meaning strict enforcement at epoch boundary. This should be:

  1. Explicitly tested
  2. Documented as valid configuration for high-security deployments
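The zero case reduces to plain arithmetic and is cheap to pin down in a test. A tiny sketch, again with illustrative names rather than the PR's actual API:

```go
package main

import "fmt"

// graceBoundaryZero mirrors the boundary formula: with a zero grace period
// the boundary collapses to the epoch end itself, i.e. strict enforcement
// at the epoch boundary. Names are illustrative, not the PR's helpers.
func graceBoundaryZero(epochEnd, gracePeriod, epochSize uint64) uint64 {
	return epochEnd + gracePeriod*epochSize
}

func main() {
	fmt.Println(graceBoundaryZero(100, 0, 10)) // prints 100: identical to EpochEnd
}
```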

🟣 Performance & Efficiency

10. Efficient Metrics Updates

Location: block/internal/syncing/syncer.go:907, 920

s.metrics.ForcedInclusionTxsInGracePeriod.Set(float64(txsInGracePeriod))
s.metrics.ForcedInclusionTxsMalicious.Add(float64(len(maliciousTxs)))

Good use of gauge for current state and counter for cumulative malicious txs.


11. Potential Performance Issue: Pending Tx Iteration

Location: block/internal/syncing/syncer.go:881-905, 910-916

The code iterates through stillPending twice:

  1. First to check grace boundaries (lines 881-905)
  2. Then to repopulate the sync.Map (lines 910-916)

For large pending sets, this could be inefficient. Consider single-pass:

// Single-pass version
s.pendingForcedInclusionTxs.Range(func(key, value any) bool {
    pending := value.(pendingForcedInclusionTx)
    graceBoundary := pending.EpochEnd + (effectiveGracePeriod * s.genesis.DAEpochForcedInclusion)
    
    if currentState.DAHeight > graceBoundary {
        maliciousTxs = append(maliciousTxs, pending)
        s.pendingForcedInclusionTxs.Delete(key)
    } else if currentState.DAHeight > pending.EpochEnd {
        txsInGracePeriod++
    }
    return true
})

🔒 Security Considerations

12. Grace Period Attack Vector: Malicious Sequencer Could Exploit Dynamic Adjustment

Scenario:

  1. Malicious sequencer fills blocks artificially to increase fullness
  2. Block fullness EMA rises above threshold
  3. Grace period multiplier increases
  4. Sequencer can now delay forced tx inclusion longer without detection

Mitigation:

  • Document this risk in ADR
  • Consider maximum absolute grace period (e.g., never > 10 epochs)
  • Alert when grace period exceeds expected values
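The absolute-cap mitigation could look like this minimal sketch; `maxGraceEpochs` is an assumed operator-set parameter, not something the PR defines:

```go
package main

import "fmt"

// capGracePeriod saturates the dynamically adjusted grace period at a hard
// ceiling, so a congestion-inflated multiplier can never extend detection
// delay unboundedly. maxGraceEpochs is a hypothetical config value.
func capGracePeriod(effective, maxGraceEpochs uint64) uint64 {
	if effective > maxGraceEpochs {
		return maxGraceEpochs
	}
	return effective
}

func main() {
	fmt.Println(capGracePeriod(25, 10)) // inflated value is capped: prints 10
	fmt.Println(capGracePeriod(3, 10))  // normal values pass through: prints 3
}
```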

13. Good: Malicious Detection is Strict

Location: block/internal/syncing/syncer.go:888

if currentState.DAHeight > graceBoundary {  // Strict >

Correctly uses strict inequality—transactions must be included by the grace boundary, not after.


📊 Metrics Implementation

14. Well-Designed Metrics

Location: block/internal/common/metrics.go:70-72, 189-202

ForcedInclusionTxsInGracePeriod metrics.Gauge   // Current state
ForcedInclusionTxsMalicious     metrics.Counter // Cumulative total

Good choice of metric types:

  • Gauge for current pending count - allows monitoring grace period health
  • Counter for malicious txs - tracks cumulative censorship attempts

Suggestion: Add histogram for grace period multiplier distribution:

ForcedInclusionGracePeriodMultiplier metrics.Histogram

🧪 Test Review

15. E2E Test Change is Minimal ⚠️

Location: test/docker-e2e/resiliency_test.go:421

-		}, 10*time.Second, time.Second, "post-recovery transaction %d should be processed", i+1)
+		}, 15*time.Second, time.Second, "post-recovery transaction %d should be processed", i+1)

Timeout increased from 10s to 15s. This might indicate:

  1. Grace period causing slower recovery
  2. Test flakiness

Recommendation: Document why timeout was increased in commit message or comment.


📋 Action Items

Must Address Before Merge:

  1. ⚠️ Add comprehensive grace period tests (address Gemini's concern)
  2. ⚠️ Document security implications of dynamic adjustment in ADR
  3. ⚠️ Add overflow protection to grace boundary calculation
  4. ⚠️ Test zero grace period edge case

Should Address:

  1. 🔧 Expose grace period multiplier and EMA as Prometheus metrics
  2. 🔧 Optimize pending tx iteration (single-pass instead of double)
  3. 🔧 Document block fullness heuristic limitations

Nice to Have:

  1. 📚 Add sequence diagram to ADR
  2. 📚 Document migration path for existing networks
  3. 🧪 Add benchmark tests for pending tx processing

Strengths

  1. Well-thought-out design addressing real operational needs
  2. Comprehensive ADR with clear examples
  3. Good use of Go concurrency patterns (atomic pointers, sync.Map)
  4. Thorough unit tests for dynamic adjustment logic
  5. Clean separation of concerns with internal configuration struct
  6. Proper metrics instrumentation

🎓 Learning & Best Practices

This PR demonstrates several Go best practices worth highlighting:

Atomic Operations:

gracePeriodMultiplier := &atomic.Pointer[float64]{}
gracePeriodMultiplier.Store(&initialMultiplier)

Using atomic.Pointer[T] (Go 1.19+) is cleaner than atomic.Value.

EMA Calculation:

newEMA := alpha*blockFullness + (1-alpha)*currentEMA

Proper exponential moving average with smoothing factor.
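Sketched as a standalone function (with an assumed `alpha = 0.2`, not necessarily the PR's default), the update shows the expected convergence behavior under sustained congestion:

```go
package main

import "fmt"

// updateEMA applies one step of the exponential moving average quoted above:
// a higher alpha weights recent observations more heavily.
func updateEMA(current, observation, alpha float64) float64 {
	return alpha*observation + (1-alpha)*current
}

func main() {
	ema := 0.0
	for i := 0; i < 20; i++ {
		ema = updateEMA(ema, 0.9, 0.2) // 20 consecutive ~90%-full blocks
	}
	fmt.Printf("%.3f\n", ema) // converges toward 0.9
}
```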

Sync.Map for Concurrent Access:

pendingForcedInclusionTxs sync.Map // map[string]pendingForcedInclusionTx

Correct choice for concurrent read/write with unknown keys.


📝 Verdict

Overall Assessment: This is a well-designed feature that addresses an important operational concern. The code quality is high, and the ADR documentation is comprehensive. However, the lack of test coverage for the actual grace period enforcement is a critical gap that must be addressed before merge.

Recommended Actions:

  1. ✅ Add the missing grace period enforcement tests
  2. ✅ Address security documentation in ADR
  3. ✅ Fix potential overflow issue
  4. 🤔 Consider simplifying by removing dynamic adjustment (or moving to follow-up PR)

Once these issues are addressed, this will be a solid addition to the codebase.


🔗 Related Files Reviewed

  • block/internal/syncing/syncer.go (+182/-21) - Core implementation
  • ⚠️ block/internal/syncing/syncer_forced_inclusion_test.go (+435/-55) - Tests (needs grace period tests)
  • block/internal/common/metrics.go (+23/-0) - Metrics
  • docs/adr/adr-019-forced-inclusion-mechanism.md (+162/-109) - Documentation
  • test/docker-e2e/resiliency_test.go (+2/-1) - E2E test

@gemini-code-assist
Contributor

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant improvement to the forced inclusion mechanism by implementing a configurable grace period. This grace period allows for a delay in marking a sequencer as malicious if forced inclusion transactions are not immediately included, providing crucial resilience against temporary Data Availability (DA) layer outages or network congestion. The change ensures that the system can tolerate transient issues without compromising censorship resistance, striking a better balance between security and operational reliability.

Highlights

  • Grace Period Implementation: Introduced a configurable "ForcedInclusionGracePeriod" to allow forced inclusion transactions to be included across multiple epochs before being marked as malicious.
  • Enhanced Malicious Detection: The system now flags a sequencer as malicious only if forced inclusion transactions are not included past their grace boundary, providing tolerance for temporary DA unavailability.
  • New Metrics: Added "ForcedInclusionTxsInGracePeriod" and "ForcedInclusionTxsMalicious" metrics to monitor the state of forced inclusion transactions.
  • Updated Documentation: The ADR-019 (Forced Inclusion Mechanism) has been comprehensively updated to detail the grace period, its calculation, configuration, benefits, and security implications.
  • Genesis Configuration: The "Genesis" configuration now includes "ForcedInclusionGracePeriod" with a default value of 1, allowing networks to customize this critical parameter.
  • Improved Test Coverage: A new test case ("TestVerifyForcedInclusionTxs_SmoothingExceedsEpoch") was added to specifically validate the grace period logic, alongside updates to existing tests.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a grace period for forced transaction inclusion, which is a great enhancement for network resilience against temporary DA layer issues. The implementation logic in the syncer, along with the new metrics and the extensive documentation updates in the ADR, are well-executed. My primary concern is the lack of test coverage for the new grace period functionality. The existing and newly added tests only cover the scenario where the grace period is zero, failing to validate the core behavior of the feature. I've added a comment with a suggestion to improve test coverage.

@codecov

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.53%. Comparing base (09b7efd) to head (02f8038).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2915      +/-   ##
==========================================
- Coverage   65.94%   63.53%   -2.41%     
==========================================
  Files          87       90       +3     
  Lines        7990     8584     +594     
==========================================
+ Hits         5269     5454     +185     
- Misses       2147     2547     +400     
- Partials      574      583       +9     
Flag | Coverage Δ
combined | 63.53% <100.00%> (-2.41%) ⬇️

Flags with carried forward coverage won't be shown.


@julienrbrt julienrbrt marked this pull request as draft December 10, 2025 09:34
@julienrbrt julienrbrt marked this pull request as ready for review December 11, 2025 13:08
@julienrbrt julienrbrt merged commit e867b3f into main Dec 11, 2025
28 of 31 checks passed
@julienrbrt julienrbrt deleted the julien/validate-loosen branch December 11, 2025 20:24
@github-actions
Contributor

PR Preview Action v1.6.3
Preview removed because the pull request was closed.
2025-12-11 20:25 UTC

alpe added a commit that referenced this pull request Dec 15, 2025
* main:
  chore: execute goimports to format the code (#2924)
  refactor(block)!: remove GetLastState from components (#2923)
  feat(syncing): add grace period for missing force txs inclusion (#2915)
  chore: minor improvement for docs (#2918)
  feat: DA Client remove interface part 2,  add client for celestia blob api   (#2909)
  chore: update rust deps (#2917)
  feat(sequencers/based): add based batch time (#2911)
  build(deps): Bump golangci/golangci-lint-action from 9.1.0 to 9.2.0 (#2914)
  refactor(sequencers): implement batch position persistance (#2908)