Conversation

@lklimek
Collaborator

@lklimek lklimek commented Feb 19, 2025

Issue being fixed or feature implemented

We need state sync to work properly. During testing, several issues arose, such as:

  • deadlock
  • on genesis, state sync never times out
  • light client does not verify data correctly due to a bug in QuorumVerify

What was done?

  • fix: deadlock in state sync
  • fix: QuorumVerify call has invalid order of operations
  • e2e: enable p2p state sync in rotate test
  • on statesync timeout, fall back to block sync
  • config: removed unused trust-height, trust-hash, trust-period from [statesync] section
  • config: added retries setting to [statesync] section
  • added safe integer type casting code
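
The retry-then-fall-back behaviour listed above can be pictured with a minimal sketch. It is not the code from this PR: the function `syncWithFallback`, the `syncMode` type, and the two callback parameters are hypothetical names chosen for illustration, and treating the retries setting as "extra attempts after the first" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSnapshots reuses the error name quoted in the review below;
// everything else in this file is a hypothetical illustration.
var errNoSnapshots = errors.New("no suitable snapshots found")

// syncMode reports which mechanism completed the sync.
type syncMode string

const (
	modeStateSync syncMode = "statesync"
	modeBlockSync syncMode = "blocksync"
)

// syncWithFallback tries state sync once plus `retries` further attempts
// (an assumed reading of the retries setting) and then falls back to
// block sync instead of waiting forever.
func syncWithFallback(retries int, stateSync func(attempt int) error, blockSync func() error) (syncMode, error) {
	var lastErr error
	for attempt := 0; attempt <= retries; attempt++ {
		if lastErr = stateSync(attempt); lastErr == nil {
			return modeStateSync, nil
		}
	}
	if err := blockSync(); err != nil {
		return "", fmt.Errorf("state sync failed (%v); block sync fallback failed: %w", lastErr, err)
	}
	return modeBlockSync, nil
}

func main() {
	calls := 0
	mode, err := syncWithFallback(2,
		func(int) error { calls++; return errNoSnapshots }, // state sync always fails here
		func() error { return nil },                        // block sync succeeds
	)
	fmt.Println(mode, err, calls)
}
```

With two retries, the sketch attempts state sync three times before switching to block sync.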

How Has This Been Tested?

Added state sync to the p2p tests; also tested on Dash Platform mainnet as part of the platform state sync effort.

Breaking Changes

Non-breaking changes in config file

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • New Features

    • Enhanced state synchronization with a built-in retry mechanism and improved snapshot handling.
    • Added a query to reveal the current consensus height for better visibility of network status.
    • Introduced a mock implementation for the ConsensusStateProvider interface to facilitate testing.
    • Updated state synchronization configurations to utilize a "p2p" method for specific nodes.
  • Refactor

    • Streamlined configuration settings by simplifying parameters and updating default connection options.
    • Adjusted logging verbosity across components for more granular and informative system monitoring.
    • Updated the handling of chunk management and error reporting within the synchronization process.
    • Transitioned to a more unified approach in initializing light clients across state providers.
    • Improved integer overflow handling and type-safe conversions in math utilities.
    • Enhanced type safety and validation in quorum-related data handling.
  • Chores

    • Upgraded several external dependencies to enhance overall performance and stability.
  • Tests

    • Expanded test coverage for synchronization, concurrency controls, and validator functionality.
    • Introduced new tests for the LockGuard and RLockGuard functions to ensure robust mutex handling.
    • Added a new test to validate hash generation from a given validator threshold public key and quorum hash.
    • Enhanced synchronization tests to support new retry logic and consensus state integration.

@coderabbitai
Contributor

coderabbitai bot commented Feb 19, 2025

Walkthrough

This pull request implements a series of updates across multiple modules. Trust parameters in state synchronization have been removed and replaced with a retry mechanism. A new ConsensusStateProvider interface (with a corresponding mock) is introduced and integrated into the reactor, while various logging levels have been adjusted for finer-grained tracing. Additional changes include enhanced chunk management, mutex guard utilities, safe math conversion functions with overflow checks, and several test updates. Minor documentation and configuration corrections are also applied throughout the codebase.

Changes

File(s) | Change Summary
config/config.go, config/toml.go | Removed trust parameters; added Retries field/parameter; updated default configuration and validation for state sync.
internal/statesync/reactor.go, internal/statesync/mocks/consensusstateprovider.go | Introduced new ConsensusStateProvider interface, updated reactor signature, and provided a mock implementation.
internal/blocksync/reactor.go, internal/evidence/reactor.go, internal/mempool/reactor.go, internal/statesync/peer.go | Changed logging levels (Debug/Info → Trace) to provide more granular output during peer and event processing.
internal/consensus/state.go, internal/consensus/state_data.go | Added GetCurrentHeight method to State; corrected a comment typo.
internal/libs/sync/mutexguard.go, internal/libs/sync/mutexguard_test.go | Added LockGuard and RLockGuard functions with associated interfaces and tests for safe mutex handling.
internal/statesync/chunks.go, internal/statesync/chunks_test.go | Enhanced chunk management with new filename handling, pending checks, helper methods, and improved error messages in tests.
internal/p2p/transport_mconn.go, internal/statesync/stateprovider.go | Updated default port value (26657 → 26656) and modified RPC/P2P state provider initialization (removing trust parameters).
internal/statesync/syncer.go, internal/statesync/syncer_test.go | Modified SyncAny and requestChunk signatures to include retry logic and an initialHeight parameter; adjusted error handling and logging.
libs/math/safemath.go, libs/math/safemath_test.go | Added safe conversion functions, new overflow error variable, and corresponding tests for robust integer conversion.
test/e2e/networks/rotate.toml, test/e2e/pkg/mockcoreserver/core_server.go, test/e2e/pkg/mockcoreserver/methods.go, test/e2e/runner/setup.go, test/e2e/runner/start.go, types/validator_test.go | Updated state sync settings from commented-out to active "p2p"; improved height conversion and unmarshalling in mocks; removed dynamic config updates; cleaned up variable scope in network start; and added a validator set hash test.

Possibly related PRs

Suggested reviewers

  • QuantumExplorer
  • shumkov

Poem

Hopping through the lines of code, I cheer,
Trust is replaced with retries, now crystal clear.
New interfaces and tests make my heart race,
Logs now trace each step with elegant grace.
In the coding burrow, improvements gleam bright—
A bunny’s delight in every change tonight!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (1.64.8)

Error: you are using a configuration file for golangci-lint v2 with golangci-lint v1: please use golangci-lint v2
Failed executing command with error: you are using a configuration file for golangci-lint v2 with golangci-lint v1: please use golangci-lint v2


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 723556e and d5d2adc.

📒 Files selected for processing (1)
  • internal/statesync/reactor.go (22 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/statesync/reactor.go
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: tests (04)
  • GitHub Check: tests (01)
  • GitHub Check: tests (02)
  • GitHub Check: tests (00)
  • GitHub Check: tests (03)
  • GitHub Check: Super linter
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: Build (amd64, linux)

@lklimek lklimek marked this pull request as ready for review February 19, 2025 13:22
@lklimek lklimek changed the title from "Feat/statesync improvements" to "fix(statesync): statesync is unstable and doesn't time out" Feb 19, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (12)
libs/math/safemath.go (2)

164-199: Consider using type constraints for better readability.

The implementation is good, but readability could be improved by using a more descriptive type constraint name.

-type Integer interface {
+type Number interface {
   ~int | ~int8 | ~int16 | ~int32 | ~int64 | ~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64
 }

-func MustConvert[FROM Integer, TO Integer](a FROM) TO {
+func MustConvert[FROM Number, TO Number](a FROM) TO {

263-312: Improve panic messages in Max/Min functions.

The implementation is good, but the panic messages could be more descriptive.

-       panic("unsupported type")
+       panic(fmt.Sprintf("unsupported type %T", max))
libs/math/safemath_test.go (1)

89-207: LGTM! Comprehensive test coverage.

The test cases are thorough and well-structured, covering various integer types and edge cases. Consider adding a helper function to reduce the nested switch statements and improve maintainability.

+func convertValue[F Integer, T Integer](from F, want T) (interface{}, error) {
+    return SafeConvert[F, T](from)
+}

 func TestSafeConvert(t *testing.T) {
     // ... test cases ...
     for i, tc := range testCases {
         testName := fmt.Sprintf("%d:%T(%d)-%T(%d)", i, tc.from, tc.from, tc.want, tc.want)
         t.Run(testName, func(t *testing.T) {
-            var result interface{}
-            var err error
-            switch from := tc.from.(type) {
-            case int:
-                switch tc.want.(type) {
-                case int64:
-                    result, err = SafeConvert[int, int64](from)
+            result, err := convertValue(tc.from, tc.want)
types/validator_test.go (5)

119-126: Consider parameterizing test vectors.

The test uses hardcoded test vectors. Consider parameterizing them into a table-driven test to make it easier to add more test cases in the future.

-func TestValidatorSetHashVectors(t *testing.T) {
-	thresholdPublicKey, err := base64.RawStdEncoding.DecodeString("gw5F5F5kFNnWFUc8woFOaxccUI+cd+ixaSS3RZT2HJlWpvoWM16YRn6sjYvbdtGH")
-	require.NoError(t, err)
-
-	quorumHash, err := hex.DecodeString("703ee5bfc78765cc9e151d8dd84e30e196ababa83ac6cbdee31a88a46bba81b9")
-	require.NoError(t, err)
-
-	expected := "81742F95E99EAE96ABC727FE792CECB4996205DE6BFC88AFEE1F60B96BC648B2"
+func TestValidatorSetHashVectors(t *testing.T) {
+	testCases := []struct {
+		name               string
+		thresholdPublicKey string
+		quorumHash        string
+		expectedHash      string
+	}{
+		{
+			name:               "valid hash generation",
+			thresholdPublicKey: "gw5F5F5kFNnWFUc8woFOaxccUI+cd+ixaSS3RZT2HJlWpvoWM16YRn6sjYvbdtGH",
+			quorumHash:        "703ee5bfc78765cc9e151d8dd84e30e196ababa83ac6cbdee31a88a46bba81b9",
+			expectedHash:      "81742F95E99EAE96ABC727FE792CECB4996205DE6BFC88AFEE1F60B96BC648B2",
+		},
+	}
+
+	for _, tc := range testCases {
+		t.Run(tc.name, func(t *testing.T) {
+			thresholdPublicKey, err := base64.RawStdEncoding.DecodeString(tc.thresholdPublicKey)
+			require.NoError(t, err)
+
+			quorumHash, err := hex.DecodeString(tc.quorumHash)
+			require.NoError(t, err)

128-134: Add test documentation and error cases.

The test function lacks documentation and error test cases. Consider:

  1. Adding a function comment explaining the test's purpose
  2. Including error cases for invalid inputs
+// TestValidatorSetHashVectors verifies that the ValidatorSet hash is correctly computed
+// from a given threshold public key and quorum hash.
 func TestValidatorSetHashVectors(t *testing.T) {
     // ... test cases setup ...
+    // Add error cases
+    t.Run("invalid public key", func(t *testing.T) {
+        valset := ValidatorSet{
+            ThresholdPublicKey: bls12381.PubKey([]byte("invalid")),
+            QuorumHash:         quorumHash,
+        }
+        assert.Panics(t, func() { valset.Hash() })
+    })

292-306: Improve error handling in Sync method.

The error handling for errNoSnapshots is good, but consider adding metrics or telemetry to track fallback occurrences.

 if errors.Is(err, errNoSnapshots) && r.postSyncHook != nil {
     r.logger.Warn("no snapshots available; falling back to block sync", "err", err)
+    if r.metrics != nil {
+        r.metrics.StateSyncFallbacks.Add(1)
+    }

1068-1104: LGTM! Enhanced snapshot safety checks.

Good addition of safety checks to ensure snapshots are only accepted when the next block is finalized.

This change improves system reliability by preventing potential state inconsistencies during snapshot synchronization.


877-877: Use safe integer conversion.

Good use of tmmath.MustConvertInt64 for safe integer conversion, but consider handling the potential panic.

-cp, err := r.stateStore.LoadConsensusParams(tmmath.MustConvertInt64(msg.Height))
+height, err := tmmath.SafeConvertInt64(msg.Height)
+if err != nil {
+    r.logger.Error("invalid height in params request",
+        "height", msg.Height,
+        "error", err)
+    return nil
+}
+cp, err := r.stateStore.LoadConsensusParams(height)
internal/libs/sync/mutexguard_test.go (4)

14-16: Consider increasing the timeout duration for CI environments.

While 1 second is reasonable for local testing, it might be too short for CI environments or slower machines. Consider making it configurable via an environment variable.

-const (
-	timeout = 1 * time.Second
-)
+// A const cannot be initialized from a function call, so this becomes a var.
+var timeout = getTestTimeout()
+
+func getTestTimeout() time.Duration {
+	if t := os.Getenv("TEST_TIMEOUT"); t != "" {
+		if d, err := time.ParseDuration(t); err == nil {
+			return d
+		}
+	}
+	return 1 * time.Second
+}

18-46: Consider adding edge cases to strengthen the test.

The test covers basic unlock scenarios well. Consider adding these edge cases:

  1. Panic recovery when unlocking an already unlocked mutex
  2. Concurrent unlock attempts from multiple goroutines
 func TestLockGuardMultipleUnlocks(t *testing.T) {
+	t.Run("panic recovery", func(t *testing.T) {
+		var mtx deadlock.Mutex
+		mtx.Lock()
+		mtx.Unlock()
+		assert.NotPanics(t, func() {
+			unlock := tmsync.LockGuard(&mtx)
+			unlock()
+		})
+	})
+
+	t.Run("concurrent unlocks", func(t *testing.T) {
+		var mtx deadlock.Mutex
+		unlock := tmsync.LockGuard(&mtx)
+		var wg sync.WaitGroup
+		for i := 0; i < 10; i++ {
+			wg.Add(1)
+			go func() {
+				defer wg.Done()
+				assert.False(t, unlock())
+			}()
+		}
+		wg.Wait()
+	})

70-105: Verify concurrent read access.

The test should verify that multiple reads can occur simultaneously. Consider adding a channel-based synchronization to confirm this behavior.

 func TestRLockGuard(t *testing.T) {
 	var mtx deadlock.RWMutex
 	var counter int
 	var wg sync.WaitGroup
+	reading := make(chan struct{})
+	done := make(chan struct{})
 
 	read := func() {
 		defer wg.Done()
 		unlock := tmsync.RLockGuard(&mtx)
 		defer unlock()
+		reading <- struct{}{} // Signal that we're reading
+		<-done               // Wait for verification
 		_ = counter
 	}
+
+	// Verify concurrent reads
+	go func() {
+		// Wait for at least 2 readers
+		<-reading
+		<-reading
+		close(done)
+	}()

107-120: Make the helper more test-friendly.

Instead of panicking, consider returning an error to allow for better test control and error handling.

-func waitFor(f func()) {
+func waitFor(f func()) error {
 	done := make(chan struct{})
 	go func() {
 		f()
 		close(done)
 	}()
 
 	select {
 	case <-time.After(timeout):
-		panic("Test timed out")
+		return fmt.Errorf("test timed out after %v", timeout)
 	case <-done:
+		return nil
 	}
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6340e50 and fab980a.

📒 Files selected for processing (5)
  • internal/libs/sync/mutexguard_test.go (1 hunks)
  • internal/statesync/reactor.go (22 hunks)
  • libs/math/safemath.go (5 hunks)
  • libs/math/safemath_test.go (2 hunks)
  • types/validator_test.go (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
libs/math/safemath_test.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test_apps
  • GitHub Check: test_abci_cli
  • GitHub Check: Super linter
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (12)
libs/math/safemath.go (2)

9-9: LGTM! Good error consolidation.

Consolidating multiple specific overflow errors into a single ErrOverflow simplifies error handling while maintaining descriptive error messages.


117-158: LGTM! Robust generic implementation.

The SafeConvert function is well-implemented with:

  • Efficient platform-specific size checks using constants
  • Proper handling of edge cases for int64/uint64
  • Comprehensive overflow checks using Max/Min
types/validator_test.go (3)

72-73: LGTM! Good practice to extract magic number into a constant.

The introduction of minPeers constant improves code maintainability by centralizing the minimum peer requirement.


141-148: LGTM! Well-documented interface definition.

The ConsensusStateProvider interface is well-documented and follows the interface segregation principle with minimal required methods.


1000-1000: LGTM! Improved logging granularity.

Good change from Info to Trace level for peer updates, which reduces log noise while maintaining observability.

Also applies to: 1009-1009

internal/libs/sync/mutexguard_test.go (2)

1-12: LGTM! Well-organized imports and package declaration.

The package is correctly named sync_test for external tests, and imports are properly organized with standard library packages first, followed by third-party and internal packages.


48-68:

❓ Verification inconclusive

Add race condition verification.

While the test correctly verifies concurrent access, it would be beneficial to explicitly run it with the race detector and verify the absence of data races.


🏁 Script executed:

#!/bin/bash
# Description: Check if tests pass with race detection enabled
go test -race ./internal/libs/sync/...

Length of output: 139


Update test instructions to enable race detection by setting CGO_ENABLED=1

While the test correctly verifies concurrent access, it's recommended to run it with the race detector to confirm the absence of data races. However, note that the race detector requires cgo to be enabled. Please update your test command accordingly:

  • Run the tests using:
    CGO_ENABLED=1 go test -race ./internal/libs/sync/...
internal/statesync/reactor.go (5)

24-24: LGTM! Good improvements to code organization.

The changes improve code maintainability by:

  1. Adding safe integer conversion utilities via tmmath
  2. Centralizing the minimum peer requirement into a constant instead of using magic numbers

Also applies to: 72-73


138-138: Great use of interface-based design!

The introduction of ConsensusStateProvider interface improves:

  1. Testability by enabling mock implementations
  2. Modularity by decoupling the reactor from concrete consensus state
  3. Code clarity through focused, well-documented interface

Also applies to: 141-148
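
The testability benefit can be shown with a small sketch. The interface body is an assumption: only the GetCurrentHeight method is mentioned in the change summary, and its `int64` return type, the `fixedHeight` test double, and the trimmed-down `reactor` struct are all hypothetical.

```go
package main

import "fmt"

// ConsensusStateProvider is the narrow view of consensus state that the
// reactor depends on. The method set shown here is assumed.
type ConsensusStateProvider interface {
	GetCurrentHeight() int64
}

// fixedHeight is a hand-written test double that always reports the same
// height; a generated mock would play the same role.
type fixedHeight int64

func (h fixedHeight) GetCurrentHeight() int64 { return int64(h) }

// reactor holds the interface rather than a concrete consensus state,
// so tests can inject any implementation.
type reactor struct {
	csState ConsensusStateProvider
}

func (r *reactor) describe() string {
	return fmt.Sprintf("consensus height: %d", r.csState.GetCurrentHeight())
}

func main() {
	r := &reactor{csState: fixedHeight(42)}
	fmt.Println(r.describe())
}
```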


292-306: Excellent error handling and fallback mechanism!

The changes improve system resilience by:

  1. Gracefully handling the case when no snapshots are available
  2. Providing a fallback to block sync
  3. Using clear, informative error messages

1068-1104: Robust snapshot safety improvements!

The changes enhance data consistency by:

  1. Adding consensus state validation
  2. Ensuring snapshots are only accepted when the next block is finalized
  3. Including helpful debug logging

1000-1000: Appropriate adjustment of logging levels!

Changed peer update logging from Info to Trace level, which:

  1. Reduces log noise for frequent events
  2. Maintains consistency with other reactor implementations
  3. Still provides detailed debugging capability when needed

Also applies to: 1009-1009

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
libs/math/safemath.go (3)

93-111: Add parameter documentation to conversion functions.

The functions are well-implemented, but the parameter documentation is missing. Add descriptions for the generic type parameter T.

-// SafeConvertUint64 takes a int and checks if it overflows.
+// SafeConvertUint64 takes a value of type T and converts it to uint64, checking for overflow.
+// Type parameter T must satisfy the Integer constraint.
 func SafeConvertUint64[T Integer](a T) (uint64, error) {

119-158: Simplify the type switch in SafeConvert.

The type switch could be simplified by removing redundant comments and empty returns.

 func SafeConvert[F Integer, T Integer](from F) (T, error) {
     const uintIsSmall = math.MaxUint < math.MaxUint64
     const intIsSmall = math.MaxInt < math.MaxInt64 && math.MinInt > math.MinInt64

     switch any(from).(type) {
     case int64:
         if _, ok := any(T(0)).(uint64); ok && from < 0 {
             return 0, ErrOverflow
         }
-        // return T(from), nil
     case uint64:
         if _, ok := any(T(0)).(int64); ok && uint64(from) > math.MaxInt64 {
             return 0, ErrOverflow
         }
-        // return T(from), nil
     case int:
         if !intIsSmall {
             return SafeConvert[int64, T](int64(from))
         }
-        // no return here - it's safe to use normal logic
     case uint:
         if !uintIsSmall {
             return SafeConvert[uint64, T](uint64(from))
         }
-        // no return here - it's safe to use normal logic
     }

263-312: Make panic messages consistent between Max and Min.

The error messages for unsupported types should be consistent:

  • Max uses: unsupported type %T
  • Min uses: unsupported type
-               panic("unsupported type")
+               panic(fmt.Sprintf("unsupported type %T", T(0)))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fab980a and 2e8eee5.

📒 Files selected for processing (1)
  • libs/math/safemath.go (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: test_apps
  • GitHub Check: test_abci_cli
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: tests (01)
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: Super linter
🔇 Additional comments (2)
libs/math/safemath.go (2)

9-9: LGTM! Good error consolidation.

Consolidating multiple specific overflow errors into a single ErrOverflow reduces code duplication and improves maintainability.


26-28: LGTM! Consistent error handling.

The error handling is consistently updated across all functions to use the consolidated ErrOverflow.

Also applies to: 36-38, 66-68, 76-78, 86-88, 224-226, 234-236

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
libs/math/safemath_test.go (2)

89-207: LGTM! Comprehensive test coverage for safe integer conversions.

The test suite thoroughly validates integer type conversions, including edge cases and boundary conditions. The table-driven approach with descriptive test names enhances maintainability.

Consider these improvements:

  1. Add test cases for:
    • Conversions between uint16/uint8
    • Conversions to/from custom integer types
  2. Extract type conversion logic into helper functions to reduce switch statement complexity:
func convertValue[From, To any](t *testing.T, i int, from From) (To, error) {
    switch any(from).(type) {
    case int:
        return SafeConvert[From, To](from)
    // ... other cases
    default:
        t.Fatalf("unsupported source type %T", from)
        return *new(To), nil
    }
}

209-213: Add more test cases for panic behavior.

While the current test cases cover basic scenarios, consider adding more cases to thoroughly test panic behavior:

func TestMustConvertPanics(t *testing.T) {
    assert.NotPanics(t, func() { MustConvert[int32, int32](0) })
    assert.Panics(t, func() { MustConvert[int32, int16](math.MaxInt16 + 1) })
    assert.NotPanics(t, func() { MustConvert[int32, int16](math.MaxInt16) })
    
    // Additional test cases
    assert.Panics(t, func() { MustConvert[int64, uint64](math.MinInt64) })
    assert.Panics(t, func() { MustConvert[uint64, int64](math.MaxUint64) })
    assert.NotPanics(t, func() { MustConvert[int16, uint32](math.MaxInt16) })
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e8eee5 and 55d9001.

📒 Files selected for processing (1)
  • libs/math/safemath_test.go (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
libs/math/safemath_test.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: tests (01)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: golangci-lint
  • GitHub Check: Build (amd64, linux)
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: Super linter

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/statesync/reactor.go (1)

1068-1104: Document the block height requirement.

While the code comments explain the need for finalized blocks, consider adding more detailed documentation about:

  1. Why exactly 2 blocks are needed
  2. The relationship between snapshot height and block finalization
  3. The potential edge cases this prevents
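
One possible reading of the requirement, written out as code, is sketched below. The exact rule used in reactor.go is not shown in this thread, so the margin constant, the function name `snapshotUsable`, and the interpretation of "current height" as the height consensus is currently working on are all assumptions.

```go
package main

import "fmt"

// requiredBlocksAhead is an assumed margin: for a snapshot at height H,
// block H+1 must already be finalized, which means consensus must be
// working on height H+2 or later.
const requiredBlocksAhead = 2

// snapshotUsable reports whether a snapshot taken at snapshotHeight may be
// used, given the height consensus is currently working on.
func snapshotUsable(snapshotHeight, currentHeight int64) bool {
	return currentHeight >= snapshotHeight+requiredBlocksAhead
}

func main() {
	fmt.Println(snapshotUsable(100, 101)) // block 101 is still in progress
	fmt.Println(snapshotUsable(100, 102)) // block 101 is finalized
}
```

Documenting the rule next to such a check would answer the three questions above in one place.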
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55d9001 and 8eeccd4.

📒 Files selected for processing (1)
  • internal/statesync/reactor.go (22 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: test_abci_cli
  • GitHub Check: test_apps
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: tests (01)
  • GitHub Check: Super linter
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: golangci-lint
🔇 Additional comments (5)
internal/statesync/reactor.go (5)

24-24: LGTM! Good addition of safe math utilities.

The addition of tmmath import enables safe integer conversions, helping prevent potential overflow issues.


72-73: LGTM! Good constant extraction.

Extracting minPeers into a constant improves maintainability by centralizing the minimum peer requirement.


141-148: LGTM! Well-designed interface for improved testability.

The ConsensusStateProvider interface is well-documented and follows interface segregation principle by including only the required methods.


138-138: LGTM! Good dependency inversion.

Replacing the concrete type with ConsensusStateProvider interface follows the Dependency Inversion Principle, improving testability and flexibility.


292-306:

✅ Verification successful

Verify the error variable definition.

The error handling for no snapshots case looks good, but please ensure that errNoSnapshots is properly defined.


🏁 Script executed:

#!/bin/bash
# Search for errNoSnapshots definition
rg "errNoSnapshots.*=.*" internal/statesync/

Length of output: 240


Verified: errNoSnapshots is properly defined.

The error variable is defined in internal/statesync/syncer.go as errNoSnapshots = errors.New("no suitable snapshots found"), so the handling in internal/statesync/reactor.go is correct.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/statesync/reactor.go (1)

877-877: Consider using non-panicking integer conversion.

While tmmath.MustConvertInt64 prevents overflow issues, it can panic on invalid input. Consider using the non-panicking version that returns an error which can be handled gracefully.

Example approach:

-height := tmmath.MustConvertInt64(msg.Height)
+height, err := tmmath.SafeConvertInt64(msg.Height)
+if err != nil {
+    return fmt.Errorf("invalid height: %w", err)
+}

Also applies to: 1101-1101, 1120-1120

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8eeccd4 and a61dfd6.

📒 Files selected for processing (1)
  • internal/statesync/reactor.go (22 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: test_abci_cli
  • GitHub Check: test_apps
  • GitHub Check: build
  • GitHub Check: tests (05)
  • GitHub Check: tests (03)
  • GitHub Check: tests (02)
  • GitHub Check: tests (01)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: golangci-lint
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: Super linter
🔇 Additional comments (5)
internal/statesync/reactor.go (5)

71-74: LGTM! Good refactoring of the minimum peers constant.

Extracting the hardcoded value into a named constant improves maintainability and makes the code more self-documenting.


141-148: LGTM! Well-designed interface abstraction.

The ConsensusStateProvider interface improves testability and follows the Interface Segregation Principle by defining only the required methods.


292-306: LGTM! Improved error handling with graceful fallback.

The enhanced error handling now:

  1. Distinguishes between no snapshots and other errors
  2. Provides a graceful fallback to block sync when no snapshots are available
  3. Uses clear and informative error messages

1068-1104: LGTM! Enhanced snapshot safety with finalization check.

The addition of the finalization check is a critical safety improvement that ensures:

  1. Only snapshots with finalized next blocks are accepted
  2. The node won't sync to unfinalized state
  3. Appropriate debug logging for troubleshooting

1000-1000: LGTM! Appropriate adjustment of logging levels.

Changed peer update logging from Info to Trace level, which is more appropriate for these frequent events and aligns with similar changes in other reactor files.

Also applies to: 1009-1009

@lklimek lklimek mentioned this pull request Mar 4, 2025
5 tasks
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Apr 20, 2025
@lklimek lklimek removed the Stale label Apr 24, 2025
@lklimek lklimek merged commit 33cda2d into v1.5-dev Apr 25, 2025
19 checks passed
@lklimek lklimek deleted the feat/statesync-improvements branch April 25, 2025 07:28