-
Notifications
You must be signed in to change notification settings - Fork 4.1k
refactor(cosmovisor)!: re-implement core logic to address reliability issues #24821
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Ironbird - launch a networkTo use Ironbird, you can use the following commands:
Custom Load Test ConfigurationYou can provide a custom load test configuration using the `--load-test-config=` flag:Use |
📝 WalkthroughWalkthroughThis set of changes introduces a major refactor and enhancement of Cosmovisor's upgrade management, particularly for manual upgrades. The core process management, upgrade detection, and watcher logic have been rewritten with new internal modules, improved configuration handling, and new CLI commands. The upgrade flow now robustly supports both governance-triggered and manually scheduled upgrades, with extensive new tests and documentation updates. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Cosmovisor
participant Node
participant FileSystem
participant RPC
User->>Cosmovisor: add-upgrade (manual or batch)
Cosmovisor->>FileSystem: Write manual upgrade batch (JSON)
User->>Cosmovisor: run (start node)
Cosmovisor->>Node: Launch node process
loop Watch for upgrade
Cosmovisor->>FileSystem: Watch/Read upgrade-info.json & manual batch
Cosmovisor->>RPC: Poll /block endpoint for height
alt Height matches manual upgrade
Cosmovisor->>Node: Shutdown node
Cosmovisor->>FileSystem: Set current binary to upgrade binary
Cosmovisor->>Node: Restart node with upgraded binary
else Height matches governance upgrade
Cosmovisor->>Node: Shutdown node
Cosmovisor->>FileSystem: Set current binary to upgrade binary
Cosmovisor->>Node: Restart node with upgraded binary
end
end
Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
✨ Finishing Touches
🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
♻️ Duplicate comments (28)
simapp/upgrades.go (1)
40-54: Address integer conversion security issue.The conversion from
uint64toint64on line 48 lacks bounds checking, which could cause integer overflow if the parsed height exceedsmath.MaxInt64.Apply this fix to add bounds checking:
+import ( + "math" + // ... other imports +) if manualUpgrade, ok := os.LookupEnv("SIMAPP_MANUAL_UPGRADE_HEIGHT"); ok { height, err := strconv.ParseUint(manualUpgrade, 10, 64) if err != nil { panic("invalid SIMAPP_MANUAL_UPGRADE_HEIGHT height: " + err.Error()) } + if height > math.MaxInt64 { + panic("SIMAPP_MANUAL_UPGRADE_HEIGHT exceeds maximum allowed value") + } upgradeInfo = upgradetypes.Plan{ Name: UpgradeName, Height: int64(height), }tools/cosmovisor/internal/watchers/file_poll_watcher.go (1)
16-35: Validate file path to address security concern.The file reading operation could be vulnerable to path traversal attacks if the filename parameter comes from untrusted input.
Consider adding path validation:
+import ( + "path/filepath" + // ... other imports +) func NewFilePollWatcher(ctx context.Context, errorHandler ErrorHandler, filename string, pollInterval time.Duration) Watcher[[]byte] { + // Validate and clean the file path to prevent path traversal + filename = filepath.Clean(filename) + if filepath.IsAbs(filename) { + // Additional validation for absolute paths if needed + } + stat, err := os.Stat(filename)Please verify that the
filenameparameter is properly validated at the caller level and cannot be manipulated by untrusted input.tools/cosmovisor/cmd/cosmovisor/run.go (1)
47-53: Document the rationale for the shutdown goroutine.The goroutine ensures process termination even if context cancellation doesn't exit cleanly. While this may trigger non-determinism warnings, it appears intentional for reliability. Consider adding a comment explaining why this fallback mechanism is necessary.
tools/cosmovisor/internal/watchers/data_watcher.go (1)
11-36: Well-designed generic watcher with proper resource management.The implementation effectively combines generics with the watcher pattern. The goroutine management is appropriate despite the security scanner flag - it's properly bounded by context cancellation and channel closure.
The goroutine spawning is flagged by the security scanner but is necessary for the asynchronous watcher pattern and is properly managed with context cancellation.
tools/cosmovisor/internal/watchers/fsnotify_watcher.go (2)
38-74: Goroutine management is appropriate despite security scanner flag.The goroutine spawning is flagged by the security scanner but is necessary for the asynchronous file watching pattern. The implementation properly handles context cancellation and resource cleanup.
The goroutine is properly bounded by context cancellation and channel closure, making it safe despite the non-determinism warning.
60-60: File inclusion is safe in this context.While the security scanner flags potential file inclusion via variable, this is safe because:
- The filename comes from validated fsnotify events
- It's filtered against a predefined set of allowed filenames
- The file is only read, not executed
The file reading is safe as it's bounded by the filename whitelist and only performs read operations.
tools/cosmovisor/internal/process.go (1)
9-27: Well-designed process management abstraction.The
ProcessRunnerimplementation provides a clean abstraction for managing external processes with proper synchronization via channels and graceful lifecycle management.tools/cosmovisor/internal/watchers/poll_watcher.go (1)
27-53: Proper async polling implementation with context handling.The goroutine implementation correctly:
- Handles context cancellation for graceful shutdown
- Uses ticker for consistent polling intervals
- Closes output channel on exit
- Filters file not found errors appropriately
tools/cosmovisor/internal/watchers/hybrid_watcher.go (1)
20-41: Goroutine management is handled correctly.The goroutine properly:
- Defers channel cleanup
- Handles context cancellation
- Exits on channel closure
- Filters directory updates by filename
This addresses potential non-determinism concerns through proper synchronization and cleanup.
x/upgrade/keeper/keeper.go (3)
44-44: Verify backward compatibility of codec interface change.Changing from
codec.BinaryCodectocodec.Codeccould potentially be a breaking change for consumers who depend on the specific interface type.Verify that all existing usage of the keeper's codec field is compatible with the more general
Codecinterface:#!/bin/bash # Search for direct access to keeper.cdc field rg -A 3 -B 3 "\.cdc\." --type go # Search for keeper construction to see what codec types are passed rg -A 5 -B 5 "NewKeeper.*codec" --type go
540-540: LGTM: Improved JSON marshaling consistency.Using the keeper's codec for JSON marshaling instead of the standard library ensures consistency with the rest of the codebase and proper protobuf handling.
610-622: Address static analysis warnings about unused assignments.The static analysis correctly identifies that
manualUpgradeInfoassignments are not read elsewhere, which could indicate missing functionality or dead code.Verify that the manual upgrade functionality is complete:
#!/bin/bash # Search for code that reads the manual upgrade info rg -A 5 -B 5 "GetManualUpgrade\|manualUpgradeInfo" --type go # Search for where manual upgrades are processed/applied rg -A 5 -B 5 "manual.*upgrade.*apply\|apply.*manual.*upgrade" --type gotools/cosmovisor/cmd/mock_node/main.go (5)
123-130: Validate file path to prevent directory traversal attacks.The
actualHeightFilepath is constructed from user input without validation, which could allow directory traversal attacks.Consider adding path validation:
actualHeightFile := path.Join(n.homePath, "data", "actual-height") +// Ensure the path is within the expected directory +actualHeightFile = path.Clean(actualHeightFile) +if !strings.HasPrefix(actualHeightFile, path.Clean(n.homePath)) { + return fmt.Errorf("invalid height file path") +}
152-152: Use more restrictive file permissions.-err := os.WriteFile(actualHeightFile, []byte(fmt.Sprintf("%d", n.height)), 0o644) +err := os.WriteFile(actualHeightFile, []byte(fmt.Sprintf("%d", n.height)), 0o600)
178-178: Use more restrictive directory permissions.-err = os.MkdirAll(path.Dir(upgradeInfoPath), 0o755) +err = os.MkdirAll(path.Dir(upgradeInfoPath), 0o750)
182-182: Use more restrictive file permissions.-err = os.WriteFile(upgradeInfoPath, []byte(out), 0o644) +err = os.WriteFile(upgradeInfoPath, []byte(out), 0o600)
214-216: Configure ReadHeaderTimeout to prevent Slowloris attacks.srv := &http.Server{ - Addr: n.httpAddr, + Addr: n.httpAddr, + ReadHeaderTimeout: 10 * time.Second, }tools/cosmovisor/manual.go (2)
67-71: Ensure deterministic ordering when iterating over map.Map iteration in Go is non-deterministic, which could cause issues in a blockchain context where deterministic behavior is required.
The upgrades are sorted after appending, but consider using a slice-based approach from the start:
-var newUpgrades ManualUpgradeBatch -for _, plan := range planMap { - newUpgrades = append(newUpgrades, plan) -} +// Extract keys and sort them first for deterministic iteration +var keys []string +for name := range planMap { + keys = append(keys, name) +} +sort.Strings(keys) + +var newUpgrades ManualUpgradeBatch +for _, name := range keys { + newUpgrades = append(newUpgrades, planMap[name]) +}
103-103: Use more restrictive file permissions for upgrade configuration.-return os.WriteFile(cfg.UpgradeInfoBatchFilePath(), manualUpgradesData, 0644) +return os.WriteFile(cfg.UpgradeInfoBatchFilePath(), manualUpgradesData, 0600)tools/cosmovisor/internal/upgrader.go (4)
154-154: Subprocess launched with variable.
223-224: Calling the system time.
207-207: Subprocess launched with variable.
235-235: Calling the system time.systemtests/system.go (1)
225-228: Subprocess launched with variable.tools/cosmovisor/internal/runner.go (1)
115-115: Subprocess launched with variable.tools/cosmovisor/args.go (3)
459-482: Current binary upgrade info methods are well-implemented.The methods properly handle missing upgrade info files by returning nil, and the convenience method for extracting the upgrade name is useful.
Regarding the security scan alert about file inclusion at line 462: The path is safely constructed from trusted configuration values (
cfg.Root()and thecurrentLinkconstant), so this is not a security vulnerability.
501-514: Consider logging errors in ReadLastKnownHeight.The method silently returns 0 on any error (file not found, parse error, etc.). While this provides robustness, it might hide configuration issues. Consider at least logging errors for debugging purposes.
Regarding the security scan alert: The file path is safely constructed from trusted configuration values.
516-519: Use more restrictive file permissions.The file contains operational state data and doesn't need to be world-readable. Use 0600 permissions as suggested by the security scan.
Apply this fix:
- return os.WriteFile(filename, []byte(strconv.FormatUint(height, 10)), 0644) + return os.WriteFile(filename, []byte(strconv.FormatUint(height, 10)), 0600)
🧹 Nitpick comments (21)
tools/cosmovisor/internal/errors.go (1)
7-11: Consider simplifying the error message implementation.The error type is well-structured and follows Go conventions. However, since there's no formatting involved, you can simplify the implementation:
func (e ErrRestartNeeded) Error() string { - return fmt.Sprintf("upgrade needed") + return "upgrade needed" }Alternatively, consider a more descriptive message like "daemon restart required due to upgrade".
tools/cosmovisor/internal/watchers/data_watcher_test.go (1)
38-55: Consider using testify's Eventually for more reliable timing.The test uses fixed sleep durations which could lead to flaky tests on slower systems. Consider using
require.Eventuallyto wait for conditions instead of fixed sleeps.Example refactor for waiting on file updates:
- // wait a bit to ensure the watcher has time to pick up the change - // then cancel the context - time.Sleep(time.Second) + // wait for the watcher to pick up the change + require.Eventually(t, func() bool { + select { + case <-dataWatcher.Updated(): + return true + default: + return false + } + }, 5*time.Second, 100*time.Millisecond) cancel()tools/cosmovisor/internal/skip.go (1)
12-12: Address the TODO comment.The TODO comment suggests uncertainty about whether this functionality should be retained. This needs to be resolved before merging.
Do you want me to help analyze the usage of this functionality across the codebase to determine if it's still needed, or should this be tracked in a separate issue?
tools/cosmovisor/cmd/cosmovisor/batch_upgrade.go (1)
105-107: Simplify the defer statement for file closing.The anonymous function wrapper is unnecessary for a simple method call.
- defer func(file *os.File) { - _ = file.Close() - }(file) + defer file.Close()tools/cosmovisor/internal/watchers/http_block_checker.go (1)
82-91: Improve error handling for height parsing.The
strconv.ParseUinterror is not wrapped, making debugging harder.func getHeightFromRPCBlockResponse(bz []byte) (uint64, error) { var response Response err := json.Unmarshal(bz, &response) if err != nil { return 0, fmt.Errorf("failed to unmarshal block response: %w", err) } height := response.Result.Block.Header.Height - return strconv.ParseUint(height, 10, 64) + parsed, err := strconv.ParseUint(height, 10, 64) + if err != nil { + return 0, fmt.Errorf("failed to parse height '%s': %w", height, err) + } + return parsed, nil }tools/cosmovisor/state_machine.puml (1)
1-68: Consider completing or removing commented states.The diagram provides good high-level flow but has extensive commented-out sections (lines 28-66) that detail important upgrade steps like backup, pre-upgrade customization, and signal watching.
Consider either:
- Uncommenting and completing these detailed states if they represent the actual implementation
- Removing them if they're outdated or not implemented
- Adding a note explaining why they're commented
This would improve the diagram's utility as documentation for the actual cosmovisor behavior.
x/upgrade/keeper/keeper.go (1)
624-626: Consider adding validation for manual upgrade retrieval.The getter method returns the raw pointer without any validation or defensive copying.
Consider adding documentation about the lifetime and mutability expectations:
+// GetManualUpgrade returns the current manual upgrade plan, if any. +// Returns nil if no manual upgrade is set. +// The returned plan should not be modified by the caller. func (k *Keeper) GetManualUpgrade() *types.Plan { return k.manualUpgradeInfo }tools/cosmovisor/internal/watchers/height_watcher.go (1)
23-26: Remove the extra blank line.watcher.PollWatcher = NewPollWatcher[uint64](errorHandler, func() (uint64, error) { return watcher.ReadNow() - }, pollInterval)tools/cosmovisor/README.md (6)
55-56: Add missing comma for clarity.-Release branches have the following format `release/cosmovisor/vA.B.x`, where A and B are a number (e.g. +Release branches have the following format `release/cosmovisor/vA.B.x`, where A and B are a number (e.g.,
121-122: Add missing comma after "Note".- is only after the upgrade and does not auto-restart the subprocess after an error occurs. Note restart + is only after the upgrade and does not auto-restart the subprocess after an error occurs. Note, restart
244-245: Add comma after introductory phrase.- upgrade name. -* If neither `cosmovisor/current/upgrade-info.json` nor `data/upgrade-info.json` exist, then `cosmovisor` will wait for + upgrade name. +* If neither `cosmovisor/current/upgrade-info.json` nor `data/upgrade-info.json` exist, then `cosmovisor` will wait for `data/upgrade-info.json` file to trigger an upgrade. * If `cosmovisor/current/upgrade-info.json` doesn't exist but `data/upgrade-info.json` exists, then `cosmovisor` assumes - that whatever is in `data/upgrade-info.json` is a valid upgrade request. In this case `cosmovisor` tries immediately + that whatever is in `data/upgrade-info.json` is a valid upgrade request. In this case, `cosmovisor` tries immediately
298-299: Fix grammar in infinitive construction.-Using the `--upgrade-height` flag allows to specify at which height the binary should be switched, without going via a +Using the `--upgrade-height` flag allows you to specify at which height the binary should be switched, without going via a
349-350: Add comma after conditional clause.- When submitting this as a proposal ensure there are no spaces. An example command using `gaiad` could look like: + When submitting this as a proposal, ensure there are no spaces. An example command using `gaiad` could look like:
380-381: Add comma after introductory phrase.-ensures that no false binary is run, even if someone hacks the server or hijacks the DNS. `go-getter` will always ensure -the downloaded file matches the checksum if it is provided. `go-getter` will also handle unpacking archives into -directories (in this case the download link should point to a `zip` file of all data in the `bin` directory). +ensures that no false binary is run, even if someone hacks the server or hijacks the DNS. `go-getter` will always ensure +the downloaded file matches the checksum if it is provided. `go-getter` will also handle unpacking archives into +directories (in this case, the download link should point to a `zip` file of all data in the `bin` directory).tools/cosmovisor/manual.go (1)
83-88: Simplify conditional logic.The else block after continue is unnecessary.
for _, existing := range manualUpgrades { if existing.Height == height { continue - } else { - newUpgrades = append(newUpgrades, existing) } + newUpgrades = append(newUpgrades, existing) }tests/systemtests/cosmovisor_test.go (2)
38-38: Consider making the legacy binary path configurable.The hardcoded path to the legacy binary could break if the directory structure changes. Consider making this configurable through an environment variable or test parameter.
-legacyBinary := systest.WorkDir + "/binaries/v0.53/simd" +legacyBinary := os.Getenv("LEGACY_BINARY_PATH") +if legacyBinary == "" { + legacyBinary = systest.WorkDir + "/binaries/v0.53/simd" +}
42-43: Voting period might be too short for CI environments.A 5-second voting period could be fragile in slow or loaded CI environments, potentially causing test flakiness.
-votingPeriod := 5 * time.Second // enough time to vote +votingPeriod := 10 * time.Second // enough time to vote in CI environmentstools/cosmovisor/internal/upgrader.go (1)
145-145: Use named constant for file permissions.The magic number
0o100should be replaced with a named constant for better readability.+const executePermission = 0o100 // Execute permission for owner + // Set the execute bit for only the current user // Given: Current user - Group - Everyone // 0o RWX - RWX - RWX oldMode := info.Mode().Perm() -newMode := oldMode | 0o100 +newMode := oldMode | executePermissiontools/cosmovisor/cmd/cosmovisor/mockchain_test.go (1)
200-200: Consider making the expected callback count more flexible.The hardcoded expected callback count could make the test fragile if timing changes or if the test logic is modified.
-require.Equal(t, 8, callbackCount) +require.GreaterOrEqual(t, callbackCount, 8, "should have at least 8 callbacks for all upgrade stages") +require.LessOrEqual(t, callbackCount, 10, "should not have excessive callbacks indicating a problem")systemtests/system.go (1)
219-220: Consider making the Cosmovisor path configurable.The hardcoded path to the Cosmovisor binary could break if the directory structure changes or in different environments.
func (s *SystemUnderTest) cosmovisorPath() string { - return filepath.Join(WorkDir, "binaries", "cosmovisor") + path := os.Getenv("COSMOVISOR_PATH") + if path == "" { + path = filepath.Join(WorkDir, "binaries", "cosmovisor") + } + return path }tools/cosmovisor/internal/runner.go (1)
175-176: Consider logging errors from cleanup operations.Errors from
GetLatestBlockHeightandShutdownare ignored in the defer block, which could hide important issues during cleanup.defer func() { // always check for the latest block height before shutting down so that we have it in the last known height file - _, _ = heightChecker.GetLatestBlockHeight() - _ = processRunner.Shutdown(r.cfg.ShutdownGrace) + if _, err := heightChecker.GetLatestBlockHeight(); err != nil { + r.logger.Warn("Failed to get latest block height during shutdown", "error", err) + } + if err := processRunner.Shutdown(r.cfg.ShutdownGrace); err != nil { + r.logger.Warn("Failed to shutdown process gracefully", "error", err) + } }()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Knowledge Base: Disabled due to Reviews > Disable Knowledge Base setting
⛔ Files ignored due to path filters (2)
tools/cosmovisor/go.modis excluded by!**/*.modtools/cosmovisor/go.sumis excluded by!**/*.sum,!**/*.sum
📒 Files selected for processing (53)
Makefile(1 hunks)simapp/upgrades.go(2 hunks)systemtests/system.go(3 hunks)tests/systemtests/cosmovisor_test.go(1 hunks)tests/systemtests/upgrade_test.go(1 hunks)tools/cosmovisor/CHANGELOG.md(3 hunks)tools/cosmovisor/README.md(14 hunks)tools/cosmovisor/args.go(10 hunks)tools/cosmovisor/args_test.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/add_upgrade.go(4 hunks)tools/cosmovisor/cmd/cosmovisor/batch_upgrade.go(5 hunks)tools/cosmovisor/cmd/cosmovisor/config.go(3 hunks)tools/cosmovisor/cmd/cosmovisor/help.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/help_test.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/init.go(2 hunks)tools/cosmovisor/cmd/cosmovisor/init_test.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/main.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/mockchain_test.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/prepare_upgrade.go(3 hunks)tools/cosmovisor/cmd/cosmovisor/root.go(2 hunks)tools/cosmovisor/cmd/cosmovisor/run.go(3 hunks)tools/cosmovisor/cmd/cosmovisor/run_config.go(2 hunks)tools/cosmovisor/cmd/cosmovisor/show_upgrade.go(1 hunks)tools/cosmovisor/cmd/cosmovisor/version.go(3 hunks)tools/cosmovisor/cmd/mock_node/main.go(1 hunks)tools/cosmovisor/internal/backoff.go(1 hunks)tools/cosmovisor/internal/errors.go(1 hunks)tools/cosmovisor/internal/process.go(1 hunks)tools/cosmovisor/internal/process_test.go(7 hunks)tools/cosmovisor/internal/runner.go(1 hunks)tools/cosmovisor/internal/skip.go(1 hunks)tools/cosmovisor/internal/testing.go(1 hunks)tools/cosmovisor/internal/upgrade.go(2 hunks)tools/cosmovisor/internal/upgrade_test.go(6 hunks)tools/cosmovisor/internal/upgrader.go(1 hunks)tools/cosmovisor/internal/watchers/data_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/data_watcher_test.go(1 hunks)tools/cosmovisor/internal/watchers/file_poll_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/file_poll_watcher_test.go(1 hunks)tools/cosmovisor/internal/watchers/fsnotify_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/height_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/http_block_checker.go(1 hunks)tools/cosmovisor/internal/watchers/hybrid_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/poll_watcher.go(1 hunks)tools/cosmovisor/internal/watchers/watcher.go(1 hunks)tools/cosmovisor/manual.go(1 hunks)tools/cosmovisor/process.go(0 hunks)tools/cosmovisor/scanner.go(0 hunks)tools/cosmovisor/scanner_test.go(3 hunks)tools/cosmovisor/state_machine.puml(1 hunks)x/upgrade/CHANGELOG.md(1 hunks)x/upgrade/abci.go(1 hunks)x/upgrade/keeper/keeper.go(5 hunks)
💤 Files with no reviewable changes (2)
- tools/cosmovisor/process.go
- tools/cosmovisor/scanner.go
🧰 Additional context used
🧬 Code Graph Analysis (18)
tools/cosmovisor/cmd/cosmovisor/root.go (1)
tools/cosmovisor/cmd/cosmovisor/show_upgrade.go (1)
NewShowManualUpgradesCmd(10-36)
tools/cosmovisor/cmd/cosmovisor/prepare_upgrade.go (1)
tools/cosmovisor/internal/upgrade.go (1)
GetBinaryURL(71-81)
tools/cosmovisor/cmd/cosmovisor/run_config.go (1)
tools/cosmovisor/internal/runner.go (1)
RunConfig(260-264)
tools/cosmovisor/cmd/cosmovisor/config.go (2)
tools/cosmovisor/args.go (2)
Config(58-82)GetConfigFromFile(182-223)tools/cosmovisor/flags.go (1)
FlagCosmovisorConfig(9-9)
tools/cosmovisor/internal/watchers/file_poll_watcher.go (2)
tools/cosmovisor/internal/watchers/watcher.go (2)
ErrorHandler(17-22)Watcher(11-14)tools/cosmovisor/internal/watchers/poll_watcher.go (1)
NewPollWatcher(17-25)
simapp/upgrades.go (3)
version/version.go (1)
Name(31-31)log/logger.go (1)
Logger(39-63)math/uint.go (1)
ParseUint(240-246)
tools/cosmovisor/internal/watchers/fsnotify_watcher.go (1)
tools/cosmovisor/internal/watchers/watcher.go (1)
Watcher(11-14)
tools/cosmovisor/cmd/cosmovisor/run.go (3)
tools/cosmovisor/cmd/cosmovisor/run_config.go (1)
RunOption(19-19)tools/cosmovisor/args.go (1)
GetConfigFromFile(182-223)tools/cosmovisor/internal/runner.go (1)
NewRunner(26-32)
tools/cosmovisor/internal/watchers/data_watcher_test.go (4)
tools/cosmovisor/internal/watchers/watcher.go (1)
DebugLoggerErrorHandler(38-40)log/testing.go (1)
NewTestLogger(23-25)tools/cosmovisor/internal/watchers/file_poll_watcher.go (1)
NewFilePollWatcher(10-39)tools/cosmovisor/internal/watchers/data_watcher.go (1)
NewDataWatcher(11-36)
tools/cosmovisor/internal/skip.go (1)
tools/cosmovisor/flags.go (1)
FlagSkipUpgradeHeight(5-5)
tools/cosmovisor/cmd/cosmovisor/add_upgrade.go (3)
tools/cosmovisor/flags.go (2)
FlagForce(7-7)FlagUpgradeHeight(8-8)x/upgrade/client/cli/tx.go (1)
FlagUpgradeHeight(23-23)tools/cosmovisor/args.go (1)
Config(58-82)
tools/cosmovisor/internal/watchers/poll_watcher.go (1)
tools/cosmovisor/internal/watchers/watcher.go (1)
ErrorHandler(17-22)
tools/cosmovisor/internal/watchers/data_watcher.go (1)
tools/cosmovisor/internal/watchers/watcher.go (2)
ErrorHandler(17-22)Watcher(11-14)
tools/cosmovisor/internal/watchers/hybrid_watcher.go (3)
tools/cosmovisor/internal/watchers/watcher.go (2)
Watcher(11-14)ErrorHandler(17-22)tools/cosmovisor/internal/watchers/fsnotify_watcher.go (1)
FSNotifyWatcher(12-15)tools/cosmovisor/internal/watchers/file_poll_watcher.go (1)
NewFilePollWatcher(10-39)
tools/cosmovisor/internal/watchers/http_block_checker.go (1)
tools/cosmovisor/internal/watchers/height_watcher.go (1)
HeightChecker(8-10)
tools/cosmovisor/internal/watchers/watcher.go (4)
tools/cosmovisor/internal/watchers/fsnotify_watcher.go (1)
FSNotifyWatcher(12-15)tools/cosmovisor/internal/watchers/hybrid_watcher.go (1)
NewHybridWatcher(15-47)tools/cosmovisor/internal/watchers/data_watcher.go (1)
NewDataWatcher(11-36)tools/cosmovisor/internal/watchers/file_poll_watcher.go (1)
NewFilePollWatcher(10-39)
tools/cosmovisor/internal/watchers/height_watcher.go (2)
tools/cosmovisor/internal/watchers/poll_watcher.go (2)
PollWatcher(10-15)NewPollWatcher(17-25)tools/cosmovisor/internal/watchers/watcher.go (1)
ErrorHandler(17-22)
tools/cosmovisor/args.go (1)
x/upgrade/types/plan.go (1)
UpgradeInfoFilename(12-12)
🪛 GitHub Actions: System Tests
tests/systemtests/upgrade_test.go
[error] 1-1: Makefile test target failed with exit code 1 due to system test failure.
Makefile
[error] Make target 'test-system' failed with exit code 2 due to system test failures.
systemtests/system.go
[error] 900-900: Panic error: executable file '/home/runner/work/cosmos-sdk/cosmos-sdk/tests/systemtests/binaries/simd' not found (stat no such file or directory).
tests/systemtests/cosmovisor_test.go
[error] 1-1: Makefile test target failed with exit code 1 due to system test failure.
🪛 LanguageTool
tools/cosmovisor/CHANGELOG.md
[uncategorized] ~49-~49: You might be missing the article “the” here.
Context: ...emove duplicate binary downloads during auto-download process ## v1.7.0 - 2024-11...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...(other cmds with flags) .... * Add --cosmovisor-configflag to provideco...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...tomlpath to the configuration file in root command used by add-upgrade` and...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~178-~178: You might be missing the article “a” here.
Context: ...a/upgrade-info.json` file updates using polling mechanism. * [#9999](https://github.co...
(AI_EN_LECTOR_MISSING_DETERMINER_A)
[uncategorized] ~196-~196: You might be missing the article “the” here.
Context: ...om//pull/10128) Change default value of DAEMON_RESTART_AFTER_UPGRADE...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~201-~201: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... - 2021-08-06 This is the first release and we started this changelog on 2021-07-01...
(COMMA_COMPOUND_SENTENCE)
x/upgrade/CHANGELOG.md
[uncategorized] ~35-~35: Possible missing comma found.
Context: ...e app's JSON codec rather than encoding/json which is the correct behavior for emitt...
(AI_HYDRA_LEO_MISSING_COMMA)
tools/cosmovisor/README.md
[uncategorized] ~55-~55: Possible missing comma found.
Context: ...e. Release branches have the following format release/cosmovisor/vA.B.x, where A an...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~121-~121: Possible missing comma found.
Context: ...m administrator to manually restart it. Note restart is only after the upgrade and...
(AI_HYDRA_LEO_MISSING_COMMA)
[style] ~146-~146: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...osmovisor/$COSMOVISOR_CUSTOM_PREUPGRADE prior to upgrade with the arguments [ upgrade....
(EN_WORDINESS_PREMIUM_PRIOR_TO)
[style] ~147-~147: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ... Executes a custom script (separate and prior to the chain daemon pre-upgrade command)...
(EN_WORDINESS_PREMIUM_PRIOR_TO)
[grammar] ~149-~149: The modal verb ‘will’ requires the verb’s base form.
Context: ...set to true, the upgrade directory will expected to match the upgrade plan name withou...
(MD_BASEFORM)
[uncategorized] ~180-~180: “being dependent on so/sth” requires the preposition “(up)on”.
Context: ...th the --home flag. $DAEMON_HOME is dependent of the data directory and must be set to t...
(DEPENDENT_ADJ)
[style] ~204-~204: This phrase is redundant. Consider writing “points” or “times”.
Context: ...andle switching binaries at the correct points in time so that the system administrator can pr...
(MOMENT_IN_TIME)
[style] ~206-~206: Consider a more concise word here.
Context: ... in advance and relax at upgrade time. In order to support downloadable binaries, a tarbal...
(IN_ORDER_TO_PREMIUM)
[formatting] ~244-~244: Consider inserting a comma after an introductory phrase for better readability.
Context: ...-info.jsonis a valid upgrade request. In this casecosmovisor` tries immediately to mak...
(IN_THAT_CASE_COMMA)
[uncategorized] ~257-~257: Possible missing comma found.
Context: ...commands can be used to schedule manual upgrades which will set the node's `--halt-heigh...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~270-~270: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... condition requiring manual intervention and it will shut down the node. #### Expec...
(COMMA_COMPOUND_SENTENCE)
[style] ~295-~295: Try moving the adverb to make the sentence clearer.
Context: ...isorhas anadd-upgrade` command that allows to easily link a binary to an upgrade. It creates a new folder ...
(SPLIT_INFINITIVE)
[grammar] ~298-~298: Did you mean “specifying”? Or maybe you should add a pronoun? In active voice, ‘allow’ + ‘to’ takes an object, usually a pronoun.
Context: ...sing the --upgrade-height flag allows to specify at which height the binary should be sw...
(ALLOW_TO)
[grammar] ~322-~322: You should probably use: “are”.
Context: ...data/upgrade-info.json` file. The files is constructed by the x/upgrade module and...
(AGREEMENT_SENT_START)
[grammar] ~349-~349: Possibly, a comma is missing after an introductory clause, the verb inflection is not correct, or a question mark is missing. Did you mean “proposal, ensure” or “proposal ensures”?
Context: ...} ``` When submitting this as a proposal ensure there are no spaces. An example command...
(IF_DT_NN_VBZ)
[style] ~366-~366: Consider using a synonym to be more concise.
Context: ...ove format (e.g. if you want to specify lots of binaries, changelog info, etc. witho...
(A_LOT_OF)
[misspelling] ~377-~377: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...ity guarantees, all URLs should include a SHA 256/512 checksum. This ensures that...
(EN_A_VS_AN)
[uncategorized] ~380-~380: Possible missing comma found.
Context: ...king archives into directories (in this case the download link should point to a `zi...
(AI_HYDRA_LEO_MISSING_COMMA)
🪛 markdownlint-cli2 (0.17.2)
tools/cosmovisor/CHANGELOG.md
76-76: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
78-78: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
80-80: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
82-82: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
108-108: Multiple headings with the same content
null
(MD024, no-duplicate-heading)
tools/cosmovisor/README.md
135-135: Spaces inside link text
null
(MD039, no-space-in-links)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: tests (01)
- GitHub Check: tests (03)
- GitHub Check: tests (02)
- GitHub Check: tests (00)
- GitHub Check: test-cosmovisor
- GitHub Check: test-sim-nondeterminism
- GitHub Check: test-integration
- GitHub Check: test-e2e
- GitHub Check: Analyze
- GitHub Check: golangci-lint
- GitHub Check: Gosec
- GitHub Check: Check docs build
- GitHub Check: Summary
🔇 Additional comments (61)
tools/cosmovisor/cmd/cosmovisor/help.go (1)
6-6: LGTM: Import path updated for v2 refactor.The import path change to
cosmossdk.io/tools/cosmovisor/v2is consistent with the major refactor to Cosmovisor v2.tools/cosmovisor/cmd/cosmovisor/main.go (1)
8-8: Context handling preserved in Execute()The default
Execute()on a*cobra.Commandinternally callsExecuteContext(context.Background()), so switching fromExecuteContext(context.Background())toExecute()does not change signal handling or shutdown behavior. No further action required.tools/cosmovisor/cmd/cosmovisor/help_test.go (1)
8-8: LGTM: Import path updated for v2 refactor.The import path change to
cosmossdk.io/tools/cosmovisor/v2is consistent with the v2 refactor and matches the corresponding change in help.go.tools/cosmovisor/cmd/cosmovisor/init.go (2)
14-14: LGTM: Import path updated for v2 refactor.The import path change to
cosmossdk.io/tools/cosmovisor/v2is consistent with the v2 refactor across all files.
24-26: LGTM: Improved documentation clarity.The addition of explicit environment variable requirements (
DAEMON_HOMEandDAEMON_NAME) in the long description improves user experience by clearly stating prerequisites for the init command.tools/cosmovisor/cmd/cosmovisor/version.go (3)
11-11: LGTM: Import path updated for v2 refactor.The import path change to
cosmossdk.io/tools/cosmovisor/v2is consistent with the v2 refactor.
65-65: LGTM: Context parameter added consistently.The addition of
cmd.Context()as the first parameter is consistent with the change on line 50, ensuring proper context propagation in the v2 refactor.
50-50: Confirmed:runInvocation Is CorrectThe
runfunction intools/cosmovisor/cmd/cosmovisor/run.gois defined as:
func run(ctx context.Context, cfgPath string, args []string, options ...RunOption) errorThe call in
version.goat line 50 (run(cmd.Context(), "", append([]string{"version"}, args...))) correctly passescmd.Context()as the first argument. No changes are needed.tests/systemtests/upgrade_test.go (1)
25-28: LGTM! Good refactoring for better encapsulation.Moving the constants from package-level to function-level scope reduces global namespace pollution and improves test isolation without changing functionality.
tools/cosmovisor/cmd/cosmovisor/root.go (2)
22-22: Command change from show-upgrade-info to show-manual-upgrades looks good.The command change aligns with the shift to manual upgrade management. The new command implementation in
tools/cosmovisor/cmd/cosmovisor/show_upgrade.gocorrectly reads manual upgrade data and formats it as JSON.
6-6: Import path v2 consistently applied – approved
- All imports of
cosmossdk.io/tools/cosmovisorhave been updated to…/v2. No occurrences of the old path remain.- Please verify that the CLI command change from
show-upgrade-info(NewShowUpgradeInfoCmd) toshow-manual-upgrades(NewShowManualUpgradesCmd) is reflected in any documentation, CI scripts, integrations, or user guides.tools/cosmovisor/internal/errors.go (1)
13-13: LGTM! Proper interface compliance check.The compile-time interface compliance check ensures the error type correctly implements the error interface.
tools/cosmovisor/args_test.go (1)
475-475: LGTM! RPC address configuration supports new upgrade polling features.The addition of
RPCAddresswith a sensible default value aligns with the enhanced manual upgrade polling capabilities. The defaulthttp://localhost:26657is appropriate for testing environments.tools/cosmovisor/cmd/cosmovisor/init_test.go (1)
19-19: LGTM! Consistent v2 import path update.The import path update is consistent with the v2 migration seen across the codebase. This ensures the test suite uses the correct v2 module.
tools/cosmovisor/cmd/cosmovisor/run_config.go (1)
7-7: Excellent refactoring approach using type aliases.The use of a type alias to
internal.RunConfigis a clean way to centralize the struct definition while maintaining backward compatibility. This approach allows the internal package to own the core types while keeping the public API unchanged.Also applies to: 17-17
tools/cosmovisor/CHANGELOG.md (1)
39-41: Changelog appropriately documents the breaking changes.The consolidation under "Breaking Changes" accurately reflects the scope of this major refactoring. The reference to the core logic reimplementation provides clear context for users.
x/upgrade/CHANGELOG.md (1)
32-35: Changelog entries accurately document the manual upgrade support and breaking changes.The addition of manual upgrade methods and the change to use the app's JSON codec for upgrade plans are well-documented. The breaking change note about height field serialization is particularly important for users to understand the impact.
tools/cosmovisor/cmd/cosmovisor/prepare_upgrade.go (2)
37-40: Clean refactoring that simplifies configuration handling.The replacement of manual config retrieval with the
getConfigFromCmdhelper function eliminates boilerplate code and centralizes configuration handling. This improves maintainability and consistency across commands.
64-64: Correct usage of internal package function.The update to use
internal.GetBinaryURLaligns with the new package structure and maintains the same functionality as shown in the relevant code snippets.tools/cosmovisor/internal/upgrade_test.go (2)
3-3: Correct package migration to internal structure.The migration from
cosmovisor_testtointernalpackage aligns with the new architecture and follows Go best practices for testing internal packages. The import path update to v2 is also correct.Also applies to: 19-19
145-145: Function calls correctly updated for same-package context.The removal of package prefixes from
UpgradeBinaryandOSArchcalls is correct since the tests now reside in the sameinternalpackage as these functions.Also applies to: 160-160, 232-232, 235-235, 254-254
tools/cosmovisor/cmd/cosmovisor/config.go (3)
4-4: LGTM: Clean import updates for v2 refactor.The addition of the
fmtimport and update to the v2 cosmovisor package align well with the broader refactoring effort mentioned in the PR objectives.Also applies to: 8-8
18-18: LGTM: Good refactoring to extract common config loading logic.Replacing the inline config loading with the helper function improves code reusability and maintainability.
28-40: LGTM: Well-implemented helper function with proper error handling.The
getConfigFromCmdfunction properly extracts the config flag, handles errors with descriptive messages, and delegates to the appropriate config loading function.tools/cosmovisor/internal/watchers/file_poll_watcher_test.go (1)
14-64: LGTM: Comprehensive test coverage for FilePollWatcher.The test properly covers the complete lifecycle of the file poll watcher:
- Creates temporary test environment
- Uses goroutines to simulate asynchronous file updates
- Verifies content detection and channel closure
- Includes proper cleanup with context cancellation
The test structure and use of channels/select statements effectively validates the watcher's behavior.
simapp/upgrades.go (1)
36-38: LGTM: Good logging of upgrade info.Adding logging for upgrade info read from disk provides useful debugging information.
tools/cosmovisor/internal/testing.go (1)
1-19: LGTM: Clean testing utility implementation.The context-based callback mechanism follows Go best practices:
- Uses unexported key type to prevent context key collisions
- Proper type assertion with safety check
- Simple, focused API for test hooks
- Safe fallback behavior when callback is not present
This provides a clean way to inject test callbacks into the execution flow.
tools/cosmovisor/internal/watchers/file_poll_watcher.go (3)
10-15: LGTM: Proper initialization of file modification tracking.The initial stat check and mod time tracking setup is handled correctly, including proper error handling when the file doesn't exist initially.
22-33: LGTM: Sound file change detection logic.The file change detection properly:
- Checks file size to avoid reading empty files
- Compares modification times for change detection
- Updates tracking state after successful reads
- Handles read errors appropriately
The logic correctly identifies when file content has actually changed.
36-38: LGTM: Proper watcher initialization and startup.The generic poll watcher creation and startup sequence is handled correctly with proper context propagation.
x/upgrade/abci.go (1)
65-86: LGTM! Well-structured manual upgrade implementation.The manual upgrade logic is correctly positioned to execute only when no scheduled upgrade exists. The implementation properly:
- Executes the upgrade with infinite gas meter
- Clears the manual upgrade plan after execution
- Returns the appropriate consensus parameter change response
tools/cosmovisor/internal/upgrade.go (1)
1-25: LGTM! Clean refactoring to internal package.The migration to the internal package and added logging statements improve the code structure and observability. The clarified comment better explains the function's purpose.
tools/cosmovisor/cmd/cosmovisor/run.go (1)
67-68: LGTM! Clean integration with the new Runner.The replacement of Launcher with the internal Runner simplifies the code and aligns well with the overall refactoring.
tools/cosmovisor/internal/watchers/data_watcher_test.go (1)
20-81: Well-structured test with good coverage.The test comprehensively covers the DataWatcher functionality including error cases (invalid JSON) and proper cleanup on context cancellation. The channel reading pattern in the anonymous function is a good practice.
tools/cosmovisor/cmd/cosmovisor/show_upgrade.go (1)
10-36: LGTM! Clean refactoring to support structured manual upgrades.The command now properly integrates with the new manual upgrade management system. The use of
getConfigFromCmdcentralizes config loading, and the structured data handling with indented JSON output improves usability.tools/cosmovisor/internal/watchers/data_watcher.go (1)
38-41: Clean interface implementation.The
Updated()method correctly implements theWatcher[T]interface with proper channel exposure.tools/cosmovisor/scanner_test.go (2)
16-76: Test updates align well with the refactoring.The changes from value types to pointer types for upgrade plans are consistent with the broader protobuf-based refactoring. The updated error expectations and messages appear more accurate.
95-104: Good addition of helper function.The
parseUpgradeInfoFilehelper function improves test maintainability and provides a clean interface for testing upgrade info parsing with different configurations.tools/cosmovisor/internal/watchers/fsnotify_watcher.go (1)
19-80: Well-structured file watcher implementation.The FSNotifyWatcher provides a robust file monitoring solution with proper error handling, resource management, and integration with the generic watcher interface.
tools/cosmovisor/internal/process.go (1)
38-68: Robust graceful shutdown implementation.The shutdown logic properly handles:
- Early exit if process already finished
- Graceful termination with SIGTERM
- Fallback to force kill after timeout
- Proper synchronization via the done channel
tools/cosmovisor/internal/backoff.go (1)
30-63: Comprehensive command change detection and backoff management.The
BeforeRunmethod implements thorough logic for:
- Detecting changes in command or arguments to reset backoff
- Enforcing maximum restart limits with permanent errors
- Applying exponential backoff with proper logging
The slice comparison logic correctly handles length differences and element-wise comparison.
tools/cosmovisor/internal/watchers/poll_watcher.go (2)
10-25: Well-designed generic polling watcher.The
PollWatcher[T]provides a clean generic abstraction for periodic polling with configurable checker functions and error handling.
44-48: Appropriate use of reflection for generic zero value detection.The use of
reflect.DeepEqualto check against the zero value of typeTis necessary for supporting generic types including[]byte. This enables the watcher to filter out empty/zero responses while maintaining type safety.tools/cosmovisor/cmd/cosmovisor/add_upgrade.go (2)
31-74: Improved separation of concerns in upgrade handling.The refactored
addUpgradefunction now focuses on binary management and plan creation, delegating persistence to centralized configuration methods. The conditional plan creation based onupgradeHeight > 0is appropriate.
95-103: Clean integration with centralized upgrade management.The command handler now properly delegates upgrade plan persistence to
cfg.AddManualUpgrades, improving modularity and consistency across the codebase.tools/cosmovisor/cmd/cosmovisor/batch_upgrade.go (1)
73-97: Efficient batch upgrade processing with centralized management.The refactored logic correctly:
- Collects upgrade plans in memory before batch persistence
- Uses the same
addUpgradelogic for consistency- Delegates final persistence to
cfg.AddManualUpgradestools/cosmovisor/internal/watchers/http_block_checker.go (1)
14-19: LGTM: Constructor follows good practices.The constructor properly initializes the struct and returns the interface type, which is a good practice for dependency injection.
tools/cosmovisor/internal/watchers/watcher.go (3)
10-14: LGTM: Clean generic interface design.The generic
Watcher[T any]interface is well-designed with a clear contract and proper use of Go generics.
44-52: LGTM: Well-structured factory function.The
InitFileWatcherfunction properly handles both hybrid and polling-only scenarios with clean conditional logic and consistent error handling.
28-34: Behavior is intentional and appropriateThe doc comment on
DebugLoggerErrorHandlerin
tools/cosmovisor/internal/watchers/watcher.goexplicitly states that errors are logged as warnings and warnings as debug. This handler is only used by file/HTTP watchers to avoid treating every read or parse error as fatal. No changes needed.tools/cosmovisor/internal/process_test.go (1)
1-501: LGTM! Well-structured test refactoring.The introduction of
launchProcessFixtureandsetupTestLaunchProcessFixtureeffectively consolidates test setup logic and reduces code duplication across test cases. The transition fromlauncher.Runtorunner.Startaligns with the new internal runner architecture.tests/systemtests/cosmovisor_test.go (1)
172-180: Well-implemented symlink verification helper.The helper function correctly verifies that Cosmovisor's current symlink points to the expected upgrade directory across all nodes.
tools/cosmovisor/internal/upgrader.go (1)
23-70: Well-structured upgrade detection logic.The function correctly checks for both governance and manual upgrades, with proper error handling for edge cases like missed halt heights.
systemtests/system.go (1)
642-673: Well-implemented Cosmovisor integration in node startup.The modified
startNodesAsyncmethod cleanly separates Cosmovisor and regular execution paths with proper environment setup.tools/cosmovisor/internal/runner.go (1)
34-94: Well-structured main control loop with robust error handling.The Start method implements a clean control loop with proper upgrade detection, retry management, and error handling for different restart scenarios.
tools/cosmovisor/args.go (6)
4-4: Import additions look good.The addition of
bytesandjsonpbimports properly supports the new protobuf-based upgrade info parsing functionality.Also applies to: 15-15
39-39: RPC configuration and restart retry limit are well-implemented.The new
RPCAddressfield with environment variable support enables the manual upgrade polling mechanism, and theMaxRestartRetriesfield provides important protection against infinite restart loops.Also applies to: 70-70, 77-81, 310-314
115-123: Good refactoring of upgrade info path handling.Centralizing the upgrade info directory path through
UpgradeInfoDir()improves maintainability and ensures consistent path construction across the codebase.
437-442: Protobuf JSON marshaling is a good improvement.Using
jsonpb.Marshalerfor upgrade plan serialization provides better schema compatibility and evolution compared to standard JSON marshaling.
445-457: Well-structured method for reading pending upgrade info.The method properly uses the centralized file path and delegates parsing to the dedicated
ParseUpgradeInfomethod, following good separation of concerns.
641-641: Explicit error ignoring is appropriate here.The use of
_, _ = fmt.Fprintfand the deferred close with error ignoring are acceptable patterns for these non-critical operations.Also applies to: 652-652, 680-682
| func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool { | ||
| skipUpgradeHeights := UpgradeSkipHeights(args) | ||
| for _, h := range skipUpgradeHeights { | ||
| if h == int(upgradeInfo.Height) { | ||
| return true | ||
| } | ||
| } | ||
| return false | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix potential integer overflow and improve type safety.
The conversion from int64 (upgrade plan height) to int (parsed heights) could cause issues on 32-bit systems or with very large block heights.
Apply this diff to improve type safety:
-func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool {
+func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool {
skipUpgradeHeights := UpgradeSkipHeights(args)
for _, h := range skipUpgradeHeights {
- if h == int(upgradeInfo.Height) {
+ if int64(h) == upgradeInfo.Height {
return true
}
}
return false
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool { | |
| skipUpgradeHeights := UpgradeSkipHeights(args) | |
| for _, h := range skipUpgradeHeights { | |
| if h == int(upgradeInfo.Height) { | |
| return true | |
| } | |
| } | |
| return false | |
| } | |
| func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool { | |
| skipUpgradeHeights := UpgradeSkipHeights(args) | |
| for _, h := range skipUpgradeHeights { | |
| if int64(h) == upgradeInfo.Height { | |
| return true | |
| } | |
| } | |
| return false | |
| } |
🤖 Prompt for AI Agents
In tools/cosmovisor/internal/skip.go around lines 16 to 24, the function
converts upgradeInfo.Height from int64 to int, which risks integer overflow on
32-bit systems or with large heights. To fix this, ensure both
skipUpgradeHeights and upgradeInfo.Height use the same int64 type for
comparison, avoiding unsafe conversions and improving type safety.
| func UpgradeSkipHeights(args []string) []int { | ||
| var heights []int | ||
| for i, arg := range args { | ||
| if arg == fmt.Sprintf("--%s", cosmovisor.FlagSkipUpgradeHeight) { | ||
| j := i + 1 | ||
|
|
||
| for j < len(args) { | ||
| tArg := args[j] | ||
| if strings.HasPrefix(tArg, "-") { | ||
| break | ||
| } | ||
| h, err := strconv.Atoi(tArg) | ||
| if err == nil { | ||
| heights = append(heights, h) | ||
| } | ||
| j++ | ||
| } | ||
|
|
||
| break | ||
| } | ||
| } | ||
| return heights | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add validation for parsed heights.
The function should validate that parsed heights are positive integers to prevent invalid skip heights.
Apply this diff to add validation:
h, err := strconv.Atoi(tArg)
- if err == nil {
+ if err == nil && h > 0 {
heights = append(heights, h)
+ } else if err == nil && h <= 0 {
+ // Log warning about invalid height but continue parsing
}🤖 Prompt for AI Agents
In tools/cosmovisor/internal/skip.go lines 28 to 50, the UpgradeSkipHeights
function parses heights but does not validate that they are positive integers.
Modify the code to check that each parsed height is greater than zero before
appending it to the heights slice, ensuring only valid positive skip heights are
included.
| # build-system-test-current builds the binaries necessary for running system tests, but only those on the current branch | ||
| # this is useful if you are iterating on tests which rely on changes to the current branch only (which is most common in development) | ||
| build-system-test-current: build cosmovisor | ||
| mkdir -p ./tests/systemtests/binaries/ | ||
| cp $(BUILDDIR)/simd ./tests/systemtests/binaries/ | ||
| cp tools/cosmovisor/cosmovisor ./tests/systemtests/binaries/ | ||
|
|
||
| # build-system-test builds the binaries necessary for runnings system tests and places them in the correct locations | ||
| build-system-test: build-v53 | ||
| mkdir -p ./tests/systemtests/binaries/v0.53 | ||
| mv $(BUILDDIR)/simdv53 ./tests/systemtests/binaries/v0.53/simd | ||
|
|
||
| test-system: build-system-test | ||
| $(MAKE) -C tests/systemtests test | ||
| .PHONY: test-system | ||
| .PHONY: build-system-test-current build-system-test test-system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Build targets look good, but investigate system test failures.
The new build targets are well-structured and properly handle binary preparation for system tests. However, there's a pipeline failure mentioned for system tests.
Run the following script to investigate the system test failures:
🏁 Script executed:
#!/bin/bash
# Description: Check system test setup and identify potential issues
# Check if the system test directory structure exists
fd -t d systemtests
# Check if the test files exist and are executable
fd -t f -e go . tests/systemtests/
# Verify the binaries directory structure after build
echo "Checking expected binary locations..."
ls -la tests/systemtests/binaries/ 2>/dev/null || echo "Binaries directory not found"Length of output: 493
Fix build-system-test to include current branch binaries
The test-system target depends on build-system-test, which only invokes build-v53 and never creates the root tests/systemtests/binaries/ directory or copies the current branch simd and cosmovisor binaries. As your verification showed, the binaries/ folder is missing, causing system tests to fail.
· Makefile (lines 508–522): update build-system-test dependencies
· Ensure both current and v0.53 binaries are prepared before running tests
Proposed diff:
- build-system-test: build-v53
+ build-system-test: build-system-test-current build-v53This change makes build-system-test first run build-system-test-current (creating binaries/ and copying simd & cosmovisor), then produce the v0.53 binary, so all tests have the expected binaries available.
🧰 Tools
🪛 GitHub Actions: System Tests
[error] Make target 'test-system' failed with exit code 2 due to system test failures.
🤖 Prompt for AI Agents
In the Makefile around lines 508 to 522, the build-system-test target currently
only runs build-v53 and does not prepare the binaries directory or copy the
current branch binaries, causing system tests to fail. To fix this, modify
build-system-test to first depend on or invoke build-system-test-current so that
the binaries directory is created and current branch binaries (simd and
cosmovisor) are copied, then proceed with building v0.53 binaries. This ensures
all necessary binaries are available before running tests.
| func (j *httpRPCBlockChecker) GetLatestBlockHeight() (uint64, error) { | ||
| if j.subUrl != "" { | ||
| return j.getLatestBlockHeight(j.subUrl) | ||
| } | ||
|
|
||
| height, err1 := j.getLatestBlockHeight("/v1/block") | ||
| if err1 == nil { | ||
| j.logger.Info("Successfully resolved latest block height from /v1/block", "url", j.baseUrl+"/v1/block") | ||
| // If we successfully got the height from /v1/block, we can cache the subUrl | ||
| j.subUrl = "/v1/block" | ||
| return height, nil | ||
| } | ||
|
|
||
| height, err2 := j.getLatestBlockHeight("/block") | ||
| if err2 == nil { | ||
| j.logger.Info("Successfully resolved latest block height from /block", "url", j.baseUrl+"/block") | ||
| // If we successfully got the height from /block, we can cache the subUrl | ||
| j.subUrl = "/block" | ||
| return height, nil | ||
| } | ||
|
|
||
| return 0, fmt.Errorf("failed to get latest block height from both /block and /v1/block RPC endpoints: %w", errors.Join(err1, err2)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Verify URL validation and consider adding timeout configuration.
The caching strategy is good for performance, but consider these improvements:
- The
baseUrlis not validated - malformed URLs could cause runtime errors - HTTP requests lack timeout configuration, which could cause hanging requests
Consider adding URL validation in the constructor:
func NewHTTPRPCBLockChecker(baseUrl string, logger log.Logger) HeightChecker {
+ if _, err := url.Parse(baseUrl); err != nil {
+ logger.Error("Invalid base URL provided", "url", baseUrl, "error", err)
+ }
return &httpRPCBlockChecker{
baseUrl: baseUrl,
logger: logger,
}
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In tools/cosmovisor/internal/watchers/http_block_checker.go lines 27 to 49, the
code lacks validation for the baseUrl and does not set timeouts for HTTP
requests, risking runtime errors and hanging calls. Add URL validation in the
constructor or initialization method to ensure baseUrl is well-formed before
use. Also, configure HTTP client timeouts for all requests made in
getLatestBlockHeight to prevent indefinite blocking.
| ymd := fmt.Sprintf("%d-%d-%d", st.Year(), st.Month(), st.Day()) | ||
| dst := filepath.Join(u.cfg.DataBackupPath, fmt.Sprintf("data"+"-backup-%s", ymd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Backup directory naming could cause collisions.
Using only the date (YYYY-MM-DD) for backup directories could cause collisions if multiple backups occur on the same day. Consider adding timestamp or sequence number.
// a destination directory, Format YYYY-MM-DD
st := time.Now()
-ymd := fmt.Sprintf("%d-%d-%d", st.Year(), st.Month(), st.Day())
-dst := filepath.Join(u.cfg.DataBackupPath, fmt.Sprintf("data"+"-backup-%s", ymd))
+timestamp := st.Format("2006-01-02-150405") // YYYY-MM-DD-HHMMSS
+dst := filepath.Join(u.cfg.DataBackupPath, fmt.Sprintf("data-backup-%s", timestamp))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ymd := fmt.Sprintf("%d-%d-%d", st.Year(), st.Month(), st.Day()) | |
| dst := filepath.Join(u.cfg.DataBackupPath, fmt.Sprintf("data"+"-backup-%s", ymd)) | |
| // a destination directory, Format YYYY-MM-DD | |
| st := time.Now() | |
| timestamp := st.Format("2006-01-02-150405") // YYYY-MM-DD-HHMMSS | |
| dst := filepath.Join(u.cfg.DataBackupPath, fmt.Sprintf("data-backup-%s", timestamp)) |
🤖 Prompt for AI Agents
In tools/cosmovisor/internal/upgrader.go around lines 224 to 225, the backup
directory name uses only the date (YYYY-MM-DD), which can cause collisions if
multiple backups happen on the same day. Modify the naming scheme to include a
timestamp with hours, minutes, and seconds or a unique sequence number to ensure
each backup directory name is unique and avoid overwriting previous backups.
| info, err := os.Stat(preupgradeFile) | ||
| if err != nil { | ||
| u.logger.Error("COSMOVISOR_CUSTOM_PREUPGRADE file missing", "file", preupgradeFile) | ||
| return err | ||
| } | ||
| if !info.Mode().IsRegular() { | ||
| _, f := filepath.Split(preupgradeFile) | ||
| return fmt.Errorf("COSMOVISOR_CUSTOM_PREUPGRADE: %s is not a regular file", f) | ||
| } | ||
|
|
||
| // Set the execute bit for only the current user | ||
| // Given: Current user - Group - Everyone | ||
| // 0o RWX - RWX - RWX | ||
| oldMode := info.Mode().Perm() | ||
| newMode := oldMode | 0o100 | ||
| if oldMode != newMode { | ||
| if err := os.Chmod(preupgradeFile, newMode); err != nil { | ||
| u.logger.Info("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | ||
| return errors.New("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TOCTOU race condition in file permission check.
There's a time-of-check to time-of-use race condition where the file could be modified between the Stat call and execution. Consider using atomic operations or accepting the race condition with documentation.
info, err := os.Stat(preupgradeFile)
if err != nil {
u.logger.Error("COSMOVISOR_CUSTOM_PREUPGRADE file missing", "file", preupgradeFile)
return err
}
if !info.Mode().IsRegular() {
_, f := filepath.Split(preupgradeFile)
return fmt.Errorf("COSMOVISOR_CUSTOM_PREUPGRADE: %s is not a regular file", f)
}
+
+// Note: There's an inherent TOCTOU race here, but it's acceptable for this use case
+// as the custom pre-upgrade script is expected to be under the operator's control📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| info, err := os.Stat(preupgradeFile) | |
| if err != nil { | |
| u.logger.Error("COSMOVISOR_CUSTOM_PREUPGRADE file missing", "file", preupgradeFile) | |
| return err | |
| } | |
| if !info.Mode().IsRegular() { | |
| _, f := filepath.Split(preupgradeFile) | |
| return fmt.Errorf("COSMOVISOR_CUSTOM_PREUPGRADE: %s is not a regular file", f) | |
| } | |
| // Set the execute bit for only the current user | |
| // Given: Current user - Group - Everyone | |
| // 0o RWX - RWX - RWX | |
| oldMode := info.Mode().Perm() | |
| newMode := oldMode | 0o100 | |
| if oldMode != newMode { | |
| if err := os.Chmod(preupgradeFile, newMode); err != nil { | |
| u.logger.Info("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | |
| return errors.New("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | |
| } | |
| } | |
| info, err := os.Stat(preupgradeFile) | |
| if err != nil { | |
| u.logger.Error("COSMOVISOR_CUSTOM_PREUPGRADE file missing", "file", preupgradeFile) | |
| return err | |
| } | |
| if !info.Mode().IsRegular() { | |
| _, f := filepath.Split(preupgradeFile) | |
| return fmt.Errorf("COSMOVISOR_CUSTOM_PREUPGRADE: %s is not a regular file", f) | |
| } | |
| // Note: There's an inherent TOCTOU race here, but it's acceptable for this use case | |
| // as the custom pre-upgrade script is expected to be under the operator's control | |
| // Set the execute bit for only the current user | |
| // Given: Current user - Group - Everyone | |
| // 0o RWX - RWX - RWX | |
| oldMode := info.Mode().Perm() | |
| newMode := oldMode | 0o100 | |
| if oldMode != newMode { | |
| if err := os.Chmod(preupgradeFile, newMode); err != nil { | |
| u.logger.Info("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | |
| return errors.New("COSMOVISOR_CUSTOM_PREUPGRADE could not add execute permission") | |
| } | |
| } |
🤖 Prompt for AI Agents
In tools/cosmovisor/internal/upgrader.go around lines 131 to 151, there is a
TOCTOU race condition because the file's mode is checked with os.Stat and then
changed with os.Chmod, allowing the file to be modified between these calls. To
fix this, avoid separate stat and chmod calls by attempting to set the execute
bit directly with os.Chmod and handle errors accordingly, or document and accept
the race condition explicitly if atomic operations are not feasible.
| time.Sleep(pollInterval * 3) // wait a bit | ||
| rootCmd := NewRootCmd() | ||
| rootCmd.SetArgs([]string{ | ||
| "add-upgrade", | ||
| "manual20", | ||
| filepath.Join(mockchainDir, "manual-upgrades", "manual20"), | ||
| "--upgrade-height", | ||
| "20", | ||
| "--cosmovisor-config", | ||
| cfgFile, | ||
| }) | ||
| rootCmd.SetOut(os.Stdout) | ||
| rootCmd.SetErr(os.Stderr) | ||
| require.NoError(t, rootCmd.Execute()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add validation after executing add-upgrade command.
The test doesn't verify that the add-upgrade command succeeded, which could lead to silent test failures.
rootCmd.SetOut(os.Stdout)
rootCmd.SetErr(os.Stderr)
require.NoError(t, rootCmd.Execute())
+
+// Verify the upgrade was added
+upgradePath := filepath.Join(mockchainDir, "cosmovisor", "upgrades", "manual20")
+_, err := os.Stat(upgradePath)
+require.NoError(t, err, "manual upgrade should exist after add-upgrade")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| time.Sleep(pollInterval * 3) // wait a bit | |
| rootCmd := NewRootCmd() | |
| rootCmd.SetArgs([]string{ | |
| "add-upgrade", | |
| "manual20", | |
| filepath.Join(mockchainDir, "manual-upgrades", "manual20"), | |
| "--upgrade-height", | |
| "20", | |
| "--cosmovisor-config", | |
| cfgFile, | |
| }) | |
| rootCmd.SetOut(os.Stdout) | |
| rootCmd.SetErr(os.Stderr) | |
| require.NoError(t, rootCmd.Execute()) | |
| } | |
| time.Sleep(pollInterval * 3) // wait a bit | |
| rootCmd := NewRootCmd() | |
| rootCmd.SetArgs([]string{ | |
| "add-upgrade", | |
| "manual20", | |
| filepath.Join(mockchainDir, "manual-upgrades", "manual20"), | |
| "--upgrade-height", | |
| "20", | |
| "--cosmovisor-config", | |
| cfgFile, | |
| }) | |
| rootCmd.SetOut(os.Stdout) | |
| rootCmd.SetErr(os.Stderr) | |
| require.NoError(t, rootCmd.Execute()) | |
| // Verify the upgrade was added | |
| upgradePath := filepath.Join(mockchainDir, "cosmovisor", "upgrades", "manual20") | |
| _, err := os.Stat(upgradePath) | |
| require.NoError(t, err, "manual upgrade should exist after add-upgrade") | |
| } |
🤖 Prompt for AI Agents
In tools/cosmovisor/cmd/cosmovisor/mockchain_test.go around lines 104 to 118,
after executing the add-upgrade command with rootCmd.Execute(), add assertions
to validate the command succeeded as expected. This can include checking the
command output, verifying the upgrade was added to the configuration, or any
relevant state changes to ensure the test does not silently pass without
confirming the command's effect.
| // ParseUpgradeInfo parses the upgrade info from the given byte slice. | ||
| func (cfg *Config) ParseUpgradeInfo(bz []byte) (*upgradetypes.Plan, error) { | ||
| var upgradePlan upgradetypes.Plan | ||
| if err := jsonpb.Unmarshal(bytes.NewReader(bz), &upgradePlan); err != nil { | ||
| return nil, fmt.Errorf("error unmarshalling upgrade info: %w", err) | ||
| } | ||
| if err := upgradePlan.ValidateBasic(); err != nil { | ||
| return nil, fmt.Errorf("upgrade info failed validation upgrade inof: %w", err) | ||
| } | ||
| if !cfg.DisableRecase { | ||
| upgradePlan.Name = strings.ToLower(upgradePlan.Name) | ||
| } | ||
| return &upgradePlan, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix typo in error message.
There's a typo in the error message: "upgrade inof" should be "upgrade info".
Apply this fix:
- return nil, fmt.Errorf("upgrade info failed validation upgrade inof: %w", err)
+ return nil, fmt.Errorf("upgrade info failed validation: %w", err)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // ParseUpgradeInfo parses the upgrade info from the given byte slice. | |
| func (cfg *Config) ParseUpgradeInfo(bz []byte) (*upgradetypes.Plan, error) { | |
| var upgradePlan upgradetypes.Plan | |
| if err := jsonpb.Unmarshal(bytes.NewReader(bz), &upgradePlan); err != nil { | |
| return nil, fmt.Errorf("error unmarshalling upgrade info: %w", err) | |
| } | |
| if err := upgradePlan.ValidateBasic(); err != nil { | |
| return nil, fmt.Errorf("upgrade info failed validation upgrade inof: %w", err) | |
| } | |
| if !cfg.DisableRecase { | |
| upgradePlan.Name = strings.ToLower(upgradePlan.Name) | |
| } | |
| return &upgradePlan, nil | |
| } | |
| func (cfg *Config) ParseUpgradeInfo(bz []byte) (*upgradetypes.Plan, error) { | |
| var upgradePlan upgradetypes.Plan | |
| if err := jsonpb.Unmarshal(bytes.NewReader(bz), &upgradePlan); err != nil { | |
| return nil, fmt.Errorf("error unmarshalling upgrade info: %w", err) | |
| } | |
| if err := upgradePlan.ValidateBasic(); err != nil { | |
| return nil, fmt.Errorf("upgrade info failed validation: %w", err) | |
| } | |
| if !cfg.DisableRecase { | |
| upgradePlan.Name = strings.ToLower(upgradePlan.Name) | |
| } | |
| return &upgradePlan, nil | |
| } |
🤖 Prompt for AI Agents
In tools/cosmovisor/args.go around lines 484 to 497, fix the typo in the error
message returned by ValidateBasic by changing "upgrade inof" to "upgrade info"
to ensure the error message is clear and correct.
| return cfg.currentUpgrade, fmt.Errorf("failed to read %q: %w", filename, err) | ||
| func (cfg Config) WriteLastKnownHeight(height uint64) error { | ||
| filename := filepath.Join(cfg.UpgradeInfoDir(), LastKnownHeightFile) | ||
| return os.WriteFile(filename, []byte(strconv.FormatUint(height, 10)), 0o644) |
Check failure
Code scanning / gosec
Expect WriteFile permissions to be 0600 or less Error
| return err | ||
| } | ||
|
|
||
| return os.WriteFile(cfg.UpgradeInfoBatchFilePath(), manualUpgradesData, 0o644) |
Check failure
Code scanning / gosec
Expect WriteFile permissions to be 0600 or less Error
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…nto aaronc/22731-cosmovisor-fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some initial questions/comments. Still need to go through the main run loops/upgrade logic.
| Height: height, | ||
| Info: p.Info, | ||
| } | ||
| info, err := json.Marshal(upgradeInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to cause a problem during the initial upgrade? Not sure of the specifics, but I'm concerned about the existing version dumping a something to disk using one codec, and then after the upgrade we attempt to unmarshal using a different codec and it fails to parse. Let me know if this isn't an issue though.
| if err != nil { | ||
| return "" | ||
| } | ||
| cfg.currentUpgrade = u | ||
| return cfg.currentUpgrade, nil | ||
| if upgradeInfo == nil { | ||
| return "" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err != nil || upgradeInfo == nil {
return ""
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like anything in here really needs to be a member of Config since all of the calls to the object are just function calls and any state is being loaded and stored from disk and not really in memory on our struct as far as I can tell.
| // ErrorHandler is an interface for handling errors and warnings in watchers. | ||
| type ErrorHandler interface { | ||
| // Error handles an error as an error. | ||
| Error(msg string, err error) | ||
| // Warn handles an error as a warning. | ||
| Warn(msg string, err error) | ||
| } | ||
|
|
||
| type debugLoggerErrorHandler struct { | ||
| logger log.Logger | ||
| } | ||
|
|
||
| func (h *debugLoggerErrorHandler) Error(msg string, err error) { | ||
| h.logger.Warn(msg, "error", err) | ||
| } | ||
|
|
||
| func (h *debugLoggerErrorHandler) Warn(msg string, err error) { | ||
| h.logger.Debug(msg, "error", err) | ||
| } | ||
|
|
||
| // DebugLoggerErrorHandler returns an ErrorHandler that logs errors and warnings using the provided logger, | ||
| // but downgrades errors to warnings and warnings to debug logs. | ||
| func DebugLoggerErrorHandler(logger log.Logger) ErrorHandler { | ||
| return &debugLoggerErrorHandler{logger: logger} | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there another implementation of this interface somewhere? I'm struggling to understand why this is useful instead of just using a logger directly.
| if err != nil { | ||
| if !os.IsNotExist(err) { | ||
| w.errorHandler.Error("failed to check for updates", err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if err != nil { | |
| if !os.IsNotExist(err) { | |
| w.errorHandler.Error("failed to check for updates", err) | |
| } | |
| if err != nil && !os.IsNotExist(err) { | |
| w.errorHandler.Error("failed to check for updates", err) |
| // to make PollWatcher generic on any type T (including []byte), we use reflect.DeepEqual and the default zero value of T | ||
| var zero T | ||
| if !reflect.DeepEqual(x, zero) { | ||
| w.outChan <- x | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the conditions under which we would receive zero valued notifications?
| func (h HybridWatcher) Errors() <-chan error { | ||
| return h.errChan | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems correct--the error channel is not used in this implementation.
| height, err1 := j.getLatestBlockHeight("/v1/block") | ||
| if err1 == nil { | ||
| j.logger.Info("Successfully resolved latest block height from /v1/block", "url", j.baseUrl+"/v1/block") | ||
| // If we successfully got the height from /v1/block, we can cache the subUrl | ||
| j.subUrl = "/v1/block" | ||
| return height, nil | ||
| } | ||
|
|
||
| height, err2 := j.getLatestBlockHeight("/block") | ||
| if err2 == nil { | ||
| j.logger.Info("Successfully resolved latest block height from /block", "url", j.baseUrl+"/block") | ||
| // If we successfully got the height from /block, we can cache the subUrl | ||
| j.subUrl = "/block" | ||
| return height, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to just specify a var which contains all subUrls to check and iterate through them since the logic for each is the same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you run a hybrid watcher, do you have data on which watcher ends up propagating the updates? Is it always one or the other, or do both of them sometimes get the updates first?
My question is mainly--why is a single watcher implementation not good enough?
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days-before-close if no further activity occurs. |
Description
Closes: #22731
This PR is essentially a rewrite of the core watcher/process management of Cosmovisor and will result in a Cosmovisor v2 release.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor
Chores
Tests