Skip to content

fix(router): recover from panics in hooks (Cosmo Streams)#2311

Merged
dkorittki merged 4 commits intotopic/streams-v1from
dominik/eng-8359-analyze-and-fix-panic-behavior-in-hooks
Oct 30, 2025
Merged

fix(router): recover from panics in hooks (Cosmo Streams)#2311
dkorittki merged 4 commits intotopic/streams-v1from
dominik/eng-8359-analyze-and-fix-panic-behavior-in-hooks

Conversation

@dkorittki
Copy link
Copy Markdown
Contributor

@dkorittki dkorittki commented Oct 30, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Added panic recovery in event processing hooks to prevent service crashes.
    • Improved error handling with detailed logging including stack traces when hooks fail.
    • Enhanced subscription lifecycle management to gracefully recover from panic scenarios.
  • Tests

    • Added comprehensive test coverage for panic recovery in event processing workflows.

Checklist

Add panic recoveries to Cosmo Streams hooks. Now when inside a hook a panic happens the error message gets logged with [Recovery from handler panic], similar to other places in the router. Also if we deal with subscription hooks, the subscription gets closed normally. On the OnPublishEvent hook we return a success:false message back to the client the same way we deal with normal errors being returned by that hook.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Oct 30, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds comprehensive panic recovery and error handling across pubsub datasource operations. Converts applyPublishEventHooks to a PubSubProvider method with panic recovery, adds panic recovery in subscription startup and event updater goroutines, and includes corresponding test coverage for panic scenarios.

Changes

Cohort / File(s) Summary
Publish event hooks refactoring
router/pkg/pubsub/datasource/pubsubprovider.go, router/pkg/pubsub/datasource/pubsubprovider_test.go
Converted applyPublishEventHooks from standalone function to PubSubProvider method with panic recovery via recover() block that logs stacktraces and converts panics to errors. Enhanced error handling with detailed context logging. Updated tests to use provider-based method calls and added new panic recovery test.
Subscription startup panic recovery
router/pkg/pubsub/datasource/subscription_datasource.go, router/pkg/pubsub/datasource/subscription_datasource_test.go
Added deferred panic recovery in SubscriptionOnStart that captures panics, logs stacktraces, and converts recovered values to errors. Added imports for fmt and zapcore. New test verifies panic handling across multiple panic types.
Subscription event updater panic recovery
router/pkg/pubsub/datasource/subscription_event_updater.go, router/pkg/pubsub/datasource/subscription_event_updater_test.go
Added recoverPanic method for panic recovery in update goroutine, logging with stacktraces and closing subscription with DownstreamServiceError. Adjusted defer execution order to recover panics before semaphore release. New test validates panic recovery behavior across multiple panic scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Panic recovery pattern consistency: Verify that recover() blocks consistently capture, log, and convert panics across all three locations
  • Error handling flow: Ensure panic-as-error conversions don't mask or alter original error context in downstream callers
  • Semaphore and defer ordering: In subscription_event_updater.go, confirm recoverPanic is called before semaphore release to prevent resource leaks on panic
  • Test coverage: Verify new panic recovery tests adequately cover error, string, and arbitrary type panic scenarios

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "fix(router): recover from panics in hooks (Cosmo Streams)" directly and accurately describes the primary objective of the changeset. The changes across all modified files consistently implement panic recovery mechanisms for Cosmo Streams hooks, including adding recovery blocks with logging in pubsubprovider.go, subscription_datasource.go, and subscription_event_updater.go, along with corresponding test updates. The title is concise, specific, and clearly communicates the main change without unnecessary details or vague terminology. A teammate scanning the commit history would immediately understand that this PR adds panic handling capabilities to the hooks system.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Oct 30, 2025

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-49e33fb6649555b02eeacb28bc53fd768dee2b55

@dkorittki
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Oct 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@dkorittki dkorittki changed the title fix(router): analyze and fix panic behavior in hooks (Cosmo Streams) fix(router): recover from panics in hooks (Cosmo Streams) Oct 30, 2025
Comment thread router/pkg/pubsub/datasource/pubsubprovider.go Outdated
@dkorittki dkorittki force-pushed the dominik/eng-8359-analyze-and-fix-panic-behavior-in-hooks branch from 1a9e361 to 458be27 Compare October 30, 2025 17:50
@dkorittki dkorittki merged commit 7020faf into topic/streams-v1 Oct 30, 2025
33 of 34 checks passed
@dkorittki dkorittki deleted the dominik/eng-8359-analyze-and-fix-panic-behavior-in-hooks branch October 30, 2025 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants