Support for external storage in Circuit Breaker #215

rahulrai-in · 2017-01-20T06:01:08Z

I want to implement Circuit Breaker in a distributed system. I would like to share circuit breaker state among all instances of the application. How can I support this scenario?

reisenberger · 2017-01-21T09:43:05Z

@rahulrai-in Polly does not provide an in-built way to share circuit state between different processes / instances of an application.

Previous discussion on Polly highlighted possible risks with this pattern intrinsic to a distributed system. Essentially, if node A1 is experiencing problems in calls to M, is it safe to assume that other nodes (A2, A3, A4 say) also would? What if the problem is specific only to the network path between A1 and M, but not A2/3/4 to M? Circuit-state co-ordination could then be dangerous and unhelpful: it risks telling A2/3/4 they can't talk to M, when in fact they still can. Likewise if the problem is something internal only to A1 (eg burst of requests or other malfunction causing local CPU/thread/memory starvation). What if the issue is a general network problem between A1 and anywhere? Most methods to co-ordinate circuit state (shared db; Redis cache; messaging) would depend on the same network ...

Per these risks, some commentators suggest it may be simpler, and safer, just to treat A1, A2, A3, A4 as independent actors, and let them pop their own circuits on the only thing they can know to be true ... that they aren't getting answers from calls to M.

reisenberger · 2017-01-21T09:44:25Z

@rahulrai-in If otoh making a judgment the benefits outweigh the risks for a particular installation (effectively, you take the risk of assuming the problem is with M, not anywhere else), you could support this scenario with no changes to current Polly thus:

Make the onBreak() and onReset() delegates of the CircuitBreaker, broadcast the break and reset to other instances of the application, for example via a messaging system.
All instances listen for breaks and resets.
On hearing of a break made by another instance, they would manually break their instance using Isolate(). [or recode Polly to allow manually break only for a set period]
On hearing of the reset made by the original instance, they would manually reset using Reset().

(a possibility, not necessarily a recommendation)
.

To implement within Polly, alternatively, a genuine shared external storage of the circuit state updated on each and every call, would require coding a new ICircuitController implementation. OnActionPreExecute() would read the circuit state from shared storage. Other methods such as OnActionSuccess(), OnActionFailure() etc would update it.

We would not want the core Polly package to take a dependency on a particular external storage system, so an implementation would have to go in an external package such as Polly.SharedStateCircuitBreaker, rather than in the core Polly package.

Both approaches imply extra network overhead, and are vulnerable to failure if the network isn't available.

rahulrai-in · 2017-01-21T11:34:00Z

Thank you for the insights @reisenberger. In my opinion, there are still benefits to making Polly share state among the various services. I am working on building a stateless Microservices application in which the client requests may arrive at any instance of Microservice. I want the requests to behave in a consistent manner and not fail individually on every instance. This feature would also be beneficial in actor model as without this feature several instances of the same actor will try to communicate with external dependency before failing. This scenario is also supported by Akka.

I agree to the fact that this should be an opt-in rather than a compulsory feature and I would love to submit a PR to include this feature in Polly. We can design this feature to trip the circuit if failure is reported from at least N instances so that communication failure between a single node and dependency does not trip the circuit.

This is a really good pattern for use in cloud applications esp. Microservices.

reisenberger · 2017-01-21T13:44:13Z

@moonytheloony The notion of a quorum of services which must independently report their circuit broken, before forcing others to break, seems a reasonable way of alleviating some of the risks around a single faulting service incorrectly influencing others which might otherwise succeed.

Thinking quickly, we would possibly need look at what needs adding to ICircuitController<TResult>, and what of abstract class CircuitStateController<TResult> and its existing implementations might need to be marked virtual, in order that an implementation for this feature could intercept circuit operations, to share and manipulate state. I will aim to look at this, or: please feel free to look yourself!

What would you envisage needing sharing? The underlying statistical state of the circuit (numbers of successes/failures etc metrics), or just the major events/states? (breaks/resets; CircuitState). Or? EDIT: @reisenberger come to own conclusion that only the major events/states need sharing (as tarup also says later)

Note that we are already simultaneously engaged on other large-scale features, including caching and metrics, so there are a few competing priorities in the short-term.

Feel free to join the Polly slack channel for on-going discussion on features.

Ideally, any solution which would eventually be integrated into the Polly stable would be generalised: ie work for both existing kinds of CircuitBreaker; and connect to external storage with an interface, in order that alternative external storage implementations could easily be coded by interested parties.

What do you think?

rahulrai-in · 2017-01-22T07:44:35Z

I think to support the quorum policy, we need to add another Policy class or update CircuitBreakerPolicy class to add support for Quorum policy. By doing this we can add custom policies to the Circuit Breaker. Next, CircuitBreakerState property should have the support to be loaded from a data store. Currently, this property returns what is stored in a local variable.

rahulrai-in · 2017-02-01T04:44:22Z

Adding @tarunp who is a .net architect specializing in distributed systems to add his thoughts and contribute to this feature.

reisenberger · 2017-02-01T07:37:50Z

@bunceg, you were interested in this pattern too? (circuitBreakers breaking in common across distributed systems.) Please join the conversation as feature evolves, if you have angles to share.

tarunp · 2017-02-02T16:55:53Z

Sure @moonytheloony . I also faced the same issue in one of my application consisting of background services. If we can have quorum concept within the Polly itself then it will be really useful for multi instance scenarios. I think major events or state of the circuit should be good enough to determine the state of each node.

reisenberger · 2017-06-15T07:10:58Z

Another request for this feature at #259

reisenberger · 2017-08-02T07:51:58Z

Proposal for DistributedCircuitBreaker now posted as #287

reisenberger · 2018-12-30T20:52:45Z

Closing as duplicate, main discussion now on #287

reisenberger added the question label Jan 21, 2017

reisenberger added the feature label Feb 2, 2017

reisenberger mentioned this issue Jun 15, 2017

Where is Circuit Breaker State stored? #259

Closed

reisenberger mentioned this issue Jun 27, 2017

Allow users to provide own circuitbreakstate implementation / inject circuit state #6

Closed

reisenberger mentioned this issue Aug 2, 2017

OPTION/PROPOSAL: Distributed Circuit Breaker #287

Closed

reisenberger added the duplicate label Jan 2, 2018

reisenberger closed this as completed Dec 30, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for external storage in Circuit Breaker #215

Support for external storage in Circuit Breaker #215

rahulrai-in commented Jan 20, 2017

reisenberger commented Jan 21, 2017 •

edited

Loading

reisenberger commented Jan 21, 2017 •

edited

Loading

rahulrai-in commented Jan 21, 2017

reisenberger commented Jan 21, 2017 •

edited

Loading

rahulrai-in commented Jan 22, 2017

rahulrai-in commented Feb 1, 2017

reisenberger commented Feb 1, 2017

tarunp commented Feb 2, 2017

reisenberger commented Jun 15, 2017

reisenberger commented Aug 2, 2017

reisenberger commented Dec 30, 2018

Support for external storage in Circuit Breaker #215

Support for external storage in Circuit Breaker #215

Comments

rahulrai-in commented Jan 20, 2017

reisenberger commented Jan 21, 2017 • edited Loading

reisenberger commented Jan 21, 2017 • edited Loading

rahulrai-in commented Jan 21, 2017

reisenberger commented Jan 21, 2017 • edited Loading

rahulrai-in commented Jan 22, 2017

rahulrai-in commented Feb 1, 2017

reisenberger commented Feb 1, 2017

tarunp commented Feb 2, 2017

reisenberger commented Jun 15, 2017

reisenberger commented Aug 2, 2017

reisenberger commented Dec 30, 2018

reisenberger commented Jan 21, 2017 •

edited

Loading

reisenberger commented Jan 21, 2017 •

edited

Loading

reisenberger commented Jan 21, 2017 •

edited

Loading