Skip to content

Conversation

@louyuting
Copy link
Collaborator

Describe what this PR does / why we need it

Circuit breaker refactor and enhance the scalability of stat module

Does this pull request fix one issue?

Describe how you did it

Describe how to verify it

Special notes for reviews

@codecov-io
Copy link

codecov-io commented Apr 28, 2020

Codecov Report

Merging #143 into master will decrease coverage by 1.16%.
The diff coverage is 40.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
- Coverage   43.48%   42.31%   -1.17%     
==========================================
  Files          67       68       +1     
  Lines        2863     3247     +384     
==========================================
+ Hits         1245     1374     +129     
- Misses       1479     1737     +258     
+ Partials      139      136       -3     
Impacted Files Coverage Δ
core/base/entry.go 0.00% <0.00%> (ø)
core/circuitbreaker/slot.go 0.00% <0.00%> (ø)
core/circuitbreaker/stat_slot.go 0.00% <0.00%> (ø)
core/stat/base/metric_bucket.go 62.96% <ø> (ø)
core/stat/stat_slot.go 0.00% <0.00%> (ø)
core/stat/base/sliding_window_metric.go 22.87% <10.00%> (ø)
core/circuitbreaker/circuit_breaker.go 23.30% <23.30%> (ø)
core/base/context.go 40.54% <29.41%> (-11.85%) ⬇️
core/stat/base/bucket_leap_array.go 42.70% <50.00%> (ø)
core/stat/base/leap_array.go 57.14% <50.00%> (-7.23%) ⬇️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 975debd...da72d05. Read the comment docs.

@louyuting louyuting added area/circuit-breaking Issues or PRs related to circuit breaking kind/enhancement Category issues or PRs related to enhancement to-review PRs to review labels Apr 28, 2020
@louyuting louyuting requested a review from sczyh30 April 28, 2020 17:01
@louyuting
Copy link
Collaborator Author

Please ignore Conflicting.
It's not impact the implementation of this PR
I will fix it.

Output *SentinelOutput
}

func (ctx *EntryContext) GetError() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just Error() or Err() is enough? See https://golang.org/doc/effective_go.html#Getters:

There's nothing wrong with providing getters and setters yourself, and it's often appropriate to do so, but it's neither idiomatic nor necessary to put Get into the getter's name. If you have a field called owner (lower case, unexported), the getter method should be called Owner (upper case, exported), not GetOwner.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Err() is better.
If EntryContext implement Error() function, will implement standard error interface implicitly.

rtVal, existed := ctx.Output.Attachments[EntryContextRtStatKey]
if !existed {
rt := util.CurrentTimeMillis() - ctx.StartTime()
ctx.PutRtIfAbsent(rt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems weird to bring the operation here. Any consideration?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking may not be right, there is no concurrency scenario. I'm going to optimize here.

ctx.PutRtIfAbsent(rt)
return rt
}
rt := rtVal.(uint64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the performance OK if QPS is large?

Copy link
Collaborator Author

@louyuting louyuting May 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The impact for performance is small.
I just thought about it again, maybe add a new field 'rt' in EntryContext is better.

+----------------+ +----------------+ +----------------+
*/
const (
Closed Status = iota
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to be State (corresponding to state machine).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.


// StatusSwitchListener could listen on circuit breaker status switch
type StatusSwitchListener interface {
onSwitchToClosed(prev Status, rule Rule)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it public?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

// return the strategy type
BreakerStrategy() BreakerStrategy
// check whether the rule is valid and could be converted to corresponding circuit breaker
isApplicable() bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to return the concrete error here (and log it at caller). Example: https://github.com/alibaba/sentinel-golang/blob/master/core/system/rule_manager.go#L87

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense.

if !ok {
ret = make([]CircuitBreaker, 0)
ret := make([]CircuitBreaker, 0)
updateMux.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RLock?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

return ret
}
ret = append(ret, resCBs...)
ret = append(ret, commonCBs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For default rules, it does not indicate shared circuit breaker for all resources (however it might be helpful in some scenarios). Actually it means shared circuit breaker generator on resource created.

logger.Warnf("Circuit Breaker Generator for %+v is not existed.", r.BreakerStrategy())
continue
}
cb := generator(r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we may improve the logic: if a rule remains unchanged (identical), use the former circuit breaker instance instead of re-creating a new one.

logger.Info(sb.String())
}

func ResisterStatusSwitchListeners(listeners ...StatusSwitchListener) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right.

type StatusSwitchListener interface {
onSwitchToClosed(prev Status, rule Rule)

onSwitchToOpen(prev Status, rule Rule)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also carry the triggered "snapshot" value when the circuit breaker transforms to open.

// return true if succeed to update
func (b *circuitBreakerBase) fromHalfOpenToOpen() bool {
if b.status.casStatus(HalfOpen, Open) {
for _, listener := range b.statusSwitchListeners {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updateNextRetryTimestamp is also needed here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense.

@louyuting louyuting force-pushed the circuit_breaker_refactor branch from 6afe67f to 45687b6 Compare May 6, 2020 13:43
@sczyh30
Copy link
Member

sczyh30 commented May 6, 2020

Please also add relevant slots to the default slot chain.

retryTimeoutMs: errRatioRule.RetryTimeout,
nextRetryTimestamp: 0,
status: status,
statusSwitchListeners: statusSwitchListeners,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new listeners will not be included if added after the circuit breaker is generated.

}

func (c *MetricStatSlot) OnCompleted(ctx *base.EntryContext) {
res := ctx.Resource.String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resource.Name()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.

+----------------+ +----------------+ +----------------+
*/
const (
Closed State = iota
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if we could add String() method for it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

}
}

// Reset init EntryContext,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reset method should be updated for new fields.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

// fromClosedToOpen update circuit breaker status machine from closed to open
// Used for opening circuit breaker from closed when checking circuit breaker
// return true if succeed to update
func (b *circuitBreakerBase) fromClosedToOpen() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Carry the snapshot value here?

// fromHalfOpenToOpen update circuit breaker status machine from half-open to open
// Used for failing to probe
// return true if succeed to update
func (b *circuitBreakerBase) fromHalfOpenToOpen() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Carry the snapshot value here?

for _, rule := range rules {
if !rule.isApplicable() {
logger.Warnf("Ignoring invalid breaker rule when loading new rules, %+v.", rule)
if err := rule.IsApplicable(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check nil here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense

if !rule.isApplicable() {
logger.Warnf("Ignoring invalid breaker rule when loading new rules, %+v.", rule)
if err := rule.IsApplicable(); err != nil {
logger.Warnf("Ignoring invalid breaker rule when loading new rules, err: %+resRules.", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to print both err and the bad rule?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

@louyuting louyuting force-pushed the circuit_breaker_refactor branch from f3b3d29 to 2d41d02 Compare May 18, 2020 13:36
}

func (c *MetricStatSlot) OnCompleted(ctx *base.EntryContext) {
res := ctx.Resource.Name()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check whether this request has been blocked. If blocked, the request should not be recorded to the circuit breakers.

}

func (b *circuitBreakerBase) retryTimeoutArrived() bool {
return util.CurrentTimeMillis() >= b.nextRetryTimestamp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may need to make it thread-safe when accessing and modifying nextRetryTimestamp.

type CircuitBreaker interface {
BoundRule() Rule

BoundStat() interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it private?

@codecov-commenter
Copy link

codecov-commenter commented May 19, 2020

Codecov Report

Merging #143 into master will decrease coverage by 1.16%.
The diff coverage is 40.55%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
- Coverage   42.72%   41.55%   -1.17%     
==========================================
  Files          67       68       +1     
  Lines        2926     3314     +388     
==========================================
+ Hits         1250     1377     +127     
- Misses       1537     1801     +264     
+ Partials      139      136       -3     
Impacted Files Coverage Δ
core/base/entry.go 0.00% <0.00%> (ø)
core/circuitbreaker/slot.go 0.00% <ø> (ø)
core/circuitbreaker/stat_slot.go 0.00% <0.00%> (ø)
core/stat/base/metric_bucket.go 62.96% <ø> (ø)
core/stat/stat_slot.go 0.00% <0.00%> (ø)
core/stat/base/sliding_window_metric.go 22.87% <10.00%> (ø)
core/base/context.go 21.62% <15.38%> (-3.38%) ⬇️
core/circuitbreaker/circuit_breaker.go 23.30% <23.30%> (ø)
core/stat/base/bucket_leap_array.go 42.70% <50.00%> (ø)
core/stat/base/leap_array.go 57.14% <50.00%> (-7.23%) ⬇️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b73ee5e...70ea8e9. Read the comment docs.

@louyuting louyuting force-pushed the circuit_breaker_refactor branch from 1b796b1 to 7943e1c Compare May 21, 2020 09:16
@louyuting louyuting force-pushed the circuit_breaker_refactor branch from ff4d6c1 to 70ea8e9 Compare May 21, 2020 12:20
Copy link
Member

@sczyh30 sczyh30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sczyh30 sczyh30 merged commit 112f169 into alibaba:master May 21, 2020
@sczyh30
Copy link
Member

sczyh30 commented May 21, 2020

Awesome work. Thanks!

@sczyh30 sczyh30 removed the to-review PRs to review label May 21, 2020
@sczyh30 sczyh30 added this to the 0.3.0 milestone May 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/circuit-breaking Issues or PRs related to circuit breaking kind/enhancement Category issues or PRs related to enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make statistic metrics in MetricBucket extensible Enhance Circuit Breaker Design

4 participants