Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: disable UseCacheTime by default | 建议默认将UseCacheTime设置为false #441

Closed
ansiz opened this issue Nov 26, 2021 · 3 comments · Fixed by #442
Closed
Labels
area/performance Issues or PRs related to runtime performance kind/enhancement Category issues or PRs related to enhancement
Milestone

Comments

@ansiz
Copy link
Contributor

ansiz commented Nov 26, 2021

Issue Description

Type: feature request

我们在系统性能分析中发现Sentinel内部StartTimeTicker有一个独立的goroutine在缓存毫秒时间戳以提供给其他地方调用,此处产生了大量的系统调用,对于绝大多数系统而言该逻辑的性能收益是负向的,建议默认设置为false

我们从理论上分析一下为什么这部分优化通常是无效的,先来看看Sentinel的代码

package util

import (
	"sync/atomic"
	"time"
)

var nowInMs = uint64(0)

// StartTimeTicker starts a background task that 
caches current timestamp per millisecond,
// which may provide better performance in high-concurrency scenarios.
func StartTimeTicker() {
	atomic.StoreUint64(&nowInMs, uint64(time.Now().UnixNano())/UnixTimeUnitOffset)
	go func() {
		for {
			now := uint64(time.Now().UnixNano()) / UnixTimeUnitOffset
			atomic.StoreUint64(&nowInMs, now)
			time.Sleep(time.Millisecond)
		}
	}()
}

func CurrentTimeMillsWithTicker() uint64 {
	return atomic.LoadUint64(&nowInMs)
}

从上面的代码可以看到,Sentinel内部用了一个goroutine循环的获取时间戳存到atomic变量里,然后调用sleep休眠1ms,通过这种方式缓存了毫秒级别的时间戳。从这段代码上看,性能开销最大的应该是sleep,因为sleep会产生syscall,众所周知syscall的代价是比较高的

sleep和time.Now对比开销到底大多少呢?查证资料后我发现一个反直觉的事实,由于Golang特殊的调度机制,在Golang中一次sleep会产生7次syscall,而time.Now则是vDSO实现的,那么问题来了vDSO和7次系统调用相比提升应该是多少呢?

我找到了可以佐证的资料,恰好有一个golang的优化,其中提到在老版本的golang中(golang 1.9-),Linux/386下没有这个vDSO的优化,此时会有2次syscall,新版本经过优化后理论性能提高5~7x+,可以约等于<=0.3次syscall的开销。

cache设计的目的是为了减少time.Now的调用,所以理论上这里调用量足够大的情况下可能会有收益,按照上面的分析,假设time.Now和sleep系统调用的开销比是0.3:7.3(7+0.3),这意味着CurrentTimeMillsWithTicker的调用总次数要超过2.4w才会有收益。

所以我们再分析一下CurrentTimeMillsWithTicker的调用次数,我在这个地方加了一个counter进行验证,然后模拟请求调用Sentinel的Entry,经过测试发现:

  • 当首次创建资源点时,Entry和CurrentTimeMillsWithTicker的放大比为20,这主要是因为创建底层滑动窗口时需要大量的时间戳计算
  • 当相同的resource调用Entry时,调用的放大比为5:1
    考虑到创建资源点是低频的,我们可以近似认为此处调用放大比为5

所以理论上当单机qps至少超过4800以上才可能会取得收益.

考虑到这个优化还有一个好处,是可以降低同步请求时间戳时的开销,所以我们可以再对比一下直接从内存和time.Now读取时间戳的速度
image

可以看到单次直接获取时间戳确实比从内存读取开销小很多,但是仍然是ns级别的,这种级别的耗时增长对于一笔请求而言是可以忽略不计的。
image

大概是0.06微秒,即使乘以5,也就是0.3微秒的增加。


为了佐证上述理论分析,我们做了性能测试,结果如下:

image

从上图的数据可以看出来,启用cache的情况下cpu sys util始终比不启用cache的版本要大,随着请求量增加,性能差距在逐步缩小,但是直至4000qps仍然没有正向的收益

Describe what feature you want

经过测试和理论分析可得知,在常规的场景下,Sentinel的这个cache特性是没有收益的,反而对性能造成了损耗,尤其是在低负载的情况下。即使在高负载的情况下,也可以推论出没有这个cache不会对系统造成太大的影响。

综上所述,建议在Sentinel-golang中默认禁用这个特性。

Additional context

Add any other context or screenshots about the feature request here.

参考资料

time: Sleep requires ~7 syscalls #25471
69390: runtime: use vDSO on linux/386 to improve time.Now performance

@sczyh30 sczyh30 added the area/performance Issues or PRs related to runtime performance label Nov 26, 2021
@louyuting
Copy link
Collaborator

@ansiz Nice issue, could you please file a PR to enhance it?

@ansiz
Copy link
Contributor Author

ansiz commented Nov 26, 2021

@ansiz Nice issue, could you please file a PR to enhance it?

#442

@sczyh30 sczyh30 added the kind/enhancement Category issues or PRs related to enhancement label Nov 26, 2021
@sczyh30 sczyh30 changed the title 建议默认将UseCacheTime设置为false Suggestion: disable UseCacheTime by default | 建议默认将UseCacheTime设置为false Nov 26, 2021
@sczyh30
Copy link
Member

sczyh30 commented Nov 26, 2021

感谢建议,非常有理有据,相关 PR 已合并。对于 Java 侧的原始设计与性能损耗,社区也会进一步评估。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance Issues or PRs related to runtime performance kind/enhancement Category issues or PRs related to enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants