
Better score calculation algorithms #499

Closed
GangZhuo opened this issue Mar 14, 2016 · 14 comments

@GangZhuo
Contributor

Version 3.0 added a new feature, "Choose by statistics", but its score calculation is not reasonable. We need a better score calculation algorithm. Feel free to put forward your advice.

See #493

@hardbone12

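My thoughts: suppose one link has an unstable packet loss rate that swings between high and low, or another link has unstable latency but a fairly low packet loss rate. I think weights are necessary, but the weighting cannot be expressed by user-defined numbers alone. Could the servers be ranked using the data in the csv and then selected in combination with the user-defined values? The detection period also needs to be considered: with earlier versions I sometimes saw a node that was down for maintenance for a whole day, but because its historical data was good, the large base of old samples kept the client from switching away.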
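One way to keep a long, good history from hiding a recent outage is to average only over the latest samples. The sketch below is illustrative only: `RecentStats`, the window size, and the sample layout are assumptions, not part of the project's statistics code.

```python
from collections import deque
from statistics import mean


class RecentStats:
    """Hypothetical helper: keeps only the newest samples for one server."""

    def __init__(self, window=20):
        # Older samples fall out automatically once the window is full.
        self.samples = deque(maxlen=window)

    def add(self, latency_ms, loss_percent):
        self.samples.append((latency_ms, loss_percent))

    def averages(self):
        """Return (mean latency, mean loss) over the window, or None if empty."""
        if not self.samples:
            return None
        latencies = [s[0] for s in self.samples]
        losses = [s[1] for s in self.samples]
        return (mean(latencies), mean(losses))
```

With a window of 20, a node that was healthy for weeks but has failed its last 20 probes is judged only on those 20 probes.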

@celeron533
Contributor

Or run some machine learning on the csv data...

@xbb123

xbb123 commented Aug 27, 2016

The new version should have a default calculation algorithm that suits most users.

@kang000feng

kang000feng commented Sep 5, 2016

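@GangZhuo Hello, I have long been confused about the "High Availability" option. It automatically chooses a server based on latency and packet loss rate, but how does it measure them? Through ping tests or something else? What are the detection frequency and the switching frequency? Thanks!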

@GangZhuo
Contributor Author

GangZhuo commented Sep 5, 2016

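Latency is determined by recording the time it takes to establish a connection with the server. If ping testing is enabled in the statistics configuration, packet loss is measured via ping; otherwise no packet loss is recorded. The detection and switching frequency can be set in the statistics configuration.

This project is now maintained by @wongsyrone.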

@kang000feng

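@wongsyrone Hello, what is the difference between the "High Availability" and "Choose by statistics" options? "Choose by statistics" lets you set the weights yourself; what are the latency and packet loss weights for "High Availability"? Thanks!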

@GangZhuo
Contributor Author

GangZhuo commented Sep 5, 2016

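High Availability: scores each server based on connection failure time, latency, last read/write time, and so on, then picks a highly available server. For the scoring algorithm, see https://github.com/shadowsocks/shadowsocks-windows/blob/master/shadowsocks-csharp/Controller/Strategy/HighAvailabilityStrategy.cs#L107

Choose by statistics: once enabled in the "Statistics configuration", the software records various metrics during use (latency, speed, ping, etc.), computes each server's score with the scoring algorithm set in the "Statistics configuration", and then picks a good server. This scoring algorithm is not very reasonable, which is why I opened this issue.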

@kang000feng

Thanks a lot.

@icylogic
Contributor

icylogic commented Nov 8, 2016

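In fact, statistics itself is a data service that provides data as csv files, and that strategy is an example (#266 (comment)): it offers a simple way to present the data and lets you design a naive score from basic arithmetic, showing later developers how the data can be used.
Having users design the algorithm themselves is obviously unfriendly. The intent of strategy is to pool ideas, and statistics provides more data than the original API did.

So there is no need to fixate on improving this demo algorithm. You can write all kinds of new strategies, for example ones that prioritize bandwidth, latency, or stability, and simply use that data in the program. The user just clicks to select a strategy and does not need to calculate anything.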

@Noisyfox
Contributor

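I changed a bit of code: before GetAServer is called, the destination address is parsed from the Socks5 request and passed into GetAServer.
This should make it possible to implement a cache, so that logins on some IP-sensitive sites are not invalidated by server switching.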
https://github.com/Noisyfox/shadowsocks-windows/commit/4b4943180075d98b4c7dc23ea38b64567d4e17b5
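A minimal sketch of such a cache, written in Python rather than the project's C# and not taken from the linked commit. `StickyServerCache` and `choose_server` are hypothetical names; expiry and handling of failed servers are deliberately left out.

```python
class StickyServerCache:
    """Hypothetical cache: pins each destination to the server first chosen for it."""

    def __init__(self, choose_server):
        # choose_server is an assumed callable that returns the currently
        # preferred server (for example, the result of a scoring strategy).
        self._choose_server = choose_server
        self._by_destination = {}

    def get_server(self, destination):
        # Reuse the earlier choice so the site keeps seeing the same exit IP.
        if destination not in self._by_destination:
            self._by_destination[destination] = self._choose_server()
        return self._by_destination[destination]
```

A real implementation would also need to drop an entry when its server becomes unreachable; otherwise the cache would defeat the automatic switching it sits on top of.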

@xenonz95

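Uh... my suggestion was about the case where one server is expensive but better than the other in every respect, and the other is a cheap server. In that case, how about adding a price scoring coefficient?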

@BrillianceRen

BrillianceRen commented Jan 23, 2017

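I suggest dividing the statistics into a few groups and converting them all to percentage form.
Speed-related values seem to fluctuate a lot, so a range setting could be added. For example, with a range of 0-1024, speed is converted to a percentage within that range: values greater than or equal to 1024 count as 1.0, values less than or equal to 0 count as 0.0, and values in between change linearly;
Latency can be handled similarly;
Packet loss should already be a percentage. From a rough look it seems fine, so (1 - packet loss / 100) will do;

Finally, multiply the n items together and take the n-th root.
It would be even better to have a default configuration.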
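The proposal above can be sketched as follows. The function names, the default latency range of 0-1000 ms, and the choice to invert latency (so that lower latency scores higher) are assumptions for illustration; only the 0-1024 speed range, the loss formula, and the n-th root come from the comment.

```python
import math


def clamp_linear(value, low, high):
    """Map value linearly onto [0.0, 1.0]; values outside [low, high] are clamped."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def score(speed, latency_ms, loss_percent,
          speed_range=(0.0, 1024.0), latency_range=(0.0, 1000.0)):
    """Geometric mean of the normalized speed, latency, and loss terms."""
    speed_term = clamp_linear(speed, *speed_range)
    # Assumption: latency is inverted so that a faster connection scores higher.
    latency_term = 1.0 - clamp_linear(latency_ms, *latency_range)
    # Packet loss is already a percentage, as described in the comment.
    loss_term = min(1.0, max(0.0, 1.0 - loss_percent / 100.0))
    terms = [speed_term, latency_term, loss_term]
    return math.prod(terms) ** (1.0 / len(terms))
```

Because the terms are multiplied, a server that scores 0.0 on any single item gets an overall score of 0.0, which a weighted sum would not guarantee.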

@JollyTRjano

JollyTRjano commented May 12, 2017

There is a simplified formula, the Mathis et al. formula: Rate <= (MSS/RTT) * (1/sqrt(p)), where:
Rate: the TCP transfer rate or throughput
MSS: the maximum segment size (fixed for each Internet path, typically 1460 bytes)
RTT: the round trip time (as measured by TCP)
p: the packet loss rate.
It does not have to be perfect, but it offers a reference for designing a better (maybe the best) score calculation algorithm: we can simply set the score equal to the calculated Rate, using p' = max(p, a very small number) so that a loss rate of zero does not cause a division by zero.
More to read: https://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html
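A minimal sketch of a score based on this formula. The function name and the floor value `min_loss` are assumptions; the comment only says "a very small number".

```python
import math


def mathis_score(rtt_seconds, loss_rate, mss_bytes=1460, min_loss=1e-6):
    """Upper bound on TCP throughput in bytes per second, used as a server score.

    loss_rate is a fraction between 0 and 1, not a percentage.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    # Floor the loss rate so that a lossless path does not divide by zero.
    p = max(loss_rate, min_loss)
    return (mss_bytes / rtt_seconds) / math.sqrt(p)
```

Note that the chosen floor caps how much a lossless server can outscore a slightly lossy one, so it effectively acts as a tuning parameter.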

@ghost ghost mentioned this issue Dec 30, 2019
@database64128
Contributor

The statistics feature has been removed in #2994. Closing now.
