This repository was archived by the owner on Feb 18, 2025. It is now read-only.
region or datacenter causes failover to fail #1455
Open
Description
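Preconditions for triggering the bug (either one, or both enabled)
- PreventCrossRegionMasterFailover is true
- PreventCrossDataCenterMasterFailover is true
When it triggers
Intermittently; it comes down to luck with the timing
Result
Failover fails
Screenshot of the error
Analyzing the error
Reviewing the code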

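In the code, the value that follows `in` is populated, but in the log output the value after `in` is empty. The comparison therefore fails, the region protection is triggered, and the failover fails.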
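The failure mode can be sketched as follows. This is a minimal illustration, not orchestrator's actual code: `checkRegionGuard` and its parameters are hypothetical names, and the sketch only assumes that the protection compares the candidate's region with the region recorded for the failed master.

```go
package main

import "fmt"

// checkRegionGuard is a hypothetical stand-in for the cross-region protection.
// It refuses promotion when the candidate's region differs from the region
// recorded for the failed master.
func checkRegionGuard(candidateRegion, failedMasterRegion string) error {
	if candidateRegion != failedMasterRegion {
		return fmt.Errorf("will not promote server in %q when dead server in %q",
			candidateRegion, failedMasterRegion)
	}
	return nil
}

func main() {
	// Normal case: both records carry the same region, so promotion is allowed.
	fmt.Println(checkRegionGuard("region-a", "region-a"))
	// Bug case: the failed master's region was stored as an empty string,
	// so the comparison fails even though both servers share a region.
	fmt.Println(checkRegionGuard("region-a", ""))
}
```

With an empty region on one side, any non-empty region on the other side is treated as a cross-region promotion, which matches the empty value seen after `in` in the log.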
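How to reproduce
Code file: go/inst/instance_dao.go
Below `instanceFound = true`, add a for loop that waits for a few seconds (any duration works, but keep it small; a large value slows detection down and the effect takes longer to show)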
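The added loop might look like the minimal sketch below, assuming the 5-second window used in the following steps. The helper name `debugDelay` is hypothetical; in the actual reproduction the loop sits inline below `instanceFound = true` rather than in a separate function, and the original debugging code may differ.

```go
package main

import (
	"fmt"
	"time"
)

// debugDelay sleeps for `iterations` ticks of length `tick` and returns the
// number of ticks it waited. It exists only to widen the window in which the
// master can be shut down during a discovery pass.
func debugDelay(iterations int, tick time.Duration) int {
	ticks := 0
	for i := 0; i < iterations; i++ {
		time.Sleep(tick)
		ticks++
	}
	return ticks
}

func main() {
	fmt.Println("debug delay started; shut down the master now")
	debugDelay(5, time.Second) // 5 iterations of one second each
	fmt.Println("debug delay finished; discovery continues")
}
```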

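Run shutdown on the Master node
Timing matters here: the window is the 5-second delay added by the debugging code, so the shutdown command must be executed while that 5-second loop is running
Observe the topology
Check the logs
Inspect the records in the ORC tables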

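The region recorded for the crashed Master node 10.10.1.220 is now empty
Here is the fix (skipping the details of how I got there)
The fix itself is simple: change where the DetectRegionQuery code runs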
Where to add the code
Code to add
// Get datacenter, region, etc.
// The metadata queries run concurrently; the WaitGroup blocks until all finish.
func() {
	var getMetaWaitGroup sync.WaitGroup
	if config.Config.DetectDataCenterQuery != "" && !isMaxScale {
		getMetaWaitGroup.Add(1)
		go func() {
			defer getMetaWaitGroup.Done()
			err := db.QueryRow(config.Config.DetectDataCenterQuery).Scan(&instance.DataCenter)
			logReadTopologyInstanceError(instanceKey, "DetectDataCenterQuery", err)
		}()
	}
	if config.Config.DetectRegionQuery != "" && !isMaxScale {
		getMetaWaitGroup.Add(1)
		go func() {
			defer getMetaWaitGroup.Done()
			err := db.QueryRow(config.Config.DetectRegionQuery).Scan(&instance.Region)
			logReadTopologyInstanceError(instanceKey, "DetectRegionQuery", err)
		}()
	}
	if config.Config.DetectPhysicalEnvironmentQuery != "" && !isMaxScale {
		getMetaWaitGroup.Add(1)
		go func() {
			defer getMetaWaitGroup.Done()
			err := db.QueryRow(config.Config.DetectPhysicalEnvironmentQuery).Scan(&instance.PhysicalEnvironment)
			logReadTopologyInstanceError(instanceKey, "DetectPhysicalEnvironmentQuery", err)
		}()
	}
	if config.Config.DetectInstanceAliasQuery != "" && !isMaxScale {
		getMetaWaitGroup.Add(1)
		go func() {
			defer getMetaWaitGroup.Done()
			err := db.QueryRow(config.Config.DetectInstanceAliasQuery).Scan(&instance.InstanceAlias)
			logReadTopologyInstanceError(instanceKey, "DetectInstanceAliasQuery", err)
		}()
	}
	if config.Config.DetectSemiSyncEnforcedQuery != "" && !isMaxScale {
		getMetaWaitGroup.Add(1)
		go func() {
			defer getMetaWaitGroup.Done()
			err := db.QueryRow(config.Config.DetectSemiSyncEnforcedQuery).Scan(&instance.SemiSyncPriority)
			logReadTopologyInstanceError(instanceKey, "DetectSemiSyncEnforcedQuery", err)
		}()
	}
	getMetaWaitGroup.Wait()
}()
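Why the position of the metadata queries matters can be illustrated with a small simulation. It is a sketch under stated assumptions, not orchestrator code: `fakeServer`, `discover`, and the `metadataFirst` flag are hypothetical, and the simulated shutdown stands in for the master going away partway through a discovery pass. The sketch also assumes that the reordering moves metadata detection earlier in the pass.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeServer simulates a MySQL server that may shut down mid-discovery.
type fakeServer struct {
	region string
	alive  bool
}

// queryRegion mimics running DetectRegionQuery against the server.
func (s *fakeServer) queryRegion() (string, error) {
	if !s.alive {
		return "", errors.New("connection refused")
	}
	return s.region, nil
}

// discover simulates one discovery pass. slowStep represents the remaining
// work of the pass, during which the server may go down.
// With metadataFirst the region is read before slowStep runs;
// otherwise it is read only after slowStep has finished.
func discover(s *fakeServer, metadataFirst bool, slowStep func()) string {
	region := ""
	if metadataFirst {
		region, _ = s.queryRegion()
	}
	slowStep()
	if !metadataFirst {
		// On error the region stays empty, and that empty value is what
		// would end up in the stored record.
		region, _ = s.queryRegion()
	}
	return region
}

func main() {
	for _, metadataFirst := range []bool{false, true} {
		s := &fakeServer{region: "region-a", alive: true}
		shutdown := func() { s.alive = false } // master is shut down mid-pass
		got := discover(s, metadataFirst, shutdown)
		fmt.Printf("metadataFirst=%v recorded region=%q\n", metadataFirst, got)
	}
}
```

In the simulation, reading the region late yields an empty value when the server disappears mid-pass, while reading it early preserves the value. Whether the real patch closes the window completely depends on where the block is placed, which the issue shows only in a screenshot.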