Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Added redis-role selector for consistent access to 'master' or 'slave' #785

Conversation

wkd-woo
Copy link
Contributor

@wkd-woo wkd-woo commented Feb 23, 2024

Description
Redis Replication requires the Redis client to be able to access the master node without having to go through the sentinel client.

Sentinel tracks the redis master, but the endpoint of the redis pod is set to private IP inside the k8s cluster

There are restrictions on accessing the Redis service from outside the k8s cluster.

Of course, it is accessible with the current function of the operator, but the current function does not guarantee access to the redis-server of the master role.

Therefore, in Replication Topology, the ability to label the current pod to distinguish which pod is "leader(i.e master)" has been added,

The label was used as the selector to create a service that guarantees access to a specific role(leader or follower)

In addition, I thought it would be good to provide read-replica endpoint, so I added the function.

Summary
Added redis-role label to redis-replication pod.
And create a service that guarantees access to a specific role.

Related issue
#765
#629
#453

associated with a previous PR
#777

Type of change
New feature (non-breaking change which adds functionality)

Checklist

  • Tests have been added/modified and all tests pass.
  • Functionality/bugs have been confirmed to be unchanged or fixed.
  • I have performed a self-review of my own code.
  • Documentation has been updated or added where necessary.

Additional Context

$ kubectl get pods -l redis-role=master
NAME                  READY   STATUS    RESTARTS   AGE
redis-replication-1   2/2     Running   0          24h

$ kubectl get pods -l redis-role=slave
NAME                  READY   STATUS    RESTARTS   AGE
redis-replication-0   2/2     Running   0          24h
redis-replication-2   2/2     Running   0          24h

The pods are labeled.
The key called redis-role has master or slave as a value.

$ kubecl get svc -l redis-role=master
NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
redis-replication-leader   ClusterIP   10.99.100.47   <none>        6379/TCP         4m38s

$ kubectl exec -it service/redis-replication-leader -- redis-cli ROLE
Defaulted container "redis-replication" out of: redis-replication, redis-exporter
1) "master"
2) (integer) 18697180
3) 1) 1) "192.168.43.242"
      2) "6379"
      3) "18697180"
   2) 1) "192.168.251.50"
      2) "6379"
      3) "18697037"

And a service is created with the value of the key as the selector.
The service will forward the request to the selecting pod.

In this case (redis-replication-leader), forward to "master".

$ kubectl get svc -l redis-role=slave
NAME                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
redis-replication-follower   ClusterIP   10.96.227.135    <none>        6379/TCP         4m55s

$  kubectl exec -it service/redis-replication-follower -- redis-cli ROLE
Defaulted container "redis-replication" out of: redis-replication, redis-exporter
1) "slave"
2) "192.168.52.228"
3) (integer) 6379
4) "connected"
5) (integer) 18698767

Considering the load distribution in the production environment, I also wanted to provide read-replica endpoint as a service.

func updatePodLabel(ctx context.Context, cl kubernetes.Interface, logger logr.Logger, cr *redisv1beta2.RedisReplication, role string, nodes []string) error {
	for _, node := range nodes {
		pod, err := cl.CoreV1().Pods(cr.Namespace).Get(context.TODO(), node, metav1.GetOptions{})
		if err != nil {
			logger.Error(err, "Cannot get redis replication pod")
			return err
		}
		// set Label redis-role
		metav1.SetMetaDataLabel(&pod.ObjectMeta, "redis-role", role)
		// update Label
		_, err = cl.CoreV1().Pods(cr.Namespace).Update(context.TODO(), pod, metav1.UpdateOptions{})
		if err != nil {
			logger.Error(err, "Cannot update redis replication pod")
			return err
		}
	}
	return nil
}

func UpdateRoleLabelPod(ctx context.Context, cl kubernetes.Interface, logger logr.Logger, cr *redisv1beta2.RedisReplication) error {
	// find realMaster, and label this pod: 'redis-role=master'
	role := "master"
	masterPods := GetRedisNodesByRole(ctx, cl, logger, cr, role)
	realMaster := checkAttachedSlave(ctx, cl, logger, cr, masterPods)
	err := updatePodLabel(ctx, cl, logger, cr, role, []string{realMaster})
	if err != nil {
		return err
	}

	// when configuring one size replication
	if cr.Spec.Size == pointer.Int32(1) {
		err = updatePodLabel(ctx, cl, logger, cr, role, masterPods)
		if err != nil {
			return err
		}
	}

	role = "slave"
	err = updatePodLabel(ctx, cl, logger, cr, role, GetRedisNodesByRole(ctx, cl, logger, cr, role))
	if err != nil {
		return err
	}
	return nil
}

Labeled only realMaster as 'redis-role': 'master' using 'GetRedisNodesByRole()' -> 'checkAttachedSlave()'.

If it is 'ClusterSize:1', make an exception.

@wkd-woo wkd-woo mentioned this pull request Feb 23, 2024
4 tasks
@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch from 87ccb4c to 4e74d3f Compare February 23, 2024 11:33
Copy link

codecov bot commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 11.42857% with 62 lines in your changes missing coverage. Please review.

Project coverage is 38.93%. Comparing base (d121d86) to head (b99997d).
Report is 175 commits behind head on master.

Files with missing lines Patch % Lines
controllers/redisreplication_controller.go 23.07% 20 Missing ⚠️
k8sutils/redis-replication.go 0.00% 20 Missing ⚠️
controllers/rediscluster_controller.go 0.00% 15 Missing and 1 partial ⚠️
k8sutils/services.go 25.00% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #785      +/-   ##
==========================================
+ Coverage   35.20%   38.93%   +3.73%     
==========================================
  Files          19       19              
  Lines        3213     2738     -475     
==========================================
- Hits         1131     1066      -65     
+ Misses       2015     1596     -419     
- Partials       67       76       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wkd-woo
Copy link
Contributor Author

wkd-woo commented Mar 3, 2024

@iamabhishek-dubey @shubham-cmyk @drivebyer
Can I get a review?

@drivebyer
Copy link
Collaborator

Could you please rebase from the master branch and address the failing checks? @wkd-woo.

@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch from 4e74d3f to 16fcf61 Compare March 4, 2024 07:01
@wkd-woo
Copy link
Contributor Author

wkd-woo commented Mar 4, 2024

Could you please rebase from the master branch and address the failing checks? @wkd-woo.

I did :)

@drivebyer
Copy link
Collaborator

Could you please rebase from the master branch and address the failing checks? @wkd-woo.

I did :)

the DCO check that is failing, fix it plz.

@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch 3 times, most recently from 8cc24cf to 337abec Compare March 4, 2024 09:16
@wkd-woo
Copy link
Contributor Author

wkd-woo commented Mar 6, 2024

@drivebyer
I missed it. I took action. I'm sorry.

Can you please check it out?

@drivebyer
Copy link
Collaborator

@drivebyer I missed it. I took action. I'm sorry.

Can you please check it out?

The DCO issue has been resolved. There are some reviews that you should examine and address the conflicts, please.

@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch from 337abec to e8212a5 Compare March 6, 2024 02:50
@wkd-woo
Copy link
Contributor Author

wkd-woo commented Mar 18, 2024

@drivebyer
Hello, hope you're well.
Is there anything else I need to prepare to get a review?

@drivebyer
Copy link
Collaborator

@drivebyer Hello, hope you're well. Is there anything else I need to prepare to get a review?

Actually, you should resolve the issues with the GitHub Actions checks and the Git conflicts. Additionally, I have already provided some review suggestions, which you may want to take a look at.

@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch from e8212a5 to b1d5d8b Compare March 18, 2024 04:26
@drivebyer drivebyer changed the title feat(replication): Added redis-role selector for consistent access to 'master' or 'slave' feat: Added redis-role selector for consistent access to 'master' or 'slave' Mar 18, 2024
@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch 3 times, most recently from 49c3b0d to 38e723d Compare March 18, 2024 15:13
@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch 3 times, most recently from 943772b to 42cd5fa Compare April 1, 2024 10:56
@wkd-woo wkd-woo marked this pull request as draft April 2, 2024 08:17
@wkd-woo wkd-woo marked this pull request as ready for review April 3, 2024 05:21
@wkd-woo wkd-woo force-pushed the wkd-woo/replication-master-selector branch from 49b2029 to b99997d Compare May 13, 2024 10:57
@@ -36,13 +36,23 @@ func generateServiceDef(serviceMeta metav1.ObjectMeta, epp exporterPortProvider,
} else {
PortName = "redis-client"
}
selectorLabels := serviceMeta.GetLabels()
if serviceMeta.GetName() == "redis-replication-leader" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the svc always name:"redis-replication-leader", We should create svc based on the specific instance name.

@zifeo
Copy link

zifeo commented Nov 1, 2024

@drivebyer @wkd-woo is there anything we could help with in order to get that important one merged?

@drivebyer
Copy link
Collaborator

@drivebyer @wkd-woo is there anything we could help with in order to get that important one merged?

It hasn’t been updated for a while, so if you’re open to it, a new PR would also be welcome.

@zifeo
Copy link

zifeo commented Nov 5, 2024

@drivebyer does it needs a simple rebase or that are more issue to address (note that I usually not using go, but happy to help completing that one)?

@zifeo
Copy link

zifeo commented Nov 8, 2024

@drivebyer sorry to bring that again, really motivated to move forward but I lack the last input to make it happen :)

@drivebyer
Copy link
Collaborator

@drivebyer Sorry to bring this up again, but I’m really motivated to move forward. I just need one last piece of input to make it happen! :)

Actually, part of this PR is implemented by #925, which adds labels to both master and slave pods. It updates the labels as the roles change.

@zifeo
Copy link

zifeo commented Nov 8, 2024

@drivebyer awesome! so the only thing left for the user is to define a proper service that matches the given logic?

@drivebyer
Copy link
Collaborator

@drivebyer awesome! so the only thing left for the user is to define a proper service that matches the given logic?

Yes

@Leo791
Copy link

Leo791 commented Dec 30, 2024

Are you taking into account that when a failover occurs the labels have to be switched? This currently only occurs for the master which is set to slave. But the slave role is never set to master

@wkd-woo
Copy link
Contributor Author

wkd-woo commented Dec 30, 2024

@drivebyer @zifeo @Leo791
Now I see this thread, sorry for the late response. This PR hasn't merged into the main project, but I'm currently using it very well in my production environment. I think I'll only have some time at the end of the year, so I'm going to update this PR again as soon as possible and request it again.

@wkd-woo
Copy link
Contributor Author

wkd-woo commented Jan 6, 2025

Now I checked this feature is implemented by #925 #1124
So I close this issue.

Thank you a lot @drivebyer
cc. @zifeo @Leo791

@wkd-woo wkd-woo closed this Jan 6, 2025
@Leo791
Copy link

Leo791 commented Jan 6, 2025

Thank you for your work and directing my to #1124, hadn't seen that one yet. We created a work around while awaiting the release with this merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants