Conversation
Welcome @Invictus17!
/assign @mbohlool
Thanks for the PR. Could you please keep the file naming style consistent with the existing code? e.g. LeaderElection should be switched to leaderelection.
Sure. I've pushed the changes.
/assign
/assign
leaderelection/electionconfig.py
Outdated
import sys


class createConfig:
let's call it LeaderElectionConfig or Config
leaderelection/electionconfig.py
Outdated
class createConfig:
    # Validate config, exit if an error is detected
    def __init__(self, lock, leaseDuration, renewDeadline, retryPeriod, onStartedLeading, onStoppedLeading):
s/leaseDuration/lease_duration
leaderelection/electionconfig.py
Outdated
if retryPeriod < 1:
    sys.exit("retryPeriod must be greater than zero")

self.leaseDuration = leaseDuration
same again for code style
@@ -0,0 +1,31 @@
import threading
import ctypes
I'm not very excited about adding C compatibility to achieve this
I can look into a different way to kill threads.
You should be able to use stop flags and let the client check for it.
> You should be able to use stop flags and let the client check for it.

Yes, that's similar to the latest update using traces (this commit is outdated now), but I also want to look into https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor as suggested by @yliaog in her review.
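For illustration, a cooperative stop-flag version might look like this (a minimal sketch using threading.Event; the names are illustrative, not from the PR):

```python
import threading
import time

stop_flag = threading.Event()

def worker():
    # The client code checks the flag and returns cooperatively,
    # instead of being killed from the outside via ctypes.
    while not stop_flag.is_set():
        time.sleep(0.5)  # placeholder for real work

t = threading.Thread(target=worker)
t.start()
stop_flag.set()  # ask the thread to stop
t.join()         # it exits on its own at the next flag check
```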
leaderelection/leaderelection.py
Outdated
self.OnStoppedLeadingThread.start()

# Start to follow
self.follow(scheduler)
In your implementation, a program can keep switching between being a leader and a follower. However reading the client-go implementation, when the leader fails to renew the lease, it gives up and doesn't re-join as a follower. Should we keep the client-go behavior and let the user decide what to do when a leader loses the lease?
Sure. I can change the code to have the leader exit and close if it fails to update the lease.
leaderelection/leaderelection.py
Outdated
def getLatestLeader(self):
    getStatus, getResponse = self.electionConfig.lock.Get(name=self.electionConfig.lock.name, namespace=self.electionConfig.lock.namespace)
    if getStatus:
        return "leader is " + str(ast.literal_eval(getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey])['holderIdentity'])
why do we need literal_eval?
getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey] returns a string. literal_eval converts that string to a dictionary so that ['holderIdentity'] can be easily accessed.
> getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey] returns a string. literal_eval converts that string to a dictionary so that ['holderIdentity'] can be easily accessed.

Would json.loads() work or is it not JSON?
> getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey] returns a string. literal_eval converts that string to a dictionary so that ['holderIdentity'] can be easily accessed.
> Would json.loads() work or is it not JSON?

No, it's not JSON.
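For context: the annotation value is a str()-serialized Python dict, which uses single quotes, valid Python syntax but not valid JSON. A sketch with a hypothetical record value:

```python
import ast
import json

# Hypothetical annotation value: str() of a dict gives single quotes
record = "{'holderIdentity': 'candidate-1', 'leaseDurationSeconds': 17}"

print(ast.literal_eval(record)['holderIdentity'])  # candidate-1

try:
    json.loads(record)
except json.JSONDecodeError:
    print("json.loads rejects the single-quoted string")
```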
Hi @roycaihw, thanks for your review. Based on your comments I've pushed the suggested changes.
leaderelection/example.py
Outdated
from datetime import timedelta

# Authenticate using config file
config.load_kube_config(config_file=r"D:\Kubernetes open source - Python client\Go example\kubeconfig.txt")
usually the default config file is located at ~/.kube/config
sorry, I forgot to leave that out. Updated now.
# The default callback function will be used if a callback is not provided
config = electionconfig.Config(ConfigMapLock(lock_name, lock_namespace, candidate_id), lease_duration=17, renew_deadline=15, retry_period=5, onstarted_leading=example_func, onstopped_leading=None)

leaderelection.LeaderElection(config).run()
Usually more than one instance is run to see how leader election works. A simple README would be helpful.
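A sketch of what that looks like: run the same script in several terminals, each process generating its own candidate id. The import paths assume this PR's layout and the uuid-based id is an assumption, not the PR's exact code:

```python
import uuid

from kubernetes import config
from leaderelection import electionconfig, leaderelection
from leaderelection.configmaplock import ConfigMapLock

config.load_kube_config()  # reads ~/.kube/config by default

# A unique identity per process, so several terminals can compete
candidate_id = uuid.uuid4()
lock = ConfigMapLock("example-lock", "default", candidate_id)

def example_func():
    print("{} is leading".format(candidate_id))

election_config = electionconfig.Config(
    lock, lease_duration=17, renew_deadline=15, retry_period=5,
    onstarted_leading=example_func, onstopped_leading=None)

# Blocks; start this script in multiple terminals to watch the election
leaderelection.LeaderElection(election_config).run()
```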
leaderelection/leaderelection.py
Outdated
lock_status, lock_response = self.election_config.lock.get(self.election_config.lock.name, self.election_config.lock.namespace)

# create a default Election record for this candidate
self.leader_election_record = self.leaderelector_record(self.election_config)
why creating it here instead of in the 'else:' below? it is not used in the "if lock_status:"
Thank you for pointing it out. I'll update it in the following commit.
leaderelection/leaderelection.py
Outdated
# If a lock is already created with that name
if lock_status:
    print(self.election_config.lock.identity, "is a follower")
better to use 'logging' instead of 'print'
leaderelection/threadingwithtrace.py
Outdated
# citing source: https://www.geeksforgeeks.org/python-different-ways-to-kill-a-thread/
class thread_with_trace(threading.Thread):
it would be simpler to use https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor than working with threads directly.
It was added in Python 3, but I guess that is OK since Python 2 is deprecated anyway; you don't have to support Python 2.
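A minimal sketch of the suggested ThreadPoolExecutor approach (the callback is illustrative, not the PR's code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def on_started_leading():
    time.sleep(1)  # placeholder for the leader's work
    return "done leading"

# A pool with a single worker behaves like one managed thread,
# and the future gives completion/result handling for free.
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(on_started_leading)
    print(future.result())  # blocks until the callback returns
```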
leaderelection/leaderelection.py
Outdated
# Make sure thread that runs onstopped_leading callback is stopped
if self.onstopped_leadingthread:
    self.onstopped_leadingthread.stop()
    self.onstopped_leadingthread.join()
I don't think you need to 'stop', then 'join', i.e., wait for it to stop.
It's better to implement a simpler leaderelection:
- right at the start, try to acquire the lease to become the leader
- if not yet a leader, periodically check if it can acquire the lease to become leader
- if it becomes the leader, call the hook on started_leading
- try to maintain the leadership by renewing the lease
- if it fails to renew the lease, call the hook on_stopped_leading
- done with the leaderelection, return

NOTE: the lifecycle of one leaderelection run has two possibilities:
a) it always blocks, waiting to be a leader, but never succeeds
b) it becomes a leader, leads for some time, then stops leading and returns

It is simpler because during one leaderelection run, it can be a leader at most once. If it becomes a leader and then somehow loses the leadership, the whole leaderelection returns. It leaves it to the caller of the leaderelection to decide what to do after that: the caller may exit completely, or may choose to run another leaderelection. (A sketch of this lifecycle follows below.)
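A minimal sketch of that single-run lifecycle, assuming a config object with the retry_period, onstarted_leading, and onstopped_leading fields from this PR (the helper names are illustrative):

```python
import threading
import time

def run_leader_election(config, try_acquire_or_renew):
    """One election run: become the leader at most once, then return."""
    # Block until this candidate acquires the lease
    while not try_acquire_or_renew():
        time.sleep(config.retry_period)

    # Became the leader: run the hook without blocking lease renewal
    threading.Thread(target=config.onstarted_leading, daemon=True).start()

    # Maintain leadership; give up on the first failed renewal
    while try_acquire_or_renew():
        time.sleep(config.retry_period)

    # Lost the lease: notify and return; the caller decides what's next
    if config.onstopped_leading:
        config.onstopped_leading()
```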
Thank you for pointing this out. I should have removed this piece of code when I updated the code for a leader to not follow after losing leadership and exit. Other than that, the leader election logic is identical to what you've described. I'll also make sure that the program lets the user decide whether to exit or run for election again. I'll handle these changes in the next commit.
leaderelection/leaderelection.py
Outdated
# Point of entry to Leader election
def run(self):
    # Try to create/ acquire a lock
    self.try_acquire_or_renew()
better to check out the java implementation, https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java
It uses only three threads, or more precisely, three threadpools, but each pool has only a single thread.
from kubernetes.client.api_client import ApiClient


class ConfigMapLock:
maybe not now, but unittests need to be added before merging the PR
leaderelection/leaderelection.py
Outdated
self.leaderfunction_thread = None

# onstopped_leadingthread contains the thread object for onstopped_leading
self.onstopped_leadingthread = None
on_stopped_leading should not be run in a separate thread
Hi @roycaihw @yliaog,
leaderelection/leaderelection.py
Outdated
# updatelease_schedulerId variable stores the scheduler object id for the update_lease schedule that is repeated
# every retry_period seconds by the leader
self.updatelease_schedulerId = None
is this unused?
leaderelection/leaderelection.py
Outdated
scheduler_leader = sched.scheduler(time.time, time.sleep)
self.lead(scheduler_leader)

def transition_follower_to_leader(self):
we don't need to distinguish transition_follower_to_leader and lead, same for follow and try_acquire. This complexity might be useful if we wanted to convert a leader back to a follower.
leaderelection/leaderelection.py
Outdated
# keep checking for lease updates every retry_period seconds
self.followerlease_checkscheduler = scheduler.enter(int(self.election_config.retry_period), 1, self.check_lease_updates, (scheduler,))
previously I thought the scheduler was like the wait.Until method that client-go uses, where the scheduling and the real logic are isolated. I found this implementation (scheduling the real logic recursively) hard to follow. What's the benefit against using a simple while loop + sleep like the java client?
I don't think it has a benefit over the other. Would you suggest using a while + sleep?
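For comparison, a wait.Until-style helper in Python reduces to a plain loop plus sleep (a sketch; the names are illustrative):

```python
import time

def until(fn, period, should_stop=lambda: False):
    # Separate the scheduling from the real logic, like client-go's
    # wait.Until: run fn every `period` seconds until told to stop.
    while not should_stop():
        fn()
        time.sleep(period)

# e.g. until(check_lease_updates, election_config.retry_period)
```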
leaderelection/README.md
Outdated
### Command to run
```python example.py```

Now kill the existing leader. You will see from the terminal outputs that one of the remaining two processes will be elected as the new leader.
are you assuming 3 in total? (it says the remaining two processes)
Thanks for pointing it out. I'll update it.
leaderelection/electionconfig.py
Outdated
# Default callback for when the current candidate, if it is the leader, stops leading
def on_stoppedleading_callback(self):
    print(self.lock.identity, "stopped leading")
use logging?
:return: 'True, None' if object is created else 'False, error' if failed
"""
body = client.V1ConfigMap(
    metadata={"name": name, "annotations": {self.leader_electionrecord_annotationkey: str(election_record)}})  # V1ConfigMap | Name is a necessary metadata for a configmap object
not sure why the comment is at the end of the line? name is required for any k8s object; it is not special to configmap
leaderelection/leaderelection.py
Outdated
# If a lock is already created with that name
if lock_status:
    logging.info("{} is a follower".format(self.election_config.lock.identity))
    scheduler_follower = sched.scheduler(time.time, time.sleep)
there are a total of 4 of these: sched.scheduler(time.time, time.sleep) in the code. would one scheduler be sufficient? would it be better to create just one in the class init?
> there are a total of 4 of these: sched.scheduler(time.time, time.sleep) in the code. would one scheduler be sufficient? would it be better to create just one in the class init?

Yes, one would do, I'll update it. @roycaihw had a different opinion about using the sched module. Do you have similar thoughts about it?
#206 (comment)
Yes, I agree with his comment. I think it's easier to understand to have the program structure like this:
- try to acquire lease to be leader in main thread
- if get lease, then run lead in a separate thread
- keep renewing lease in main thread
- if failed renewing lease, then stop leading in main thread
Codecov Report
@@ Coverage Diff @@
## master #206 +/- ##
=======================================
Coverage 92.37% 92.37%
=======================================
Files 13 13
Lines 1613 1613
=======================================
Hits 1490 1490
Misses 123 123
Continue to review full report at Codecov.
leaderelection/leaderelection.py
Outdated
# If a lock is already created with that name
if lock_status:
    old_election_record = ast.literal_eval(lock_response.metadata.annotations[self.election_config.lock.leader_electionrecord_annotationkey])
lock_response needs to be validated, i.e. it may not have annotations, the annotation may not have the key
also ast.literal_eval could throw an exception, better to catch it
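A sketch of the suggested validation, with a hypothetical helper name: annotations may be missing entirely, the key may be absent, and ast.literal_eval can raise on malformed input:

```python
import ast

def parse_election_record(lock_response, annotation_key):
    # The object may have no annotations at all, and the annotation
    # key may be missing even when annotations exist.
    annotations = lock_response.metadata.annotations or {}
    raw_record = annotations.get(annotation_key)
    if raw_record is None:
        return None
    try:
        return ast.literal_eval(raw_record)
    except (ValueError, SyntaxError):
        # Malformed record: treat it as absent instead of crashing
        return None
```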
leaderelection/leaderelection.py
Outdated
# If This candidate is not the leader and lease duration is yet to finish
if str(self.election_config.lock.identity) != self.observed_record['holderIdentity'] and self.observed_time_milliseconds + self.election_config.lease_duration*1000 > int(time.time()*1000):
self.observed_time_milliseconds is from this candidate, but it should be from the election record, no?
also self.observed_time_milliseconds is updated to the current time at line 117, so it will always be the current time, and the lease duration will never expire
> self.observed_time_milliseconds is from this candidate, but it should be from the election record, no?

No, the java and go clients also keep a local record of 'observed_time_milliseconds' and do not refer to the election record:
https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L269-L315

> also self.observed_time_milliseconds is updated to the current time at line 117, so it will always be the current time, and the lease duration will never expire

It is only updated if a follower identifies that the lock object has been updated by a leader:

if old_election_record != self.observed_record:
    # Update self.observed_time_milliseconds & self.observed_record

In case a leader fails to update the lock, self.observed_time_milliseconds will not be updated to the current time, and after a period of 'leaseDuration' a follower will try to update the lock.
https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L267-L279
ok, the "> int(time.time()*1000):" in the above line is incorrect. time.time() would return the current time, what should be used instead is the time at the start of the function try_acquire_or_renew (https://github.com/kubernetes-client/java/blob/6a2a60ad2ad75fb127874797ea910b90e4a80651/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L236)
leaderelection/leaderelection.py
Outdated
# If this candidate is the Leader
if str(self.election_config.lock.identity) == self.observed_record['holderIdentity']:
    # Leader sets acquireTime
    leader_election_record['acquireTime'] = self.observed_record['acquireTime']
self.observed_record['acquireTime'] is not updated, it is still the old_election_record's acquireTime
this line is correct I think. It prevents leader_election_record['acquireTime'] from being "now"
OK. I was thinking about setting the 'acquireTime' at the time when the lease is acquired; currently, the 'acquireTime' is set at line 104: leader_election_record = self.leaderelector_record(self.election_config).
It probably does not matter though, as there is no wait or loop inside the function try_acquire_or_renew.
leaderelection/leaderelection.py
Outdated
    return True

# A lock is not created with that name, try to create one
else:
we should check the api exception. We can create a lock only if it's a 404
leaderelection/leaderelection.py
Outdated
leader_election_record = self.leaderelector_record(self.election_config)

# If a lock is already created with that name
if lock_status:
reverse the condition to have less indentation in the code
leaderelection/leaderelection.py
Outdated
if old_election_record != self.observed_record:
    self.observed_record = old_election_record
    self.observed_time_milliseconds = int(time.time()*1000)
capture the current time (e.g. now = time.time()) at the beginning of this function, and keep using now, instead of calling time.time() multiple times.
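A sketch of that pattern with illustrative parameters: capture "now" once at entry and reuse it for the expiry check:

```python
import time

def lease_expired(observed_time_ms, lease_duration_seconds, now_ms):
    return observed_time_ms + lease_duration_seconds * 1000 <= now_ms

def try_acquire_or_renew(election_config, observed_record, observed_time_ms):
    # Capture "now" once, at entry, so every check in this attempt
    # agrees on the same instant.
    now_ms = int(time.time() * 1000)

    is_holder = (str(election_config.lock.identity)
                 == observed_record['holderIdentity'])
    if not is_holder and not lease_expired(
            observed_time_ms, election_config.lease_duration, now_ms):
        # Another candidate holds an unexpired lease; back off
        return False
    # ... acquire or renew the lease using the same now_ms ...
    return True
```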
leaderelection/leaderelection.py
Outdated
# Update object with latest election record
lock_response.metadata.annotations[self.election_config.lock.leader_electionrecord_annotationkey] = str(leader_election_record)
update_status, update_response = self.election_config.lock.update(self.election_config.lock.name, self.election_config.lock.namespace, lock_response)
this is not generalized for different types of locks. Better to make leader_election_record a class or a dict, and pass it directly to the lock methods
the leaderelection logic doesn't need to know about LeaderElectionRecordAnnotationKey. It's an implementation detail for the lock.
leaderelection/leaderelection.py
Outdated
# If this candidate is the Leader
if str(self.election_config.lock.identity) == self.observed_record['holderIdentity']:
    # Leader sets acquireTime
nit: "Leader updates renewTime, but keeps acquireTime unchanged" may be more accurate here
    election_record=leader_election_record)

if create_status is False:
    return False
Please add a log for the create failure
leaderelection/leaderelection.py
Outdated
return self.update_lock(lock_response, leader_election_record)

def update_lock(self, lock_response, leader_election_record):
lock_response can be a field stored in the lock, since it was returned from lock.get() and never changed before we passed it back to lock.update(). See client-go as an example.
leaderelection/leaderelection.py
Outdated
# A lock exists with that name
# Validate lock_record
if lock_record is None:
    # try to update lock with proper annotation and election record
    return self.update_lock(lock_response, leader_election_record)

# check for any key, value errors in the record
try:
    old_election_record = ast.literal_eval(lock_record)
    if (old_election_record['holderIdentity'] == '' or old_election_record['leaseDurationSeconds'] == ''
            or old_election_record['acquireTime'] == '' or old_election_record['renewTime'] == ''):
        # try to update lock with proper annotation and election record
        return self.update_lock(lock_response, leader_election_record)
except:
    # try to update lock with proper annotation and election record
    return self.update_lock(lock_response, leader_election_record)
I think it's better to assume lock_record is a class object, instead of a string here. Each lock type can have its own unmarshalling implementation (e.g. ConfigMap and Endpoints unmarshal their annotations, while Lease unmarshals its spec).
This will also allow the short-circuit logic here to be as simple as client-java.
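A sketch of the suggested typed record; the field names follow the annotation keys used in this PR, but the class itself is illustrative, not the PR's final code:

```python
class LeaderElectionRecord:
    """Typed election record; each lock type unmarshals into this."""

    def __init__(self, holder_identity, lease_duration_seconds,
                 acquire_time, renew_time):
        self.holder_identity = holder_identity
        self.lease_duration_seconds = lease_duration_seconds
        self.acquire_time = acquire_time
        self.renew_time = renew_time

    def is_complete(self):
        # The short-circuit check: every field present and non-empty
        return all([self.holder_identity, self.lease_duration_seconds,
                    self.acquire_time, self.renew_time])
```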
Thanks for generalizing the leader_election_record. Overall the implementation looks good to me. Please add a test.
leaderelection/leaderelection.py
Outdated
# A lock is not created with that name, try to create one
if not lock_status:
    if ast.literal_eval(lock_response.body)['code'] != 404:
        logging.info("Error retrieving resource lock {} as {}".format(self.election_config.lock.name, lock_response.reason))
    if json.loads(old_election_record.body)['code'] != 404:
please avoid magic constant 404, instead use NOT_FOUND
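For example, Python's http module already provides the named constant; a sketch with a hypothetical helper:

```python
import json
from http import HTTPStatus

def is_not_found(api_response_body):
    # Compare against the named constant instead of the literal 404;
    # HTTPStatus.NOT_FOUND is an IntEnum and compares equal to 404.
    return json.loads(api_response_body)['code'] == HTTPStatus.NOT_FOUND
```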
leaderelection/leaderelection.py
Outdated
logging.info("{} successfully acquired lease".format(self.election_config.lock.identity))

# Start leading and call OnStartedLeading()
threading.Thread(target=self.election_config.onstarted_leading, daemon=True).start()
please add some documentation to the class to describe the threading behavior, i.e. the thread runs until onstarted_leading returns (if it ever does), it's not safe to run this leader election more than once in a process, etc.
Codecov Report
@@ Coverage Diff @@
## master #206 +/- ##
==========================================
- Coverage 92.37% 92.29% -0.09%
==========================================
Files 13 13
Lines 1613 1635 +22
==========================================
+ Hits 1490 1509 +19
- Misses 123 126 +3
Continue to review full report at Codecov.
@yliaog @roycaihw In my latest push I've added the tests, but the Travis CI report is indicating a few failures. Any idea why some of these tests are failing?
#222 fixed the configmap test failure
    onstopped_leading=on_stopped_leading_A)

# Enter leader election
leaderelection.LeaderElection(config_A).run()
Since run() blocks until the leader election ends, B won't start until A hits renew_count_max. Could you make the two clients run in parallel to make sure the leader election handles concurrency properly?
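A sketch of running the two candidates concurrently, assuming a config_B built analogously to this test's config_A:

```python
import threading

# Run both candidates at the same time so the election actually has
# to handle contention between two live clients.
thread_a = threading.Thread(target=leaderelection.LeaderElection(config_A).run)
thread_b = threading.Thread(target=leaderelection.LeaderElection(config_B).run)
thread_a.start()
thread_b.start()
thread_a.join()
thread_b.join()
```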
"try update record", | ||
"update record", | ||
"try update record", | ||
"try update record"]) |
I had some trouble understanding what this test (test_Leader_election_with_renew_deadline) does. Could you add some comments to explain how it's related to renew_deadline and what's the expected behavior? Thanks
Thanks @roycaihw, sure I'll add some comments. The expected behavior is to check if the leader stops leading if it fails to update the lock within the renew_deadline.
on update: zzz s
on try update: 3s
on update: zzz s
on try update: 4.5s
Before this try, the timeout was set to be 4.5 + renew_deadline = 6.5s. After two failed tries (w/ sleep), the leader exited at 4.5+1.5*2=7.5s.
LGTM. Please squash the commits into reasonable parts.
self.name = name
self.namespace = namespace
self.identity = str(identity)
self.lock = threading.RLock()
is this lock per MockResourceLock? I'd expect it to be shared between all MockResourceLocks in one test
No, it should be shared. I'll update it. Thanks!
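A sketch of the shared-lock fix: a module-level RLock shared by all instances (illustrative, not the test's final code):

```python
import threading

# One RLock shared by every MockResourceLock instance in a test
shared_lock = threading.RLock()

class MockResourceLock:
    def __init__(self, name, namespace, identity):
        self.name = name
        self.namespace = namespace
        self.identity = str(identity)
        self.lock = shared_lock  # shared, not per-instance
```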
changed file naming style consistent with the existing go client code
Update example.py
Changed file and folder names
Rename LeaderElection.py to leaderelection.py
Rename threadingWithException.py to threadingwithexception.py
Rename ConfigMapLock.py to configmaplock.py
LeaderElection to leaderelection
Added boiler plate headers, updated variable and function names consistent with the guidelines, removed the ctypes dependency by using traces to kill threads, changed logic for leader: now it gives up and doesn't re-join as a follower if it fails to update lease
added correct boiler plate year
Rename threadingWithTrace.py to threadingwithtrace.py
Update leaderelection.py
Update example.py
Changes based on review - logging, OnStoppedLeading is not killed abruptly, OnStartedLeading is not run in a separate thread, adding README
Update example.py
updated comments
set threads as daemon
Update README.md
Code made consistent with other clients.
Update example.py
Update leaderelection.py
Error & exception handling for the annotation, reduced indentation
Adding serializing functions for serializing & de-serializing locks, leader_election_record as a class
Adding a test
Adding boilerplate header
Rename leaderelectiontest.py to leaderelection_test.py
Updated boiler plates
handling imports for pytest
handling 'HTTP not found'
compatibility with python 2 & 3, & handling relative imports
Update leaderelection.py to check tests for tox
assertEquals -> assertEqual
Update leaderelection_test.py
making Threading compatible for Python 2
changing datetime.timestamp for backward compatibility with Python 2.7
Adding comments for test_Leader_election_with_renew_deadline & making candidates run in parallel for test_leader_election
remove redundant daemon = True reassignment
common thread lock for MockResourceLock
/lgtm Thanks!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Invictus17, roycaihw
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Leader Election issue #434
kubernetes-client/python#434