Conversation

@hexfusion
Contributor

@hexfusion hexfusion commented Jul 25, 2019

This PR adds the logic required to scale up etcd during install via openshift-etcd-operator. More details soon.

required by: openshift/cluster-etcd-operator#19

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 25, 2019
@cgwalters
Member

Would be nice to avoid the term "pivot" for this since it's highly associated here with the pivot command used in RHCOS (4.1).

@hexfusion
Contributor Author

hexfusion commented Jul 25, 2019

> Would be nice to avoid the term "pivot" for this since it's highly associated here with the pivot command used in RHCOS (4.1).

not an issue, will change this.

@beekhof beekhof left a comment

Looks sane to me with the possible improvement of the scaling-lock variable naming.

duration := 10 * time.Second
// wait forever for success and retry every duration interval
wait.PollInfinite(duration, func() (bool, error) {
	result, err := client.CoreV1().ConfigMaps("openshift-etcd").Get("scaling-lock", metav1.GetOptions{})

At least in this usage, "scaling-lock" seems misnamed.
Perhaps "known-members"?

Contributor Author

Yeah, this ConfigMap was originally a distributed lock, similar to the leader election process, but I did not find that to be necessary. Will consider renaming.
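
Editorial sketch: the excerpt under review retries a ConfigMap lookup on a fixed interval with no timeout. The standard-library-only example below illustrates that retry shape. The helper name `pollUntil` and the fake lookup are assumptions made for illustration; this is not the `wait.PollInfinite` implementation, and it checks the condition before the first sleep, which the library call may not do.

```go
package main

import (
	"fmt"
	"time"
)

// pollUntil calls condition, sleeps for interval, and repeats until condition
// reports done or returns an error. Hypothetical helper for illustration only.
func pollUntil(interval time.Duration, condition func() (bool, error)) error {
	for {
		done, err := condition()
		if err != nil {
			return err // a non-nil error aborts the loop
		}
		if done {
			return nil
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Stand-in for the ConfigMap lookup: it "succeeds" on the third attempt.
	err := pollUntil(10*time.Millisecond, func() (bool, error) {
		attempts++
		if attempts < 3 {
			fmt.Printf("attempt %d: not ready, retrying\n", attempts)
			return false, nil // false with a nil error keeps polling
		}
		return true, nil
	})
	fmt.Printf("finished after %d attempts, err=%v\n", attempts, err)
}
```

Returning `false, nil` from the condition is what keeps the loop going, so a transient lookup failure that should be retried has to be logged and swallowed rather than returned.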

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: beekhof, hexfusion
To complete the pull request process, please assign ashcrow
You can assign the PR to them by writing /assign @ashcrow in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

}
var e EtcdScaling
if runOpts.pivot {
	clientConfig, err := rest.InClusterConfig()
Contributor

Does this mean that it requires k8s networking to be up to talk to the apiservers? That would be difficult in recovery scenarios.

Contributor Author

This functionality is part of cluster bootstrap and relies on the operator, so networking is assumed. Disaster recovery (DR) would require standing up a single-node etcd cluster and then scaling up additional members via the operator.

Contributor Author

But I will consider how to handle this for that single node. Thanks.
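
Editorial sketch: one way to reason about the recovery concern is to check, before taking the API-dependent path, whether in-cluster configuration is available at all. `rest.InClusterConfig` reads the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables and fails when either is empty. The example below mirrors only that check using the standard library. The `scaleMode` type, its values, and `chooseMode` are hypothetical names, and nothing in the thread says the PR handles the single-node case this way.

```go
package main

import (
	"fmt"
	"os"
)

// scaleMode is a hypothetical label for how a member would join the cluster.
type scaleMode string

const (
	modeOperator   scaleMode = "operator-driven" // wait on the API server for membership
	modeStandalone scaleMode = "standalone"      // start without talking to the API server
)

// inClusterEnvPresent reports whether both service variables consulted by
// in-cluster configuration are set. getenv is injected so it can be tested.
func inClusterEnvPresent(getenv func(string) string) bool {
	return getenv("KUBERNETES_SERVICE_HOST") != "" &&
		getenv("KUBERNETES_SERVICE_PORT") != ""
}

// chooseMode picks the API-dependent path only when the environment suggests
// an API server is reachable through in-cluster configuration.
func chooseMode(getenv func(string) string) scaleMode {
	if inClusterEnvPresent(getenv) {
		return modeOperator
	}
	return modeStandalone
}

func main() {
	fmt.Println("selected mode:", chooseMode(os.Getenv))
}
```

The presence of these variables does not prove that the API server is reachable, so a real guard would still need a timeout or an explicit recovery flag.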

Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment

just a few questions

wait.PollInfinite(duration, func() (bool, error) {
	result, err := client.CoreV1().ConfigMaps("openshift-etcd").Get("scaling-lock", metav1.GetOptions{})
	if err != nil {
		klog.Errorf("error creating client %v", err)
Contributor

wondering why you used klog here?

Contributor Author

I see we already have glog; will drop klog.

	klog.Errorf("could not find self in scaling-lock")
	return false, nil
}
members := e.Members
Contributor

do we need this placeholder var? would using e.Members directly cause an issue?

Contributor Author

Good point, will adjust.

if etcdName == "" {
	return fmt.Errorf("environment variable ETCD_NAME has no value")
}
var e EtcdScaling
Contributor

Could you use a more descriptive name than e? It's hard to remember what it stands for as I read through the code below.

Contributor Author

sounds good.
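
Editorial sketch: the two naming suggestions in this review (a descriptive name instead of `e`, and reading `Members` directly instead of copying it into a placeholder variable) could look roughly like the example below. The diff only shows an `EtcdScaling` value with a `Members` field, so the `Member` type, its `Name` field, and the `findSelf` helper are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Member and EtcdScaling are hypothetical shapes; only the EtcdScaling type
// name and its Members field appear in the reviewed diff.
type Member struct {
	Name string
}

type EtcdScaling struct {
	Members []Member
}

// findSelf looks up etcdName in scaling.Members, reading the field directly
// rather than through an intermediate variable.
func findSelf(scaling EtcdScaling, etcdName string) (Member, error) {
	if etcdName == "" {
		return Member{}, errors.New("environment variable ETCD_NAME has no value")
	}
	for _, member := range scaling.Members {
		if member.Name == etcdName {
			return member, nil
		}
	}
	return Member{}, fmt.Errorf("could not find %q in scaling-lock", etcdName)
}

func main() {
	scaling := EtcdScaling{Members: []Member{{Name: "etcd-0"}, {Name: "etcd-1"}}}
	member, err := findSelf(scaling, "etcd-1")
	fmt.Println(member.Name, err)
}
```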

@openshift-ci-robot
Contributor

@hexfusion: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-disruptive | c611bec | link | /test e2e-aws-disruptive |
| ci/prow/e2e-gcp-op | c611bec | link | /test e2e-gcp-op |
| ci/prow/e2e-aws-upgrade | c611bec | link | /test e2e-aws-upgrade |
| ci/prow/e2e-gcp-upgrade | c611bec | link | /test e2e-gcp-upgrade |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@hexfusion
Contributor Author

/close

Work is continued in #1221

@hexfusion hexfusion closed this Oct 29, 2019

6 participants