
Conversation

@cybertron
Contributor


In single-node virtual environments, the system can get so overloaded
during deployment of the masters that the heartbeat times out, which
causes openshift-metal3#617. This change overrides the heartbeat timeout,
raising it from the default of 60 seconds to 120 seconds. In my experience,
this is sufficient to prevent the timeouts.

Note that this uses the environment driver[0] in oslo.config to set the
value. I don't think we want to change the value in the container config,
since this is primarily a concern for virtual dev environments.

0: https://docs.openstack.org/oslo.config/latest/reference/drivers.html#environment
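
For reference, a minimal sketch of what the override amounts to, assuming oslo.config's documented OS_<GROUP>__<OPTION> naming convention (the podman change further down passes the same variable via --env):

export OS_CONDUCTOR__HEARTBEAT_TIMEOUT=120
# The environment driver reads this as heartbeat_timeout = 120 in the
# [conductor] section, i.e. the same effect as editing ironic.conf.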
securityContext:
privileged: true
image: quay.io/celebdor/keepalived:latest
image: registry.svc.ci.openshift.org/ocp/4.2@sha256:daa9f390c43563b67546cd5b4cf3d8e351c3530f8091f523a73061fa441e8818
Contributor Author


Obviously I need to figure out how to reference this image properly.

Member


You won't be able to do this "properly" in dev-scripts. I expect these manifests will move to MCO, right?

There's a bit of magic here you need to hook into. Assuming MCO is what ends up owning this manifest, the image needs to be listed in this file as well: https://github.com/openshift/machine-config-operator/blob/master/install/image-references

By being listed in image-references for one of the cluster operators, the image will get included in the release payload. There's also magic in the CVO to convert image references in CVO-deployed manifests to use the right image for that release payload. It looks like MCO has a manifest that declares a config map of image references; I would expect keepalived to be added there as well, so you can get the right image from your MCO code: https://github.com/openshift/machine-config-operator/blob/master/install/0000_80_machine-config-operator_02_images.configmap.yaml
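
As a rough illustration of where that ends up at runtime, something like the following should show the image pullspecs MCO was deployed with; the config map name and namespace here are assumptions based on the linked manifest, not something verified in this PR:

# Inspect the image-references config map that MCO deploys
oc -n openshift-machine-config-operator get configmap machine-config-operator-images -o yaml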


Yeah, this will need to go into the MCO PR @bcrochet is working on; when that lands, all this asset mangling will go away (ref #623).

#561 tracks the removal, and the MCO PR is openshift/machine-config-operator#795.

Member


In dev-scripts I suppose you could use an “oc adm release” command to inspect the release image and pull out the pullspec for an image in that release payload. That would just be a very short-term hack, though.
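
Something along these lines, as a sketch of that short-term hack; the release image variable and the "keepalived" tag name inside the payload are assumptions:

# Pull the keepalived pullspec out of the release payload dev-scripts deploys from
KEEPALIVED_IMAGE=$(oc adm release info "$OPENSHIFT_RELEASE_IMAGE" --image-for=keepalived)
echo "$KEEPALIVED_IMAGE"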


sudo podman run -d --net host --privileged --name ironic --pod ironic-pod \
--env MARIADB_PASSWORD=$mariadb_password \
--env OS_CONDUCTOR__HEARTBEAT_TIMEOUT=120 \
Contributor Author


This is unrelated, but I need it for testing locally. Will remove before I push the final version of this change.


vrrp_script chk_dns {
script "host -t SRV _etcd-server-ssl._tcp.${DOMAIN} localhost"
script "/usr/bin/host -t SRV _etcd-server-ssl._tcp.${DOMAIN} localhost"
Collaborator


This reminds me that we need to drop DNS VIP.

@bcrochet
Contributor

bcrochet commented Aug 5, 2019

This is obsolete and should be closed. @cybertron


@cybertron
Contributor Author

No longer needed.

@cybertron closed this Aug 7, 2019
