@abhinavdahiya
Contributor

With the changes to cluster-api-provider-azure from openshift/cluster-api-provider-azure#35, Azure IPI installs fail to create worker VMs, as shown in the machine-controller log:
https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/4006/rehearse-4006-pull-ci-openshift-origin-master-e2e-azure/1/artifacts/e2e-azure/must-gather/namespaces/openshift-machine-api/pods/machine-api-controllers-c95fb75b5-rzjks/machine-controller/machine-controller/logs/current.log

2019-06-07T19:38:06.7703507Z E0607 19:38:06.770317       1 actuator.go:79] failed to reconcile machine ci-op-pqj0zh18-849ce-9765f-worker-qg7fj: failed to create nic ci-op-pqj0zh18-849ce-9765f-worker-qg7fj-nic for machine ci-op-pqj0zh18-849ce-9765f-worker-qg7fj: MachineConfig subnet is missing on machine ci-op-pqj0zh18-849ce-9765f-worker-qg7fj, skipping machine creation

This changes the installer to set the subnet names based on the role of the MachinePool.

/cc @enxebre @ingvagabund

cluster-api-provider-azure PR [1] allows the installer to set the subnets for machines explicitly rather than relying on implicit names.

The new subnet name for control plane VMs is `<cluster-id>-master-subnet`, while the subnet name for compute VMs is `<cluster-id>-worker-subnet`, based on the `role` of these MachinePools.

[1]: openshift/cluster-api-provider-azure#35
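
A minimal sketch of the role-based naming described above, assuming the role is one of `master` or `worker`. The function name `subnetName` and its signature are hypothetical and are not taken from the installer's code:

```go
package main

import "fmt"

// subnetName returns the subnet that a machine pool's VMs attach to,
// following the "<cluster-id>-<role>-subnet" pattern from the commit message.
// Hypothetical helper for illustration only, not the installer's actual API.
func subnetName(clusterID, role string) (string, error) {
	switch role {
	case "master", "worker":
		return fmt.Sprintf("%s-%s-subnet", clusterID, role), nil
	default:
		// Only the two roles named in the commit message are handled here.
		return "", fmt.Errorf("unsupported machine pool role %q", role)
	}
}
```

Setting the name explicitly on every machine is what avoids the "MachineConfig subnet is missing" error shown in the log, since the provider no longer derives an implicit name.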
@abhinavdahiya
Contributor Author

/test e2e-azure

@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 7, 2019
@ingvagabund
Member

/test e2e-azure

@ingvagabund
Member

@abhinavdahiya #35 also removed the pieces that set up load balancers. Was the installer relying on the Azure actuator to set up the load balancers when creating master machine(s)?

@ingvagabund
Member

/retest

@ingvagabund
Member

/test e2e-azure

@ingvagabund
Member

The following repeats in a loop in openshift-machine-api_machine-api-controllers-7d7675558d-sk6r9_machine-controller.log:

I0612 21:08:32.957058       1 controller.go:129] Reconciling Machine "ci-op-qdxjy6fw-2dc90-vcspv-worker-2x6st"
I0612 21:08:32.957089       1 controller.go:298] Machine "ci-op-qdxjy6fw-2dc90-vcspv-worker-2x6st" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0612 21:08:32.957100       1 actuator.go:146] Checking if machine ci-op-qdxjy6fw-2dc90-vcspv-worker-2x6st exists
E0612 21:08:32.957527       1 controller.go:233] Failed to check if machine "ci-op-qdxjy6fw-2dc90-vcspv-worker-2x6st" exists: failed to create scope: Secret "azure-credentials" not found
failed to update cluster
sigs.k8s.io/cluster-api-provider-azure/pkg/cloud/azure/actuators.NewMachineScope
	/go/src/sigs.k8s.io/cluster-api-provider-azure/pkg/cloud/azure/actuators/machine_scope.go:86
sigs.k8s.io/cluster-api-provider-azure/pkg/cloud/azure/actuators/machine.(*Actuator).Exists
	/go/src/sigs.k8s.io/cluster-api-provider-azure/pkg/cloud/azure/actuators/machine/actuator.go:148
sigs.k8s.io/cluster-api-provider-azure/vendor/github.com/openshift/cluster-api/pkg/controller/machine.(*ReconcileMachine).Reconcile
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/github.com/openshift/cluster-api/pkg/controller/machine/controller.go:231
sigs.k8s.io/cluster-api-provider-azure/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210
sigs.k8s.io/cluster-api-provider-azure/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158
sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait.Until
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:2361

@ingvagabund
Member

/test e2e-azure

machine-api-operator restricted the machine controllers to accessing secrets only in the `openshift-machine-api` namespace [1].

machine-api-operator PR [2] creates a secret, `openshift-machine-api/azure-cloud-credentials`, that Azure machines can use.

[1]: openshift/machine-api-operator@df49930
[2]: openshift/machine-api-operator#325
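
A rough sketch of the constraint described above, assuming the machine controller can only read secrets in `openshift-machine-api`. The `secretRef` type and both helper functions are hypothetical stand-ins for illustration, not the provider's or the installer's actual types:

```go
package main

// secretRef is a hypothetical stand-in for a Kubernetes secret reference.
type secretRef struct {
	Name      string
	Namespace string
}

const (
	// Namespace the machine controllers are restricted to, per [1].
	machineAPINamespace = "openshift-machine-api"
	// Secret created by the machine-api-operator PR [2].
	azureCredentialsSecretName = "azure-cloud-credentials"
)

// azureCredentialsSecret returns the reference that Azure machines would use.
func azureCredentialsSecret() secretRef {
	return secretRef{Name: azureCredentialsSecretName, Namespace: machineAPINamespace}
}

// readableByMachineController reports whether a controller limited to
// machineAPINamespace could read the referenced secret (assumed rule).
func readableByMachineController(ref secretRef) bool {
	return ref.Namespace == machineAPINamespace
}
```

Under this assumption, a machine that references a secret the controller cannot find fails at scope creation, which matches the `Secret "azure-credentials" not found` error in the log above.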
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 13, 2019
@ingvagabund
Member

/test e2e-azure

@abhinavdahiya
Contributor Author

Looks like the last Azure run failed where we expected:
Networking should provide Internet connection for containers [Feature:Networking-IPv4] [Suite:openshift/conformance/parallel] [Suite:k8s]

e2e-azure

@ingvagabund
Member

@abhinavdahiya I assume it's a known issue, so we can merge this PR

@abhinavdahiya
Contributor Author

> @abhinavdahiya I assume it's a known issue, so we can merge this PR

Yes, that's a known failure that I'm working on fixing. We can merge this.

@ingvagabund
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 16, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, ingvagabund

@ingvagabund
Member

/test e2e-azure

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 311a8a1 into openshift:master Jun 16, 2019
@openshift-ci-robot
Contributor

@abhinavdahiya: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-azure | b29f9da | link | `/test e2e-azure` |
| ci/prow/e2e-aws-scaleup-rhel7 | b29f9da | link | `/test e2e-aws-scaleup-rhel7` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
