Skip to content

Installing csi-proxy in some windows clusters#2012

Merged
k8s-ci-robot merged 1 commit into
kubernetes-sigs:mainfrom
marosset:windows-csi-proxy
Feb 10, 2022
Merged

Installing csi-proxy in some windows clusters#2012
k8s-ci-robot merged 1 commit into
kubernetes-sigs:mainfrom
marosset:windows-csi-proxy

Conversation

@marosset
Copy link
Copy Markdown
Contributor

@marosset marosset commented Jan 26, 2022

Signed-off-by: Mark Rossetti marosset@microsoft.com

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Installs csi-proxy as a daemonset on Windows nodes for clusters that specify `csi-proxy: enabled`

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/provider/azure Issues or PRs related to azure provider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 26, 2022
@jsturtevant
Copy link
Copy Markdown
Contributor

private cluster timed out.
/retest

Copy link
Copy Markdown
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to follow up with a test config after?

I am not 100% sure as I haven't tried this yet but I think it needs to be supported in a few different templates including this one:

export CLUSTER_TEMPLATE="test/ci/cluster-template-prow-external-cloud-provider.yaml"

Comment thread templates/addons/windows/csi-proxy/csi-proxy.yaml Outdated
Comment thread templates/test/ci/patches/windows-csi-proxy-enabled.yaml Outdated
@marosset
Copy link
Copy Markdown
Contributor Author

Is the plan to follow up with a test config after?

Yep

@marosset
Copy link
Copy Markdown
Contributor Author

I am not 100% sure as I haven't tried this yet but I think it needs to be supported in a few different templates including this one:

Thanks, I wasn't sure which all clusters this needed to be enabled for. Hopefully it will be clearer once I start the test-infra/config changes.

@marosset
Copy link
Copy Markdown
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 27, 2022
@marosset
Copy link
Copy Markdown
Contributor Author

/test pull-kubernetes-e2e-capz-azure-disk-windows

It looks like some test jobs for windows + capz + azure-disk were added with kubernetes/test-infra#24065

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@marosset: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-provider-azure-build
  • /test pull-cluster-api-provider-azure-e2e
  • /test pull-cluster-api-provider-azure-e2e-windows-dockershim
  • /test pull-cluster-api-provider-azure-test
  • /test pull-cluster-api-provider-azure-verify

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-provider-azure-apidiff
  • /test pull-cluster-api-provider-azure-apiversion-upgrade
  • /test pull-cluster-api-provider-azure-capi-e2e
  • /test pull-cluster-api-provider-azure-ci-entrypoint
  • /test pull-cluster-api-provider-azure-conformance
  • /test pull-cluster-api-provider-azure-conformance-with-ci-artifacts
  • /test pull-cluster-api-provider-azure-coverage
  • /test pull-cluster-api-provider-azure-e2e-exp
  • /test pull-cluster-api-provider-azure-e2e-full
  • /test pull-cluster-api-provider-azure-e2e-workload-upgrade-1-21-1-22
  • /test pull-cluster-api-provider-azure-e2e-workload-upgrade-1-22-latest-main
  • /test pull-cluster-api-provider-azure-upstream-windows-dockershim
  • /test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
  • /test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts-serial-slow

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-provider-azure-apidiff
  • pull-cluster-api-provider-azure-build
  • pull-cluster-api-provider-azure-ci-entrypoint
  • pull-cluster-api-provider-azure-coverage
  • pull-cluster-api-provider-azure-e2e
  • pull-cluster-api-provider-azure-e2e-windows-dockershim
  • pull-cluster-api-provider-azure-test
  • pull-cluster-api-provider-azure-verify
Details

In response to this:

/test pull-kubernetes-e2e-capz-azure-disk-windows

It looks like some test jobs for windows + capz + azure-disk were added with kubernetes/test-infra#24065

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@marosset
Copy link
Copy Markdown
Contributor Author

I am not 100% sure as I haven't tried this yet but I think it needs to be supported in a few different templates including this one:

Thanks, I wasn't sure which all clusters this needed to be enabled for. Hopefully it will be clearer once I start the test-infra/config changes.

Nevermind, wrong repo :)

/test pull-kubernetes-e2e-capz-azure-disk-windows

It looks like some test jobs for windows + capz + azure-disk were added with kubernetes/test-infra#24065

Nevermind, wrong repo :)

@marosset
Copy link
Copy Markdown
Contributor Author

/wip

@marosset
Copy link
Copy Markdown
Contributor Author

/hold
While I validate the dev-cluster scenario

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2022
resources:
- ../default
- machine-deployment-windows.yaml
- ../../addons/windows/csi-proxy/csi-proxy-resource-set.yaml
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have not been installing Cluster Resource Sets for the default templates, only for the E2E tests. This is mostly because cluster api doesn't include those definitions in what is considered a cluster.

We shouldn't include this in the default template as all addon and other components (such as cni) needs to be managed by the user after the base cluster is up and running.

Since CAPI doesn't deploy CNI's if a customer used this template with a CRS I think they would get errors since they clusterctl generate cluster <template> doesn't include a CNI. I think CRS are also an experimental feature?

The way we are handling this as of today with tilt is to deploy during tilt initialization like:

os.putenv("KUBERNETES_VERSION", settings.get("kubernetes_version", {}))
local(kubectl_cmd + " create configmap flannel-windows-addon --from-file=templates/addons/windows/flannel/ --dry-run=client -o yaml | " + envsubst_cmd + " | " + kubectl_cmd + " apply -f -")
local(kubectl_cmd + " create configmap calico-windows-addon --from-file=templates/addons/windows/calico/ --dry-run=client -o yaml | " + envsubst_cmd + " | " + kubectl_cmd + " apply -f -")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I was able to get things working this way for dev clusters!

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Jan 31, 2022

I was able to manually validate that the windows-containerd and machinepool-windows-containerd workload clusters had csi-proxy running when deployed through tilt (and not deployed for windows or machinepool-windows as expected.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2022
@marosset
Copy link
Copy Markdown
Contributor Author

/hold cancel

@jsturtevant
Copy link
Copy Markdown
Contributor

We aren't running the tests against the csi proxy but they are modifying the templates to add them into the CI cluster templates so we should those tests to make sure it doesn't cause any issues

/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
/test pull-cluster-api-provider-azure-conformance

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Feb 1, 2022
@marosset marosset force-pushed the windows-csi-proxy branch 2 times, most recently from 5ddbbb5 to fda29a8 Compare February 1, 2022 21:43
@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 1, 2022

/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
/test pull-cluster-api-provider-azure-conformance

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 1, 2022

It looks like the azure-disk/azure-file csi jobs are targeting cluster-template-prow.yaml based on
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_azuredisk-csi-driver/1133/pull-kubernetes-e2e-capz-azure-disk-windows/1481582126804504576/build-log.txt

Create workload Cluster.

/home/prow/go/src/sigs.k8s.io/cluster-api-provider-azure/hack/tools/bin/envsubst-v2.0.0-20210730161058-179042472c46 < /home/prow/go/src/sigs.k8s.io/cluster-api-provider-azure/templates/test/ci/cluster-template-prow.yaml

Let me update this PR so csi-proxy is installed for those cluster templates

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 1, 2022

It looks like the azure-disk/azure-file csi jobs are targeting cluster-template-prow.yaml based on https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_azuredisk-csi-driver/1133/pull-kubernetes-e2e-capz-azure-disk-windows/1481582126804504576/build-log.txt

Create workload Cluster.

/home/prow/go/src/sigs.k8s.io/cluster-api-provider-azure/hack/tools/bin/envsubst-v2.0.0-20210730161058-179042472c46 < /home/prow/go/src/sigs.k8s.io/cluster-api-provider-azure/templates/test/ci/cluster-template-prow.yaml

Let me update this PR so csi-proxy is installed for those cluster templates

I made these updates

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 2, 2022

/test pull-cluster-api-provider-azure-ci-entrypoint
Re-triggering test to pick up kubernetes/test-infra#25082

Comment thread Tiltfile
@@ -218,6 +218,8 @@ def create_crs():
local(kubectl_cmd + " delete configmaps calico-ipv6-addon --ignore-not-found=true")
local(kubectl_cmd + " create configmap calico-ipv6-addon --from-file=templates/addons/calico-ipv6.yaml")
local(kubectl_cmd + " delete configmaps flannel-windows-addon --ignore-not-found=true")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated to this PR but @jsturtevant do you know why we are deleting the flannel-windows-addon cm but not creating it?

Copy link
Copy Markdown
Contributor

@jsturtevant jsturtevant Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's created on line https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2012/files#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425R226

though it doesn't look like we delete calico-windows-addon which is added the line after. Not sure if deletion is required... Seems to be working, even on reboots of tilt

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to open another PR to fix that

@marosset marosset force-pushed the windows-csi-proxy branch 2 times, most recently from 393debc to f839338 Compare February 8, 2022 16:57
@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 8, 2022

@jsturtevant
Copy link
Copy Markdown
Contributor

@marosset could you squash the commits? lgtm but I would also like to run the conformance tests since those templates are modified though I don't expect anything.

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 8, 2022

@marosset could you squash the commits? lgtm but I would also like to run the conformance tests since those templates are modified though I don't expect anything.

I added the label to have tide squash the commits for me. Is that sufficient?

@jsturtevant
Copy link
Copy Markdown
Contributor

This has come up in past. PR release notes don't show up in the release notes with this label.

@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 8, 2022

This has come up in past. PR release notes don't show up in the release notes with this label.

Ah ok.
I'll rebase shortly.

Adding csi-proxy to windows-containerd dev template
Moving csi-proxy addon to cluster-template-prow.yaml since that is what azure-disk/file tests use in CI

Signed-off-by: Mark Rossetti <marosset@microsoft.com>
@marosset
Copy link
Copy Markdown
Contributor Author

marosset commented Feb 8, 2022

/remove-label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 8, 2022
@jsturtevant
Copy link
Copy Markdown
Contributor

/lgtm
/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
/test pull-cluster-api-provider-azure-conformance

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2022
Copy link
Copy Markdown
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: devigned

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 9, 2022
@k8s-ci-robot k8s-ci-robot merged commit 6e20db5 into kubernetes-sigs:main Feb 10, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.1 milestone Feb 10, 2022
@marosset marosset deleted the windows-csi-proxy branch February 10, 2022 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants