This repository has been archived by the owner on Oct 7, 2020. It is now read-only.

[kiali] use kiali operator to install and manage kiali #556

Closed
wants to merge 7 commits

Conversation

jmazzitelli
Member

@jmazzitelli jmazzitelli commented Nov 15, 2019

(work on this PR is on hold until the monorepo PR is merged, then we can merge the work done here and in PR istio/operator#622)

This should supersede the PoC PR #295 (that one is not complete, and I'm not sure it will work the way we want).

The gist of what this PR is trying to do:

If --set Values.kiali.operator.enabled=true, the operator is deployed, in which case the following should happen:

  1. Create the kiali-operator namespace - this is where the operator will go.
  2. Deploy all the necessary operator resources (Deployment, ClusterRole, ServiceAccount, etc.) in the kiali-operator namespace. Included here are the two CRDs the Kiali operator needs. The operator Deployment will have a WATCH_NAMESPACE of empty string "" thus allowing this one operator to install Kiali anywhere it finds a Kiali CR (this will support multi-tenancy if/when Istio supports it in the future). A sketch of the relevant part of the Deployment follows this list.
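
A minimal sketch of that operator Deployment (the resource names, service account, container name, and WATCH_NAMESPACE value come from this PR; the image tag is a placeholder):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kiali-operator
  namespace: kiali-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kiali-operator
  template:
    metadata:
      labels:
        app: kiali-operator
    spec:
      serviceAccountName: kiali-operator-service-account
      containers:
        - name: runner
          image: quay.io/kiali/kiali-operator:v1.x   # placeholder tag
          env:
            - name: WATCH_NAMESPACE
              value: ""   # empty string => watch for Kiali CRs in all namespaces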

If --set Values.kiali.enabled=true, Kiali should be deployed, in which case the following should happen (note that you can still install Kiali if the operator is already running: just pass --set Values.kiali.operator.enabled=false so istioctl doesn't try to install another one):

  1. Create a demo secret in the istio-system namespace, if configured to do so.
  2. Create a Kiali CR in the istio-system namespace, if configured to do so (this will trigger the operator to install Kiali in istio-system). A sketch of both resources follows this list.
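
A minimal sketch of those two resources (the secret's username/passphrase keys follow Kiali's convention; the credential values are demo-only placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: kiali
  namespace: istio-system
type: Opaque
stringData:
  username: admin       # demo-only placeholder
  passphrase: admin     # demo-only placeholder
---
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: istio-system   # creating this CR triggers the operator to install Kiali here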

Note that this implementation lets a user wholly configure the Kiali CR within values.yaml by specifying the kiali.fullSpec value. This allows a user to configure any setting within Kiali - see https://github.com/kiali/kiali/blob/master/operator/deploy/kiali/kiali_cr.yaml for all the available settings.
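
For example, a sketch of what that looks like in values.yaml (the setting names come from the kiali_cr.yaml linked above; the values are illustrative):

kiali:
  enabled: true
  fullSpec:
    istio_namespace: istio-system
    auth:
      strategy: login
    deployment:
      verbose_mode: "3"

Everything under fullSpec is copied verbatim into the Kiali CR's spec by the kialicr.yaml template.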

There is a problem with this PR, though. When I try to test it by passing in --set installPackagePath=/git/istio.io/installer (so it picks up my local changes), for some reason the kiali-operator namespace (the namespace where the kiali operator resources are to be deployed) gets immediately pruned after it is created. The --verbose --logtostderr output has this in it:

2019-11-15T21:34:19.584757Z   info
Component Kiali failed install:
===============================
2019-11-15T21:34:19.584767Z   info  Error: error running kubectl: exit status 1

2019-11-15T21:34:19.584777Z   info  Error detail:



Error from server (Forbidden): error when creating "STDIN": serviceaccounts "kiali-operator-service-account" is forbidden: unable to create new content in namespace kiali-operator because it is being terminated (repeated 1 times)
the namespace from the provided object "istio-system" does not match the namespace "kiali-operator". You must pass '--namespace=istio-system' to perform this operation. (repeated 2 times)
Error from server (Forbidden): error when creating "STDIN": deployments.apps "kiali-operator" is forbidden: unable to create new content in namespace kiali-operator because it is being terminated (repeated 1 times)


2019-11-15T21:34:19.584785Z   info
namespace/kiali-operator created

customresourcedefinition.apiextensions.k8s.io/kialis.kiali.io created
customresourcedefinition.apiextensions.k8s.io/monitoringdashboards.monitoring.kiali.io created
namespace/kiali-operator pruned

clusterrole.rbac.authorization.k8s.io/kiali-operator created
clusterrolebinding.rbac.authorization.k8s.io/kiali-operator created

I don't understand why namespace/kiali-operator pruned is happening.

There are a couple of questions that need to be answered before we can use this PR:

  1. Why is the kiali-operator namespace immediately deleted after it is created?
  2. Is it OK for the CRDs to be included here, or do we need to deploy them in the base package with the rest of the CRDs? The CRDs support the Kiali CR kind as well as the monitoring dashboards (an optional feature of Kiali).

@jmazzitelli jmazzitelli requested a review from a team as a code owner November 15, 2019 22:04
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Nov 15, 2019
@istio-testing istio-testing added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 15, 2019
@jmazzitelli jmazzitelli added the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Nov 15, 2019
@jmazzitelli jmazzitelli mentioned this pull request Nov 16, 2019
Member

@howardjohn howardjohn left a comment


Looks good from an install standpoint; we will need to get feedback from people familiar with the operator as well, to help solve the issues you ran into.

I think the CRDs probably should be in the base, but I would wait for confirmation.

Thanks for working on this!

Is the expectation that people will use fullSpec and the old stuff is around for backwards compatibility, or as a "break glass" mechanism?

ansible.operator-sdk/reconcile-period: "0s"
spec:
{{- if .Values.kiali.fullSpec }}
{{ toYaml .Values.kiali.fullSpec | indent 2 }}
Member


this is a nice implementation

@jmazzitelli
Member Author

Is the expectation that people will use fullSpec and the old stuff is around for backwards compatibility, or as a "break glass" mechanism?

I was considering fullSpec to be the way to do this moving forward (the old stuff can be kept around for backwards compat for now, but I would deprecate it for removal in a future Istio release). However, if people feel they want the old stuff to remain as the recommended settings, this fullSpec can be considered a break-glass mechanism.

But I think it would be easiest if we just recommend people use "fullSpec" and put the Kiali CR yaml right there (thus eventually making the kialicr.yaml template very small and simple). Otherwise, we are forced to duplicate settings in values.yaml and pass them individually to the kialicr.yaml template (as this PR does in order to maintain backward compatibility with the settings we have today - making kialicr.yaml more complex). That becomes a maintenance problem because values.yaml has to track and keep up to date with any settings the Kiali CR adds or changes. This is why I think we should recommend that users put the Kiali CR yaml directly in values.yaml via fullSpec.

@howardjohn
Member

100% agree with you there. Would it make sense to just call this spec then, making it more like other k8s objects? Not a big deal but feels a bit more natural.

@jmazzitelli
Member Author

jmazzitelli commented Nov 17, 2019

Would it make sense to just call this spec then, making it more like other k8s objects? Not a big deal but feels a bit more natural.

We could do that, yes. No reason why it has to be called fullSpec. I agree that spec would be more natural.

I do want to bring up one issue that may be a problem with this fullSpec setting (to be renamed spec).

Because this fullSpec is found inside values.yaml, no {{ ... }} templates are allowed in it. For most of the settings, this is fine. However, some values require data that comes from the .Values.global settings or from .Release values.

For example, spec.deployment.namespace -- we need to specify the namespace where Kiali is to be deployed, and that is {{ .Release.Namespace }}. (Side note: the problem with this particular setting will actually go away when kiali/kiali#1944 (PR kiali/kiali#1945) is implemented, because we can then leave deployment.namespace undefined in the values.yaml fullSpec, and Kiali will simply be installed where the Kiali CR is stored, which the kialicr.yaml template defines using {{ .Release.Namespace }}.)

But there are a few other examples for which I do not have an answer, such as Kiali needing to be told where certain components live via the istio_component_namespaces setting:

istio_component_namespaces:
  grafana: "{{ .Values.global.telemetryNamespace }}"
  pilot: "{{ .Values.global.configNamespace }}"
  prometheus: "{{ .Values.global.prometheusNamespace }}"
  tracing: "{{ .Values.global.telemetryNamespace }}"

The other place where this is a problem is where we need to specify the affinity settings:

affinity:
  {{- include "nodeaffinity" . | indent 6 }}
  {{- include "podAntiAffinity" . | indent 6 }}

I do not believe we can ship with a values.yaml that only has fullSpec, because there is no way to ship a values.yaml that defines these settings -- as you can see, they require values coming from the .Values.global settings. Perhaps there are some Helm templating tricks we can use?
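
One possibility (a sketch I have not verified against this PR): Helm's built-in tpl function can render template expressions that are stored as plain strings inside values.yaml, so the kialicr.yaml template could pass fullSpec through tpl before emitting it:

# kialicr.yaml (sketch): render any "{{ ... }}" strings embedded in the
# fullSpec values against the chart's root context before emitting them.
spec:
{{ tpl (toYaml .Values.kiali.fullSpec) . | indent 2 }}

With that, a values.yaml entry like grafana: "{{ .Values.global.telemetryNamespace }}" would be resolved at render time; since tpl renders with the chart's full template context, it might even handle the nodeaffinity/podAntiAffinity includes, though I have not tried that.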

I believe this kind of problem can be worked around once this Helm feature is merged and available: helm/helm#6876

@howardjohn
Member

We have a similar problem with the injection template. Since that is rendered by the sidecar injector rather than Helm, we added a render function. I am not sure if we can somehow do something similar here.

It's possible we don't need to support direct helm installation, and maybe the operator could add this functionality, but I think it's just calling out to helm anyway, so maybe not.

I'll think about it some more; maybe there's another way around it.

@jmazzitelli
Member Author

  • Why is the kiali-operator namespace immediately deleted after it is created?

This most likely has to do with the use of --prune in the kubectl apply command that is executed when helm processes the templates. With --prune, kubectl deletes any labeled live object that is not among the manifests passed to that apply invocation - so if the kiali-operator namespace is created by one component's apply but absent from another's, the second apply will prune it right after it was created.

@jmazzitelli
Member Author

This incorporates the changes for issue kiali/kiali#1903.
There is a PR for that issue (istio/istio#18819), but that one fixes the existing Helm charts used by the istioctl install.
In this PR we add the same support to the Kiali CR.

@istio-testing istio-testing added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 19, 2019
@jmazzitelli
Member Author

Right now, I'm blocked waiting on istio/operator#622 - I do not know all the knobs and dials to tweak to get an istio/operator build completed and pulled in for use with this PR (this PR cannot work on its own - it needs istio/operator changes because we are adding a new kiali-operator component).

Someone should write a document to help new developers on the project get up to speed on how to code in the istio/operator and istio/installer repos - right now, it is a mystery to me how one adds a new component and gets operator changes pulled into the installer for local testing.

@mandarjog
Contributor

@ostromart for operator.

@sdake
Member

sdake commented Nov 25, 2019

Right now, I'm blocked waiting on istio/operator#622 - I do not know all the knobs and dials to tweak to get an istio/operator build completed and pulled in for use with this PR (this PR cannot work on its own - it needs istio/operator changes because we are adding a new kiali-operator component).

Someone should write a document to help new developers on the project get up to speed on how to code in the istio/operator and istio/installer repos - right now, it is a mystery to me how one adds a new component and gets operator changes pulled into the installer for local testing.

@jmazzitelli you're not alone -> istio/istio#19146 (comment)

Member

@sdake sdake left a comment


A few minor problems. Overall really great work! Keep at it. We definitely want to head in the direction of using operators for all of our third-party components and integrating the operator in a similar fashion (or maybe even simpler than this integration).

The perfect integration would be "here is the image, launch it as a pod," with CRs responsible for the rest of the magic. Thoughts for longer-term planning?

istio-telemetry/kiali-operator/templates/clusterrole.yaml (outdated; resolved)
sidecar.istio.io/inject: "false"
scheduler.alpha.kubernetes.io/critical-pod: ""
prometheus.io/scrape: "true"
prometheus.io/port: "8383"
Member


should this port not be 42422?

Member Author


No, it is 8383. This is the metrics port that the base Ansible operator SDK opens to expose its metrics -- see https://github.com/operator-framework/operator-sdk/blob/v0.9.x/pkg/ansible/run.go#L43

So these are operator metrics (the operator itself exposes its own metrics to monitor itself).

See kiali/kiali#1561 (that issue documents a snippet of the metrics that Prometheus will be able to scrape on the operator's port 8383).

name: runner
env:
- name: WATCH_NAMESPACE
value: ""
Member


Does this watch-namespace wildcard watch everything? I'm not sure this is in line with Istio's security policy. @istio/wg-security-maintainers should provide feedback here.

Member Author


This is the WATCH_NAMESPACE setting that allows the Kiali operator to watch all namespaces, i.e. the typical operator InstallMode of "AllNamespaces". It is how you can install a single operator and still install Kiali wherever a user has permission to (thus you need only a single operator in your cluster). This is how we support multi-tenancy (at least it's how it works with Maistra).

You have a single operator (typically installed by a cluster admin). Then you have users who may or may not be cluster admins but who have permission to create a Kiali CR in a namespace where Kiali is to be deployed. This way one tenant owner can put a Kiali in one namespace/control plane, and a second tenant (who has no permission to see the first control plane) can install Kiali in a different namespace/control plane. This is a typical operator use case. See this for a more general description of what is going on: https://medium.com/@jmazzite/kubernetes-operators-a-very-brief-overview-270e75f3dfab
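
For example (a sketch; the tenant namespaces here are hypothetical), two tenants each create a Kiali CR, and the one cluster-wide operator reconciles both:

# Tenant A's Kiali (hypothetical namespace)
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: tenant-a-control-plane
---
# Tenant B's Kiali, reconciled by the same operator (hypothetical namespace)
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: tenant-b-control-plane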

That said, if Istio does not want to support multi-tenancy in this way, we can put the operator in the same namespace as the control plane itself (e.g. istio-system). But it would be kind of weird to put the operator together with Kiali in the control plane namespace, and if you have multiple control planes you would then be required to install multiple operators (when you really shouldn't have to).

istio-telemetry/kiali/values.yaml (resolved)
tolerations: []
replicaCount: 1

createDemoSecret: false # When true, a secret will be created with a default username and password. Useful for demos.
Member


We eventually want to get Istio to the state where everything is true - or turned on by default. As this extends the API, can you give some thought to a better name for line 11 that matches this goal?

Member Author

@jmazzitelli jmazzitelli Nov 25, 2019


This setting was introduced a while ago - @linsun asked for it because, at the time, users installing the demo profile were still required to create a secret, and some people forgot or didn't know and were not able to access Kiali. So it was decided to create a demo secret for the demo profile to avoid that problem. See this issue for background and discussion: istio/istio#11244

So I don't know what we want to do here. We could remove it, and always require the user to create secrets - you can see where we tell people to do this in the Task doc here: https://istio.io/docs/tasks/observability/kiali/#create-a-secret

Or we could have a default authentication strategy that doesn't require a secret to contain credentials. In fact, Kiali does not require a secret by default when installing on OpenShift - the Kiali operator just sets auth.strategy to "openshift", an OAuth mechanism that simply lets users log in with their OpenShift credentials. There is another auth.strategy called "anonymous" which doesn't require any credentials - anyone with the Kiali URL gains access to the Kiali UI (so that also does not require a secret and may be a good alternative for the demo profile).
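
For illustration, a sketch of a Kiali CR using the anonymous strategy (no secret needed):

# Sketch: with auth.strategy "anonymous", anyone with the Kiali URL gets
# access to the UI, so no credentials secret is required (demo use only).
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: istio-system
spec:
  auth:
    strategy: anonymous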

istio-telemetry/kiali/values.yaml (resolved)
@istio-testing istio-testing added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 25, 2019
@jmazzitelli
Member Author

See istio/operator#622 (comment) for the current state of affairs.

@clyang82
Member

@jmazzitelli Please add a make target with the steps to test the kiali operator and kiali. At a minimum, we need to test and ensure everything can be installed successfully.

@clyang82
Member

/retest

@istio-testing

@jmazzitelli: The following tests failed, say /retest to rerun them all:

Test name          Commit   Rerun command
lint_installer     9934cd4  /test lint_installer
build_installer    9934cd4  /test build_installer
noauth_installer   9934cd4  /test noauth_installer
demo_installer     9934cd4  /test demo_installer

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jmazzitelli
Member Author

jmazzitelli commented Dec 20, 2019

This stuff isn't ready yet. Waiting for the monorepo effort to complete before finishing this work.
There is a companion PR that this PR requires - istio/operator#622.
See my comment above (#556 (comment)): "this PR cannot work on its own - it needs istio/operator changes because we are adding a new kiali-operator component"
