Conversation

@elevran (Collaborator) commented Jan 28, 2024

Detailed design document


- **Fabric administrator** has control over the entire ClusterLink fabric. They are responsible
for creating the fabric's root of trust, controlling which peers are part of the fabric, etc.
This role is mostly not in scope for this design document.
Collaborator commented:

We probably need a document that describes the different roles.

Collaborator (Author) replied:

Agreed. This is the minimal work needed to define their existence, and I think it is good enough for now (informal, but it conveys the intent).
Once we have more experience operating the ClusterLink system, we can revisit and formalize.

status conditions are suggested. Controllers should use the following condition types
to signal whether a resource has been reconciled, or whether it encountered any problems:

- `Reconciling`: Indicates that the resource's actual state does not yet match its desired state.
Collaborator commented:

I am not sure `Reconciling` is the best word; Deployments, for example, use `Progressing`.

Collaborator (Author) replied:

The controller is "reconciling" the actual state with the intended state in the CRD, so I take `Reconciling` to mean that reconciliation is in progress.
Open to other suggestions. This was meant to be generic so it can be kept consistent across different objects. I prefer not to use CR-specific conditions if the generic ones are clear and accurate (e.g., the `message` field can be used to capture controller- or CR-specific knowledge).
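A minimal sketch of how such a generic `Reconciling` condition could be computed. The `Condition` type below is a self-contained stand-in that mirrors the shape of Kubernetes `metav1.Condition`, and `reconcilingCondition` is a hypothetical helper; neither is actual ClusterLink code:

```go
package main

import (
	"fmt"
	"time"
)

// Condition mirrors the shape of Kubernetes metav1.Condition;
// it is a minimal stand-in for illustration only.
type Condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// reconcilingCondition reports whether reconciliation is still in
// progress: the generation the controller last observed lags behind
// the object's current generation.
func reconcilingCondition(generation, observedGeneration int64) Condition {
	c := Condition{Type: "Reconciling", LastTransitionTime: time.Now()}
	if observedGeneration < generation {
		c.Status = "True"
		c.Reason = "InProgress"
		c.Message = "actual state does not yet match the desired state"
	} else {
		c.Status = "False"
		c.Reason = "Reconciled"
		c.Message = "actual state matches the desired state"
	}
	return c
}

func main() {
	fmt.Println(reconcilingCondition(2, 1).Status) // prints "True"
	fmt.Println(reconcilingCondition(2, 2).Status) // prints "False"
}
```

Because the condition type and reason strings are generic, the same shape works across all CR kinds, with per-controller detail confined to the `Message` field.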


TODO: how and where are Service attributes defined? Both import and export are set
Collaborator commented:

Why do peers have an attribute field while exports and imports do not?

Collaborator (Author) replied:

We want to capture Service attributes on the Import/Export objects. The issue is that these are set by namespace users, and Service attributes seem too sensitive to allow arbitrary users to set them (they affect the policies).
Import seems the least suitable object for them; Exports are better, but keeping these in the privileged namespace would be best, IMHO.

@zivnevo (Collaborator) left a comment:

LGTM.
I opened a discussion on setting load-balancing policies.

- **Site administrator** has control over a single cluster in the fabric and manages the
installation and operation of the local ClusterLink deployment. Site administrators can
add remote Peers to their cluster and may define admin-level network policies.
- **Application owners and developers** manage Services and their access policies. They
Collaborator commented:

Do we support scenarios where the fabric admin, site admin, and application owners all belong to the same entity? In this case, the ClusterLink control plane and dataplane would be deployed in a single namespace that the user has access to. Of course we wouldn't have CRDs for that scenario, but I am concerned that if we remove the API access, how do we intend to support such scenarios?

Collaborator (Author) replied:

I would differentiate between two aspects:

- the CRDs are installed (a one-time operation) and no further admin-level access is required; and
- absolutely no global cluster-admin privileges at all.

The first is on our roadmap, and this design doc hints at what could be done to support it (e.g., Peers are defined in the same namespace as the CP and DP). I'm not sure how this would work in a true "no cluster privileges" mode, as we need to expose the CP/DP Service outside of the cluster, which (IMHO) is a privileged operation. Think of the case of multiple different namespaces, each with their own deployment: how would you route from the outside to the correct ClusterLink instance? You would need to open multiple ingress points into the cluster, and I'm not sure that would be considered "unprivileged". Perhaps we can explore the use case you have in mind in more detail. Note that if one-time admin access is reasonable (e.g., the CRDs and ClusterLink are deployed), all other management would be per namespace.

The second is currently out of scope, but we might be able to support it by "emulating" CRDs as ConfigMap objects (e.g., with version, type, spec.*, and status.* fields). You lose the type validation provided by the API server for CRDs, but the changes in the controllers could be minimized by adding a layer that converts between a ConfigMap and the corresponding Go structure for the CRD, and then hands the "CRD instance" on for further processing.
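A rough sketch of what that conversion layer could look like. The ConfigMap data keys (`version`, `type`, `spec`) and the `PeerSpec` struct are hypothetical names chosen for illustration, not actual ClusterLink code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PeerSpec is a hypothetical stand-in for a CRD's Go spec type.
type PeerSpec struct {
	Gateways []string `json:"gateways"`
}

// fromConfigMap converts ConfigMap data (string key/value pairs) into
// the typed structure a controller would otherwise receive for the CRD.
// Unlike real CRDs, no API-server schema validation has happened, so
// the JSON decode here is the only type check.
func fromConfigMap(data map[string]string) (version, kind string, spec PeerSpec, err error) {
	version = data["version"]
	kind = data["type"]
	err = json.Unmarshal([]byte(data["spec"]), &spec)
	return
}

func main() {
	// ConfigMap.Data as a controller watching ConfigMaps would see it.
	data := map[string]string{
		"version": "v1alpha1",
		"type":    "Peer",
		"spec":    `{"gateways": ["10.0.0.1:443"]}`,
	}
	_, kind, spec, err := fromConfigMap(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(kind, spec.Gateways[0]) // prints "Peer 10.0.0.1:443"
}
```

After this conversion the rest of the controller can work on the typed struct unchanged, which is what keeps the controller-side diff small in this mode.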

@elevran (Collaborator, Author) commented Feb 12, 2024

@kfirtoledo @orozery please let me know if there are any pending issues requiring closure in this iteration. Please approve if it is OK to merge for now; remaining issues can be addressed in follow-ups.

@elevran elevran force-pushed the crd_based_management branch from 7e7dc01 to 2dff70e Compare February 13, 2024 14:39
@elevran elevran merged commit 2edce21 into clusterlink-net:main Feb 13, 2024
@elevran elevran deleted the crd_based_management branch February 13, 2024 15:10

5 participants