Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: architectural changes #4057

Closed
6 tasks
metacosm opened this issue Sep 30, 2020 · 18 comments
Closed
6 tasks

Proposal: architectural changes #4057

metacosm opened this issue Sep 30, 2020 · 18 comments
Assignees
Labels
area/refactoring Issues or PRs related to code refactoring lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@metacosm
Copy link
Contributor

metacosm commented Sep 30, 2020

Issues with the current architecture

Lack of abstraction

The main issue with the architecture, in my opinion, is that there is not enough abstraction over concepts and their actual
representation / implementation. This manifests itself at several levels, the main ones being:

  • no platform abstraction (OpenShift vs. plain Kubernetes vs. Docker), each requiring their own clients when a single one could work for both
  • little to no component concept abstraction: each component implementation is actually a combination of several structures and representations which are directly accessed

This results in code complexity because client code is forced to deal with the low-level structures and particularities instead of being able to rely on a common abstraction. This has the following consequences:

  • parallel code bases where code to deal with one component type is copied over and sightly changed, all of it hidden behind lots of if/else conditionals, to implement a different component type
  • client code needs to be aware of low-level representations, accessing them directly making it very difficult to change them because it impacts code almost everywhere
  • cluster representations are not abstracted either tying odo to particular cluster representations rendering the code base
    fragile in case these representations need to be changed
  • changes cannot easily be made because there is often not one single spot to change or seemingly innocuous changes have unintended consequences

Context under-utilization

Context was created to provide contextual information to commands so that each command didn't need to perform the same logic over and over. It seems that the concept hasn't been developed as much as it could, leading to commands performing logic to test things in slightly different ways resulting in differing behavior between commands. This is especially evident when looking at how commands identify whether they're dealing with a devfile component or not. This points also loops back to the lack of abstraction issue.

Moreover, commands and utility classes could be simplified if that contextual information could be reliably accessed at
different levels of the code instead of having to pass some parts of it around as function arguments often leading to functions with more than 5 arguments, which very often don't always need to be present at the same time depending on the execution context.

Recommendations

Create proper abstractions for the core concepts

The odo core concepts should have proper abstractions hiding the implementation details so that client code (e.g. commands) don't need to know or deal with the messy details, instead manipulating concepts as opposed to implementations. More precisely, odo should have Component, App and Project abstractions at the very least, each hiding how they are implemented. In particular, for the Component concept, it should hide whether we're dealing with an s2i or devfile component, provide an interface to access its different elements without having to know how they are implemented. In particular, it should hide how the information is represented locally or on the cluster: client code should not have to know how to parse an env / devfile / whatever to extract the name of the component or its URLs. Similarly, client code should not need to know that a Component is represented by a DeploymentConfig on OpenShift but a Deployment on plain Kubernetes. All these implementations should be hidden.

Another aspect of this is that these abstractions should provide an interface, and their data should only be accessed via
accessor functions, not directly. This is a basic encapsulation idea, but it is very helpful for several reasons:

  • it unifies client code because the accessor exposes the internal data in a unified way (in particular when dealing with
    missing data, providing sane defaults or at least properly defined behavior) instead of letting client code figure out what to do and, thus, possibly doing different things at different spots
  • it makes lazy initialization possible leading to more efficient code where data is only retrieved / parsed / created when
    needed, this also means that availability of resources can also be delayed (for example, a connection to the cluster might not be required at the onset)
  • it makes it a lot easier to change the underlying representations / implementations because now you only have one spot to change instead of the implementation being spread across client code and different structures

Related issues:

Unify cluster access

There should only be one client to access the cluster and that client should also provide an abstraction above the platform
being accessed. There is no reason a command should know whether it deals with a plain Kubernetes cluster or an OpenShift one: the client should hide the differences. Coupled with proper abstractions for concepts, this would lead to dramatically simpler code base because now commands would only deal with abstractions instead of having to know the details of how a component is implemented for each supported type and platform. The client should only return high level Component or App abstractions instead of low-level representations such as DeploymentConfig or Ingress. It's not excluded that several client implementations be provided; these implementations should, however, be accessed via a unified interface and the specific implementation hidden from client code: a command should not need to know how to create a client, just that it can retrieve the appropriate one when needed.

Related issues:

Generalize Context use

Commands require lots of information to be able to do their work and gathering that information (or making it easy to do so), is the role of the Context. Combined with the previous two points (abstractions over concepts and cluster access), moving gathering all the information that commands require into Context would make command implementations simpler and more coherent. Context would build the appropriate abstractions based on the command's execution context: which flags have been specified, what files are locally present, which cluster can be accessed and its type, etc. All that logic would be encapsulated in a single spot so that it could easily be changed if needed but, more importantly, it could be implemented only once instead of requiring commands to rewrite the same code over and over, possibly with slight differences leading to subtle bugs and incoherent behavior.

Associated with concept abstractions, the Context would provide access to lazily instantiated concepts instead of providing direct access to local config, env or devfile files. Associated with cluster abstraction, the Context would provide simpler cluster access with a lazily instantiated, unified client instead of providing one client for each platform and letting commands figure out what to do based on the nullity of one or the other…

Related issues:

Properly use the Complete/Validate/Run pattern for commands

The Complete/Validate/Run pattern for commands is meant to simplify the maintenance of command implementations and make sure that execution steps are taken when needed, not before. This also makes it easier to onboard new developers since command implementations have a structure that is easily to follow. While it sometimes makes sense to deviate from this pattern (for example, when validation could be performed more efficiently in the Complete step), the pattern should be followed as much as possible. In particular, any computation steps should not be performed before the command is actually executed (i.e. Complete is called), and definitely not when the command is created. The Context should be created in the Complete step so that it's available for further computation.

Related issues:

Conclusion

odo has reached a state where evolving it becomes more and more difficult, slowing down the development of new features while maintaining existing ones without breaking them. With the 2.0 release out of the way, it's time to clean up the architecture to be able to move forward with greater confidence and velocity.

@metacosm
Copy link
Contributor Author

/cc @kadel @maxandersen

@kadel
Copy link
Member

kadel commented Sep 30, 2020

👍 Thank you @metacosm for summarizing this.

+1 to all points that you made.
The challenging part is going to be how to approach this without the need to start from scratch.

What you mentioned is actually quite similar to the original intended design when we started with odo.
The idea was to have multiple layers of abstraction.

  • lowest layer would be a package that interacts with the cluster (using client-go). This is what occclient was supposed to be
  • on top of that there would be packages for Component, App, Project, URL, Storage etc.. those should never communicate with the cluster directly (they should not import client-go) and all interaction they need to do should go via lower layer (occlient)
  • the highest layer than would be packaged handling CLI commands. Those should never use the occlient package, everything they need to do should go via the previous layer (Component, App ....)

You can still see some relics of this design in code. But then at some point, as the new code was added people stopped adhere to this design and we ended up in the mess where we are :-(

I would actually like to go back to this strict architecture, where each abstraction layer can only access the layer directly underneath (they should never skip abstraction layers, for example, code handling CLI command should never use code on occlient layer directly or client-go).

I think that one of the reasons why this failed was the occlient package. There was never a clear API or interface that would define this abstraction layer and it grew into scary proportions. So when Kubernetes implementation came it created the disaster that we currently have.

@metacosm
Copy link
Contributor Author

metacosm commented Oct 1, 2020

👍 Thank you @metacosm for summarizing this.

+1 to all points that you made.
The challenging part is going to be how to approach this without the need to start from scratch.

That's indeed a challenge though it's fair to wonder if it might not be faster re-writing than trying to fix the architecture. The "good" thing about a rewrite is that the logic is known and presumably could be re-implemented much faster. I've tried to fix the architecture in parts but it's quite difficult because of all the parts that are affected and the complexity of the code so changing stuff without breaking anything is exceedingly difficult.

@mik-dass
Copy link
Contributor

mik-dass commented Oct 13, 2020

Client refactoring: #4093
Context refactoring: #4081
Package refactoring: #4115

@kadel
Copy link
Member

kadel commented Oct 26, 2020

@metacosm I've updated the issue descriptions with links to issues that are opened related to each of your recommendations.

@metacosm
Copy link
Contributor Author

metacosm commented Oct 27, 2020

Here are some ideas:

Architecture ideas

Abstract domain model

Domain model concepts should be hidden behind interfaces so that different implementations (s2i, devfile) can provide the same API to interested clients.

Here are some initial thoughts on how this could look like:

package model

type Component interface {
	GetApplication() Application
	GetType() string
	GetName() string
	GetSourceLocation() string
	GetSourceType() SourceType
	GetURLs() []URL
	GetStorages() []Storage
	GetEnvVars() []Env
	GetPorts() []string
	GetStatus() ComponentStatus
	GetLinkedComponents() map[string]Component
	GetLinkedServices() map[string]Service
}

type SourceType string

type Application interface {
	GetName() string
	GetComponents() map[string]Component
	GetServices() map[string]Service
}

type URL interface {
	GetHost() string
	GetProtocol() string
	GetPort() int
	IsSecure() bool
	GetKind() URLKind
	GetTLSSecret() string
	GetExternalPort() int
	GetPath() string
	GetStatus() URLStatus
}
type URLKind string

type Storage interface {
	GetSize() string
	GetPath() string
	GetContainerName() string
	GetStatus() StorageStatus
}

type Env interface {
	GetName() string
	GetValue() string
}

type Service interface {
	GetName() string
	GetBindingInfo() BindingInfo
	GetStatus() ServiceStatus
}

type BindingInfo interface {
	GetSecretName() string
	GetMountedVolume() Storage
	GetExposedEnvKeys() []string
}

Using interfaces instead of structs should protect client code from how things are implemented but also make it possible to lazily compute / cache values as needed.

Context

The abstract domain model objects should probably not be instantiated directly but rather retrieved from a Context object.

package context

type Context interface {
	GetComponents() map[string]Component
	GetCurrentComponent() Component
	GetApplications() map[string]Application
	GetCurrentApplication() Application
	GetCurrentPath() string
	GetOptions() map[string]string // access command flags if needed
	GetClient() (Client, error)
	SetClient(client Client) error
}

The idea here would be that the Context would be retrieved from a factory function that would examine the context in which odo was invoked to determine which implementation to use based on the available information: command flags, presence of an identified devfile, etc.

Similarly, the Context would examine its environment to determine the appropriate Client to instantiate, possibly returning a
client that would only present locally-available information if the user is not connected to a cluster.

Note that the Context should be globally accessible from everywhere in the code without having to pass it around to simplify things. This could be either accomplished by providing a globally-accessible function to access the existing Context or by making the factory function return the existing instance if one exists instead of creating a new one.

Client

The Client should be only dealing with the domain model and not how they are realized on the cluster, and provide simple CRUD operations for each at the very least.

Here are some ideas of how a Client interface could look like:

package client

type Client interface {
	CreateComponent(component Component) (ComponentStatus, error)
	GetComponent(name string) (Component, error)
	GetComponentStatus(name string) (ComponentStatus, error)
	GetComponents() (map[string]Component, error)
	GetComponentsFor(applicationName string) (map[string]Component, error)
	DeleteComponent(name string) error
	PushComponent(component Component) (ComponentStatus, error)
	AddURL(url URL, to Component) (URLStatus, error)
	GetURL(component, name string) (URL, error)
	GetURLStatus(component, name string) (URLStatus, error)
	GetURLsFor(component string) (map[string]URL, error)
	Bind(service Service, to Component) (BindingStatus, error)
	BindComponents(from, to Component) (BindingStatus, error)
	GetBindings(component string) (map[string]BindingStatus, error)
	AddStorage(storage Storage) (StorageStatus, error)
	GetStorage(component, name string) (Storage, error)
	GetStorageStatus(component, name string) (StorageStatus, error)

	CreateApplication(application Application) (ApplicationStatus, error)
	GetApplication(name string) (Application, error)
	GetApplicationStatus(name string) (ApplicationStatus, error)
	GetApplications() (map[string]Application, error)
	DeleteApplication(name string) error
	PushApplication(application Application) (ApplicationStatus, error)

	// ...

	GetCurrentProject() (string, error)
	SetProject(name string) error
	CreateProject(name string) error

	GetBindingInfo(serviceName string) (BindingInfo, error)
	GetBindingStatus(service string) (BindingStatus, error)

	// possibly returns a different client instance, setting it on the Context if the cluster has changed
	Login(options LoginOptions) (Client, error)
}

No details of the underlying implementation should be made available to calling code so that the details of how the cluster
implements the different functions can be changed at will, depending on the target cluster (or if there's even such a cluster).

Implementation

I would start with trying to introduce the domain model objects progressively replacing the numerous existing versions that exist (Component, PushedComponent, etc.), encapsulating the devfile and s2i implementations. Once the different representations are unified, the Client should probably be the next step. Finally, I would work on the Context.

Obviously, it's rather difficult to really devise an attack plan for these changes because you might encounter unexpected issues while working on it that might affect how you go about it.

@mik-dass
Copy link
Contributor

The Client should be only dealing with the domain model and not how they are realized on the cluster, and provide simple CRUD
operations for each at the very least.

I am a bit confused since we have always used the client as a low level package which connects to the cluster and performs CRUD operations and this seems to convert the client to a higher level package. Since the client interface would be huge and in future might become even bigger, it might be a good idea to break it into smaller clients like StorageClient, URLClient etc. WDYT?

@dharmit
Copy link
Member

dharmit commented Oct 29, 2020

Since the client interface would be huge and in future might become even bigger, it might be a good idea to break it into smaller clients like StorageClient, URLClient etc. WDYT?

Wouldn't it make sense to have storage, URL, project, etc. as separate packages the way they are right now, and use the client to perform the actions they need to do on the cluster?

@metacosm
Copy link
Contributor Author

The Client should be only dealing with the domain model and not how they are realized on the cluster, and provide simple CRUD
operations for each at the very least.

I am a bit confused since we have always used the client as a low level package which connects to the cluster and performs CRUD operations and this seems to convert the client to a higher level package.

In my opinion, that's part of why the architecture is so messy because any code needing access to the cluster requires to know how things are represented on the cluster. This is an issue because that means that you can difficulty change these representations if you need to but, and perhaps more importantly, that means that all the rest of the application needs to be aware of and deal with all possible representations (kubernetes / openshift / docker / operators / whatever…). This is just bad from an architecture point of view.

Since the client interface would be huge and in future might become even bigger, it might be a good idea to break it into smaller clients like StorageClient, URLClient etc. WDYT?

That might make sense, though I don't really see what that accomplishes concretely. What's the purpose of splitting in different clients if these clients all need to share the same representation? It only makes sense to split things if they can change independently. Otherwise, it adds complexity for no obvious benefits.

@metacosm
Copy link
Contributor Author

Since the client interface would be huge and in future might become even bigger, it might be a good idea to break it into smaller clients like StorageClient, URLClient etc. WDYT?

Wouldn't it make sense to have storage, URL, project, etc. as separate packages the way they are right now, and use the client to perform the actions they need to do on the cluster?

I haven't thought at that level yet. First, we need to agree on general architecture concepts, then see how to implement them concretely.

More to your point: what benefits are expected from splitting the code in separate packages? It seems to me that each implementation of the interface should be encapsulated in its own package so that the internal representations are kept internal and don't leak to different packages as would probably be needed if you split the code in different packages.

@mik-dass
Copy link
Contributor

mik-dass commented Oct 29, 2020

That might make sense, though I don't really see what that accomplishes concretely. What's the purpose of splitting in different clients if these clients all need to share the same representation? It only makes sense to split things if they can change independently. Otherwise, it adds complexity for no obvious benefits.

It might also help in preventing leakage and restricting unnecessary interactions between the various resources.

@openshift-bot
Copy link

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2021
@openshift-bot
Copy link

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 26, 2021
@openshift-ci-robot openshift-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 26, 2021
@openshift-bot
Copy link

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot
Copy link
Collaborator

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@feloy
Copy link
Contributor

feloy commented Nov 24, 2021

I open this issue again as it is in my opinion a very good direction to follow for refactoring the odo code.

Particularly, the work we (re)started to do to refactor the context code follows this direction.

@feloy feloy reopened this Nov 24, 2021
@feloy feloy mentioned this issue Nov 24, 2021
15 tasks
@rm3l rm3l added this to odo Project Oct 3, 2022
@rm3l rm3l moved this to To Do 📝 in odo Project Oct 3, 2022
@rm3l rm3l removed the status in odo Project Oct 3, 2022
@rm3l rm3l removed this from the Post v2.0 milestone Oct 3, 2022
@kadel
Copy link
Member

kadel commented Feb 6, 2023

I'm closing this one because I don't think it makes sense to keep it.

A lot of ideas mentioned here were implemented in odo v3. Thank you, @metacosm, for this; It provided us with a lot of inspiration on how to better structure odo code.

/close

@openshift-ci openshift-ci bot closed this as completed Feb 6, 2023
@openshift-ci
Copy link

openshift-ci bot commented Feb 6, 2023

@kadel: Closing this issue.

In response to this:

I'm closing this one because I don't think it makes sense to keep it.

A lot of ideas mentioned here were implemented in odo v3. Thank you, @metacosm, for this; It provided us with a lot of inspiration on how to better structure odo code.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-project-automation github-project-automation bot moved this to Done ✅ in odo Project Feb 6, 2023
@rm3l rm3l added the area/refactoring Issues or PRs related to code refactoring label Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/refactoring Issues or PRs related to code refactoring lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
Projects
Archived in project
Development

No branches or pull requests

9 participants