Proposal: architectural changes #4057
Comments
/cc @kadel @maxandersen |
👍 Thank you @metacosm for summarizing this. +1 to all points that you made. What you mentioned is actually quite similar to the original intended design when we started with odo.
You can still see some relics of this design in the code. But at some point, as new code was added, people stopped adhering to this design, and we ended up in the mess we are in now :-( I would actually like to go back to this strict architecture, where each abstraction layer can only access the layer directly underneath (layers should never be skipped; for example, code handling a CLI command should never use code on I think that one of the reasons why this failed was the |
That's indeed a challenge, though it's fair to wonder whether rewriting might not be faster than trying to fix the architecture. The "good" thing about a rewrite is that the logic is known and presumably could be re-implemented much faster. I've tried to fix the architecture in parts, but it's quite difficult: so many parts are affected, and the code is complex enough that changing anything without breaking something else is exceedingly difficult. |
@metacosm I've updated the issue descriptions with links to issues that are opened related to each of your recommendations. |
Here are some ideas:
Architecture ideas
Abstract domain model
Domain model concepts should be hidden behind interfaces so that different implementations (s2i, devfile) can provide the same API to interested clients. Here are some initial thoughts on what this could look like:
package model
type Component interface {
GetApplication() Application
GetType() string
GetName() string
GetSourceLocation() string
GetSourceType() SourceType
GetURLs() []URL
GetStorages() []Storage
GetEnvVars() []Env
GetPorts() []string
GetStatus() ComponentStatus
GetLinkedComponents() map[string]Component
GetLinkedServices() map[string]Service
}
type SourceType string
type Application interface {
GetName() string
GetComponents() map[string]Component
GetServices() map[string]Service
}
type URL interface {
GetHost() string
GetProtocol() string
GetPort() int
IsSecure() bool
GetKind() URLKind
GetTLSSecret() string
GetExternalPort() int
GetPath() string
GetStatus() URLStatus
}
type URLKind string
type Storage interface {
GetSize() string
GetPath() string
GetContainerName() string
GetStatus() StorageStatus
}
type Env interface {
GetName() string
GetValue() string
}
type Service interface {
GetName() string
GetBindingInfo() BindingInfo
GetStatus() ServiceStatus
}
type BindingInfo interface {
GetSecretName() string
GetMountedVolume() Storage
GetExposedEnvKeys() []string
}
Using interfaces instead of structs should protect client code from how things are implemented but also make it possible to lazily compute / cache values as needed.
Context
The abstract domain model objects should probably not be instantiated directly but rather retrieved from a Context object.
package context
type Context interface {
GetComponents() map[string]Component
GetCurrentComponent() Component
GetApplications() map[string]Application
GetCurrentApplication() Application
GetCurrentPath() string
GetOptions() map[string]string // access command flags if needed
GetClient() (Client, error)
SetClient(client Client) error
}
The idea here would be that the Context would be retrieved from a factory function that would examine the context in which odo was invoked to determine which implementation to use based on the available information: command flags, presence of an identified devfile, etc. Similarly, the Context would examine its environment to determine the appropriate Client to instantiate, possibly returning a
Note that the Context should be globally accessible from everywhere in the code, without having to pass it around, to simplify things. This could be accomplished either by providing a globally accessible function to access the existing Context or by making the factory function return the existing instance if one exists instead of creating a new one.
Client
The Client should deal only with the domain model objects, not with how they are realized on the cluster, and should provide simple CRUD operations for each at the very least. Here are some ideas of what a Client interface could look like:
package client
type Client interface {
CreateComponent(component Component) (ComponentStatus, error)
GetComponent(name string) (Component, error)
GetComponentStatus(name string) (ComponentStatus, error)
GetComponents() (map[string]Component, error)
GetComponentsFor(applicationName string) (map[string]Component, error)
DeleteComponent(name string) error
PushComponent(component Component) (ComponentStatus, error)
AddURL(url URL, to Component) (URLStatus, error)
GetURL(component, name string) (URL, error)
GetURLStatus(component, name string) (URLStatus, error)
GetURLsFor(component string) (map[string]URL, error)
Bind(service Service, to Component) (BindingStatus, error)
BindComponents(from, to Component) (BindingStatus, error)
GetBindings(component string) (map[string]BindingStatus, error)
AddStorage(storage Storage) (StorageStatus, error)
GetStorage(component, name string) (Storage, error)
GetStorageStatus(component, name string) (StorageStatus, error)
CreateApplication(application Application) (ApplicationStatus, error)
GetApplication(name string) (Application, error)
GetApplicationStatus(name string) (ApplicationStatus, error)
GetApplications() (map[string]Application, error)
DeleteApplication(name string) error
PushApplication(application Application) (ApplicationStatus, error)
// ...
GetCurrentProject() (string, error)
SetProject(name string) error
CreateProject(name string) error
GetBindingInfo(serviceName string) (BindingInfo, error)
GetBindingStatus(service string) (BindingStatus, error)
// possibly returns a different client instance, setting it on the Context if the cluster has changed
Login(options LoginOptions) (Client, error)
}
No details of the underlying implementation should be made available to calling code so that the details of how the cluster
Implementation
I would start by introducing the domain model objects progressively, replacing the numerous existing versions that exist (
Obviously, it's rather difficult to really devise an attack plan for these changes because you might encounter unexpected issues while working on it that might affect how you go about it. |
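To make the factory idea for the Context more concrete, here is a minimal sketch of a factory that returns the existing instance if one exists. Everything in it is an assumption made for illustration: the `IsDevfile` accessor, the `devfileContext` and `s2iContext` types, and the rule of checking for a `devfile.yaml` in the working directory are not part of the proposal or of odo's code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// Context is a cut-down stand-in for the Context interface proposed above.
type Context interface {
	GetCurrentPath() string
	IsDevfile() bool // hypothetical accessor, for illustration only
}

// devfileContext and s2iContext are hypothetical implementations.
type devfileContext struct{ path string }

func (c *devfileContext) GetCurrentPath() string { return c.path }
func (c *devfileContext) IsDevfile() bool        { return true }

type s2iContext struct{ path string }

func (c *s2iContext) GetCurrentPath() string { return c.path }
func (c *s2iContext) IsDevfile() bool        { return false }

// newContext picks an implementation from the invocation environment.
// Here the only signal examined is whether dir contains a devfile.yaml.
func newContext(dir string) Context {
	if _, err := os.Stat(filepath.Join(dir, "devfile.yaml")); err == nil {
		return &devfileContext{path: dir}
	}
	return &s2iContext{path: dir}
}

var (
	once    sync.Once
	current Context
)

// GetContext returns the process-wide Context, building it on first use.
func GetContext() Context {
	once.Do(func() {
		dir, err := os.Getwd()
		if err != nil {
			dir = "."
		}
		current = newContext(dir)
	})
	return current
}

func main() {
	ctx := GetContext()
	fmt.Println("devfile component:", ctx.IsDevfile())
	fmt.Println("same instance:", ctx == GetContext())
}
```

With something like this, commands would call `GetContext()` wherever they need it instead of threading the Context through function arguments; whether a package-level singleton is acceptable for testing is a separate question.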
I am a bit confused since we have always used the client as a low level package which connects to the cluster and performs CRUD operations and this seems to convert the client to a higher level package. Since the client interface would be huge and in future might become even bigger, it might be a good idea to break it into smaller clients like StorageClient, URLClient etc. WDYT? |
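For what it's worth, splitting the interface does not have to mean separate implementations. The sketch below shows one possible shape, using Go interface embedding, where a single client still satisfies the big interface while callers depend only on the narrow part they use. The names `URLClient`, `StorageClient`, `memoryClient` and `countURLs` are hypothetical and only serve the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// URLClient and StorageClient are hypothetical narrow views of the Client.
type URLClient interface {
	GetURLsFor(component string) ([]string, error)
}

type StorageClient interface {
	GetStoragesFor(component string) ([]string, error)
}

// Client embeds the narrow interfaces, so one implementation satisfies all.
type Client interface {
	URLClient
	StorageClient
}

var errUnknownComponent = errors.New("unknown component")

// memoryClient is an in-memory stand-in for a real cluster client.
type memoryClient struct {
	urls     map[string][]string
	storages map[string][]string
}

func (c *memoryClient) GetURLsFor(component string) ([]string, error) {
	urls, ok := c.urls[component]
	if !ok {
		return nil, errUnknownComponent
	}
	return urls, nil
}

func (c *memoryClient) GetStoragesFor(component string) ([]string, error) {
	storages, ok := c.storages[component]
	if !ok {
		return nil, errUnknownComponent
	}
	return storages, nil
}

// countURLs only needs URL operations, so it asks for a URLClient only.
func countURLs(c URLClient, component string) (int, error) {
	urls, err := c.GetURLsFor(component)
	if err != nil {
		return 0, err
	}
	return len(urls), nil
}

func main() {
	var client Client = &memoryClient{
		urls:     map[string][]string{"backend": {"http://example.test"}},
		storages: map[string][]string{},
	}
	n, err := countURLs(client, "backend")
	fmt.Println(n, err)
}
```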
Wouldn't it make sense to have storage, URL, project, etc. as separate packages the way they are right now, and use the client to perform the actions they need to do on the cluster? |
In my opinion, that's part of why the architecture is so messy: any code needing access to the cluster needs to know how things are represented on the cluster. This is an issue because it makes these representations difficult to change if you need to and, perhaps more importantly, because all the rest of the application needs to be aware of and deal with all possible representations (kubernetes / openshift / docker / operators / whatever…). This is just bad from an architecture point of view.
That might make sense, though I don't really see what that accomplishes concretely. What's the purpose of splitting in different clients if these clients all need to share the same representation? It only makes sense to split things if they can change independently. Otherwise, it adds complexity for no obvious benefits. |
I haven't thought about it at that level yet. First, we need to agree on general architecture concepts, then see how to implement them concretely. More to your point: what benefits are expected from splitting the code into separate packages? It seems to me that each implementation of the interface should be encapsulated in its own package so that the internal representations are kept internal and don't leak to different packages, as would probably be needed if you split the code into different packages. |
It might also help in preventing leakage and restricting unnecessary interactions between the various resources. |
@openshift-bot: Closing this issue. In response to this:
I open this issue again as it is in my opinion a very good direction to follow for refactoring the odo code. Particularly, the work we (re)started to do to refactor the context code follows this direction. |
I'm closing this one because I don't think it makes sense to keep it. A lot of ideas mentioned here were implemented in odo v3. Thank you, @metacosm, for this; It provided us with a lot of inspiration on how to better structure odo code. /close |
Issues with the current architecture
Lack of abstraction
The main issue with the architecture, in my opinion, is that there is not enough abstraction over concepts and their actual
representation / implementation. This manifests itself at several levels, the main ones being:
This results in code complexity because client code is forced to deal with the low-level structures and particularities instead of being able to rely on a common abstraction. This has the following consequences:
fragile in case these representations need to be changed
Context under-utilization
Context was created to provide contextual information to commands so that each command didn't need to perform the same logic over and over. It seems that the concept hasn't been developed as much as it could have been, leading to commands performing logic to test things in slightly different ways, resulting in differing behavior between commands. This is especially evident when looking at how commands identify whether they're dealing with a devfile component or not. This point also loops back to the lack of abstraction issue.
Moreover, commands and utility classes could be simplified if that contextual information could be reliably accessed at different levels of the code instead of having to pass parts of it around as function arguments. This often leads to functions with more than 5 arguments, many of which don't need to be present at the same time, depending on the execution context.
Recommendations
Create proper abstractions for the core concepts
The odo core concepts should have proper abstractions hiding the implementation details so that client code (e.g. commands) doesn't need to know or deal with the messy details and can manipulate concepts rather than implementations. More precisely, odo should have Component, App and Project abstractions at the very least, each hiding how it is implemented. In particular, the Component concept should hide whether we're dealing with an s2i or devfile component and provide an interface to access its different elements without having to know how they are implemented. It should also hide how the information is represented locally or on the cluster: client code should not have to know how to parse an env / devfile / whatever to extract the name of the component or its URLs. Similarly, client code should not need to know that a Component is represented by a DeploymentConfig on OpenShift but a Deployment on plain Kubernetes. All these implementations should be hidden.
Another aspect of this is that these abstractions should provide an interface, and their data should only be accessed via accessor functions, not directly. This is a basic encapsulation idea, but it is very helpful for several reasons:
missing data, providing sane defaults or at least properly defined behavior) instead of letting client code figure out what to do and, thus, possibly doing different things at different spots
needed, this also means that availability of resources can also be delayed (for example, a connection to the cluster might not be required at the onset)
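As a rough illustration of the accessor idea, the sketch below hides a component's data behind an interface, loads it only when an accessor is first called, and fills in missing data in a single place. The `rawComponent` type, the `NewComponent` constructor and the default values (`"unnamed"`, port `"8080"`) are assumptions chosen for the example, not odo behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// Component is a cut-down version of the abstraction discussed here.
type Component interface {
	GetName() string
	GetPorts() []string
}

// rawComponent is whatever a loader could read from a local file or the
// cluster; client code never sees it.
type rawComponent struct {
	Name  string
	Ports []string
}

// lazyComponent defers loading until an accessor is first called.
type lazyComponent struct {
	load func() rawComponent
	once sync.Once
	raw  rawComponent
}

// data loads the raw values once and applies defaults for missing fields.
func (c *lazyComponent) data() rawComponent {
	c.once.Do(func() {
		c.raw = c.load()
		if c.raw.Name == "" {
			c.raw.Name = "unnamed" // hypothetical default
		}
		if len(c.raw.Ports) == 0 {
			c.raw.Ports = []string{"8080"} // hypothetical default
		}
	})
	return c.raw
}

func (c *lazyComponent) GetName() string    { return c.data().Name }
func (c *lazyComponent) GetPorts() []string { return c.data().Ports }

// NewComponent wraps a loader; nothing is read until an accessor runs.
func NewComponent(load func() rawComponent) Component {
	return &lazyComponent{load: load}
}

func main() {
	c := NewComponent(func() rawComponent {
		fmt.Println("loading component data")
		return rawComponent{Name: "frontend"}
	})
	fmt.Println(c.GetName(), c.GetPorts())
}
```

Because defaults are applied inside `data`, every caller sees the same behavior for missing values, and the loader (which could need a cluster connection) runs at most once.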
Related issues:
Unify cluster access
There should only be one client to access the cluster, and that client should also provide an abstraction above the platform being accessed. There is no reason a command should know whether it deals with a plain Kubernetes cluster or an OpenShift one: the client should hide the differences. Coupled with proper abstractions for concepts, this would lead to a dramatically simpler code base because commands would only deal with abstractions instead of having to know the details of how a component is implemented for each supported type and platform. The client should only return high-level Component or App abstractions instead of low-level representations such as DeploymentConfig or Ingress. This does not rule out providing several client implementations; these implementations should, however, be accessed via a unified interface, with the specific implementation hidden from client code: a command should not need to know how to create a client, just that it can retrieve the appropriate one when needed.
Related issues:
Generalize Context use
Commands require lots of information to be able to do their work, and gathering that information (or making it easy to do so) is the role of the Context. Combined with the previous two points (abstractions over concepts and cluster access), moving the gathering of all the information that commands require into Context would make command implementations simpler and more coherent. Context would build the appropriate abstractions based on the command's execution context: which flags have been specified, what files are locally present, which cluster can be accessed and its type, etc. All that logic would be encapsulated in a single spot so that it could easily be changed if needed but, more importantly, it would be implemented only once instead of requiring commands to rewrite the same code over and over, possibly with slight differences leading to subtle bugs and incoherent behavior.
Associated with concept abstractions, the Context would provide access to lazily instantiated concepts instead of providing direct access to local config, env or devfile files. Associated with cluster abstraction, the Context would provide simpler cluster access with a lazily instantiated, unified client instead of providing one client for each platform and letting commands figure out what to do based on the nullity of one or the other…
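A minimal sketch of what a lazily instantiated, unified client behind the Context could look like is given below. It is only an assumption of one possible shape: the `detectOpenShift` probe, the stubbed `kubernetesClient` and `openshiftClient` types, and the `describe` helper are hypothetical, and no real cluster API is called.

```go
package main

import (
	"fmt"
	"sync"
)

// Component and Client are cut-down versions of the proposed abstractions.
type Component interface{ GetName() string }

type Client interface {
	GetComponent(name string) (Component, error)
}

type component struct{ name string }

func (c component) GetName() string { return c.name }

// kubernetesClient would map a Deployment to a Component; stubbed here.
type kubernetesClient struct{}

func (kubernetesClient) GetComponent(name string) (Component, error) {
	return component{name: name}, nil
}

// openshiftClient would map a DeploymentConfig to a Component; stubbed here.
type openshiftClient struct{}

func (openshiftClient) GetComponent(name string) (Component, error) {
	return component{name: name}, nil
}

// Context creates the client on first use, based on a platform probe.
type Context struct {
	detectOpenShift func() bool // hypothetical platform probe
	once            sync.Once
	client          Client
}

// GetClient returns the unified client, instantiating it only once.
func (c *Context) GetClient() Client {
	c.once.Do(func() {
		if c.detectOpenShift() {
			c.client = openshiftClient{}
		} else {
			c.client = kubernetesClient{}
		}
	})
	return c.client
}

// describe plays the role of command code: it sees only Client and Component.
func describe(ctx *Context, name string) (string, error) {
	comp, err := ctx.GetClient().GetComponent(name)
	if err != nil {
		return "", err
	}
	return "component " + comp.GetName(), nil
}

func main() {
	ctx := &Context{detectOpenShift: func() bool { return false }}
	msg, err := describe(ctx, "backend")
	fmt.Println(msg, err)
}
```

The command-level code never checks which client it received, which is the point: the nil checks on two platform-specific clients would disappear from commands.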
Related issues:
Properly use the Complete/Validate/Run pattern for commands
The Complete/Validate/Run pattern for commands is meant to simplify the maintenance of command implementations and make sure that execution steps are taken when needed, not before. This also makes it easier to onboard new developers since command implementations have a structure that is easy to follow. While it sometimes makes sense to deviate from this pattern (for example, when validation could be performed more efficiently in the Complete step), the pattern should be followed as much as possible. In particular, no computation should be performed before the command is actually executed (i.e. before Complete is called), and definitely not when the command is created. The Context should be created in the Complete step so that it's available for further computation.
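The pattern can be sketched roughly as follows. This is a simplified illustration, not odo's actual interface: the `Runnable` signatures, the `execute` driver and the `createOptions` command are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Runnable is a simplified, hypothetical version of the three-step contract.
type Runnable interface {
	Complete(args []string) error // gather everything the command needs
	Validate() error              // check the gathered data
	Run() error                   // perform the actual work
}

// execute drives the steps in order and stops at the first failure.
func execute(cmd Runnable, args []string) error {
	if err := cmd.Complete(args); err != nil {
		return fmt.Errorf("complete: %w", err)
	}
	if err := cmd.Validate(); err != nil {
		return fmt.Errorf("validate: %w", err)
	}
	return cmd.Run()
}

var errMissingName = errors.New("a component name is required")

// createOptions is a hypothetical command implementation.
type createOptions struct {
	componentName string
	created       []string // records what Run did, instead of touching a cluster
}

// NewCreateOptions performs no computation: creation stays cheap.
func NewCreateOptions() *createOptions { return &createOptions{} }

func (o *createOptions) Complete(args []string) error {
	if len(args) > 0 {
		o.componentName = args[0]
	}
	return nil
}

func (o *createOptions) Validate() error {
	if o.componentName == "" {
		return errMissingName
	}
	return nil
}

func (o *createOptions) Run() error {
	o.created = append(o.created, o.componentName)
	return nil
}

func main() {
	fmt.Println(execute(NewCreateOptions(), nil))
	fmt.Println(execute(NewCreateOptions(), []string{"backend"}))
}
```

In this shape, the Context would be obtained inside `Complete`, so that merely constructing a command (for example, to build the help output) triggers no file or cluster access.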
Related issues:
Conclusion
odo has reached a state where evolving it becomes more and more difficult, slowing down the development of new features and making it harder to maintain existing ones without breaking them. With the 2.0 release out of the way, it's time to clean up the architecture to be able to move forward with greater confidence and velocity.