-
Notifications
You must be signed in to change notification settings - Fork 9
Collection requirements: expanding the section on best practices #33
Comments
Related discussion: ansible-collections/ansible-inclusion#27 (comment) |
Sounds good to me |
Summary of the discussion (full log at https://meetbot.fedoraproject.org/ansible-community/2021-07-28/ansible_community_meeting.2021-07-28-18.00.log.html#l-152):
Open questions:
|
Proposal follows, document the resource module pattern as an alternative with well-defined behavior. I would also like a number of other people to review this for accuracy and comment before it is finalized. OverviewNote: The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119. Note: The use of the "module" below customarily refers to either a module or action plugin. The resource module pattern is an alternative approach to developing ansible modules. As of 8/11/2021, this pattern is most commonly used with network and security modules. What is a resource module
Resource module pros
Resource module cons
Resource module states
Resource module check mode
Resource module return values
Ansible facts subsystem integration
Resource modules within a collection
Resource module scope
Resource module tests
Resource module development
|
I agree with each of the points mentioned by @cidrblock for resource module supported states and its ease to use and sync in functionality when it comes to network and security platform configuration. It's viable for any other content which supports or exposes REST APIs for their configuration. |
@cidrblock This explains everything about resource modules very well. Just a minor point about
so the current configuration is returned in the |
Not sure what makes the state enforcement + info module spilt a "standard" pattern? In the case of state enforcement modules, the current state of configuration for a resource is fetched (for idempotent check) but it is not returned in the response. IIUC the info modules return both configuration and operational data for the given resource. The response returned from the resource module is only the configuration data for that resource and is structured in module argpsec format. To get the same functionality with the info module approach it will be required
The data transformation part in the playbook is not that easy and this design pattern adds to more complexity from the playbook author's perspective IMO. |
The bullet point about not adding list and info states to existing modules is there for almost three years now and most of the non-networking modules follow that advice.
This is left to the implementation. But a lot of "standard" modules that interact with an API to configure resources also return back the state in the result. So I would say that the common practice is actually:
Samples of such module pairs can be found in all collections I helped develop (Sensu Go, ServiceNow, NGINX Unit, AWS), but point 1 holds for an even greater number of modules (there are very few modules that do not return anything). And yes, the output of the regular and info modules in such cases is usually compatible: the structure of resources is the same, but the regular module returns back a single resource while the info module usually returns back a list of resources.
There is nothing preventing you from implementing this in a standard module + info pair.
I may be misunderstanding this, but as far as I can tell, there is no difference between a resource module and a similarly designed standard module pair. Steps 1, 3, and 4 are required even when we use resource modules. Step 1 uses If the regular modules are designed in such a way that they return data in a form that is compatible with the SOT, we need exactly the same three steps: in step one we would run an info module, compare the data in step 3, and update things in step 4 with the state-enforcing module. But I would argue that doing things this way is very non-ansible-like. If we want to enforce the state, there is no need to compare the existing state first since state-enforcing modules should do this for us. Making sure things do not deviate from the SOT should be done using exactly the same task, but in check mode (and possibly with the
Again, this has nothing to do with the fact that resource modules combine the state and action parts in the same module. As demonstrated, any reasonably designed module can do that. I would argue that the resource module should actually be a "toolbox" of things that can help users manage certain resources. This toolbox would include:
So in my opinion, the idea behind the resource modules is sound, but the implementation is not something I would encourage people to adopt outside the networking ecosystem. |
Ansible content modules including built-in, networking, security, windows and so on don't follow module + info pair so I still wouldn't call it a standard pattern. The doc you referred to doesn't clearly define the module pattern for all the major endpoints that Ansible can manage and it requires updates IMO.
A state can consist of the operational and configuration data of the managed host. Resource modules only fetch the configuration state of the host that it needs to take action on and not the operational state. The info module returns both operational and configuration states. So in cases where users have to manage the config state only with resource module it can be done with single module whereas as module + info split requires using two modules.
There is a bunch of security content that is written with the resource module pattern and also there is scope for cloud modules (especially networking modules) that can adopt resource module patterns so IMO it is not limited to networking only. Finally, I would like to add resource modules are around for 2+ years now and are well received by Ansible users, partners and customers. Here's recent feedback that we received from a community user
Since there is a bunch of Ansible content that adheres to resource module and module + info split pattern IMO both the patterns can be added as recommended one and let the collection developer use the best judgement to pick the one as required. |
I'm in agreement with @ganeshrn comment and as mentioned in my earlier comment as well, resource modules and the implementation of its state is not confined to networking content, it's helpful for all of the contents where configuration can be made via REST APIs which includes SECURITY, CLOUD, and VMware content, as the resource module states (i.e. MERGED, REPLACED, OVERRIDDEN, DELETED) very well replicate and resonates the operational state what REST API exposes (i.e. GET, POST, PUT, PATCH, DELETE). Also, keeping two modules one managing configuration and the other (i.e. info modules) getting information only increases the number of modules (2X) and also burdens the user to specifically call the info modules to get the operational state of the configured device. This can be very well observed especially in security, cloud content space where there's ~100-500+ APIs to configure the box and if the content has module + info_modules ultimately the specific content will start to have 2x number of modules which is cumbersome for both the module creator and maintainer and as well as the end user. |
All the proposals related to the resource module are in the public domain and the content was never intended to be developed in isolation. There might be a feeling of isolation because of the time-zone centric process currently followed. The meeting where the community decisions are made is scheduled at 12.30 AM IST which is not suitable for all the contributors. BTW there is a good number of active Ansible content contributors (and growing) currently working in IST. So I think it will help going forward if the process is made more time-zone agnostic and asynchronous. |
That querying information is split from updating state to _info or _facts modules (depending on what this information is about) is a standard pattern. It has been followed by all non-network modules in ansible/ansible (I'm not really familiar with almost all network modules, so I cannot comment on them), and is still being followed by all builtin modules and many other modules (for example, everything in community.general, amazon.aws, community.aws, and many of the smaller community collections that were split off from ansible/ansible or community.general).
There is no rule that _info modules must do that. That's often what _info modules do, but they could also just return configuration state and not operational state, or separate them clearly. |
While this is a very valid point in general, I don't think it really applies to why resource modules behave differently from "everything else". Both resource modules and the standard for Ansible modules - information/facts gathering must be part of _info/_facts modules and not of regular modules - has been around long before the community meeting was created. As I understood it, the core team always embraced - and everything contained in ansible/ansible, except possibly network stuff, always conformed to - that information gathering is done in _facts or _info modules, and not by regular modules. (There are/were some legacy exception, like |
The current network modules gather information using the fact module itself. The information consist of both operational state and configuration state with the added constraint that the configuration data returned should follow the same structure as that of the corresponding resource module. A single info/fact module that can fetch operational/configuration state based on input options is more optimised as compared to having two separate modules for a single resource one for pushing configuration and the other for fetching configuration as one is a subset of the another. Most config modules (even core) get the current (config) state for which it has to take action, the only additional part in the resource module is the return response that has before and after keys and the value is in the resource argspec format. For resource module states
I don't agree with your comment for the reasons I mentioned above. Resource modules are doing what regular modules do with some added functionality. |
Since this issue has been "stuck" in limbo for two months and is blocking the inclusion/rejection of the trendmicro.deepsec Ansible Collection, I propose we answer the following question: Do we want to accept resource modules (where resource modules are defined as in #33 (comment)) as one of the standard ways for structuring and writing Ansible modules that want to become part of the community package? If the answer is yes, we can formally accept the definition of what the resource modules are and we are done. If the answer is no, we should answer the following question: Do we want to accept resource modules into the community package? The answers to this question I came up with/stole from the meeting discussions are:
At this point, we should know if resource modules can be part of the Ansible package or not. This should unblock the current inclusion decisions (and possibly open the discussion about what to do with existing resource modules that were grandfathered into the Ansible package). @abadger @felixfontein @gundalow @Andersson007 @acozine @ssbarnea @jillr @cidrblock @jamescassell @thaumos Can you please comment on this at your earliest convenience? Thank you all in advance! |
@tadeboro Thanks for creating the proposal, but would like to extend on the options as:
My vote: Option 1 As pointed in my earlier comment, the use of resource module pattern is and should not be limited to network/security content and should be available for ALL of the content where REST API-based configuration is supported, which includes Cloud, Container, Kubernetes, and VMware as well. |
My biggest concern is that the definition of how a resource module behaves and the features it offers is both well defined and followed for plugins that claim to be a resource module. Regardless of where the pattern is used, we need to ensure the contract between the developer and user that the resource module pattern guarantees is upheld. I believe it is the developers choice to follow and align with the resource module pattern where they believe it is a good fit and the pattern should be considered an alternative to the current best practices. My vote: Option 1, allow developers to choice the pattern for their content and do not deny inclusion into the Ansible package because a developer believes the resource module pattern provides the best experience for the users of their content. |
copy and pasting from another thread ansible-collections/ansible-inclusion#27 (comment)
|
Echoing all the points that @cidrblock @justjais @ganeshrn has mentioned in this thread. I vote for Option 1. Let the content developer decide which pattern to choose and NOT exclude modules that follow the Resource Module approach. |
@justjais Edit: Sorry Brad, I pinged you by mistake here. |
@tadeboro NP, I think everyone probably knows where I stand on this issue :) |
@IPvSean +100 |
My vote is Option 1 :-) |
As the concept is already in use in many places and seems to be defined well here, my vote is Yes, we should accept it as a separate concept as a whole. In other words, there will be 2 types of conventions:
As a summary, i vote Yes because:
We should highlight in the requirements at least that:
Edited: Also it doesn't feel fair to restrict the areas because if we accept the concept for network and security areas, this, imo, should be accepted everywhere where it fits. |
Consider you are modifying an httpd.conf, apache configuration file, a nginx.conf, etc. There are identical to a Cisco IOS configuration file, or a Junos, or an Arista.... we keep pretending network is dissimilar from server community but its not at all. How are you modifying any flat-file on any type of device? Can you write playbooks without complicated Jinja2? Yes, can you avoid it for a lot of these use-case mentioned. No... just browse stack overflow and see the blockinfile, lineinfile and jinja2 configuration methods....
So that is what we had... and we showed this to users, who were confused why you would create two separate entries for the same resource. Why would I use a separate module to gather info for X resource.... it was confusing from our user experience. Again this is based off user feedback.
This is a developer mindset versus a user. There is a lot of Ansible developers on this thread, but not a lot of users, trainers, etc which are just as important to the community. The users are confused WHY you would use a separate module to gather the exact same info from a resource. Just because you CAN do something doesn't mean its intuitive for novice users. Folks are thinking we make these decisions in a vacuum and that is not the case, we meet with field organizations weekly across multiple geography units, multiple domains, etc. We are literally being asked over and over "why can't we do this for use-case X (which is not security or network automation)" |
I'm not really sure what you are trying to get at. There are always different ways to configure things, no matter on what kind of device or for what kind of service. There is nothing stopping you from creating a module (pair) to configure httpd.conf, nginx.conf, or anything else in the same way as a network device using regular modules and _info modules. Just because web servers are commonly configured by templating config files doesn't mean you could have a module which allows to configure a Apache virtual server, and to query its configuration in the exact same format as you can put into the configuration module.
How is that related to resource modules vs. module + _info module pairs?
I again ask you: can you give a concrete example? You need the exact same number of tasks for operations with resource modules than with regular modules (if these are written as module + _info module pair using similar principles than resource modules) -- except in the one case where you template the state parameter (there you'd need a lot more templating to achieve the same thing with module + _info module pairs). But I don't think you are aiming at this very special case. |
community general has a
If the above had been developed following the resource module pattern, the ambiguity and need for data transformation would not exist. (editing for formatting) |
My vote: Option 1, allow developers to choose the pattern for their content and do not deny inclusion into the Ansible package because a developer believes the resource module pattern provides the best experience for the users of their content. (thanks @cidrblock ) This will futureproof any new methods and show inclusiveness as an open source project. Isn't the reason why we went to Collections was to lower the barrier to entry for contributors? Sounds like the project is slowly reverting back to being a bottleneck for innovation... |
You can always pick out specific module + _info pairs which do not follow similar principles than resource modules. That does not prove anything, and is the main reason I always add the additional qualification "with similar principles than resource modules".
For many existing module + _info module pairs this is not possible (and often was never intended to be possible). But as I said, "with similar principles than resource modules".
Exactly. But nobody stops you from creating module + _info module pairs which satisfy this.
That is also not guaranteed for resource modules. You can have a promise, as with resource modules, but that promise you can also give for specific module + _info module pairs.
For many existing module + _info module pairs this is not possible (and often was never intended to be possible). But as I said, "with similar principles than resource modules".
Yes, but as I said (I'm really repeating myself): this has nothing to do with resource modules per se. You can define very similar guidelines for module + _info module pairs, and uphold that with the same rigor as you uphold these properties for resource modules. |
Collections can do whatever they want. There is no restrictions. What's different is the Ansible community distribution. It is a collection of modules that should stick to a common set of guidelines. If a collection choses to ignore these guidelines, they will have trouble getting included. But nobody stops the collection from being available on Ansible Galaxy (or other places in the web), and nobody stops users from installing it and using it. |
As a user who likes the networking resource modules, my current preference is for option 1, but before I commit to that, I have a question: What is the drawback of including the Resource Module pattern? From what I've read, it seems like it's mostly based on the opinion that it causes confusion, or that it is somehow not "the Ansible Way", but most of the arguments for including it seem to be centred around feedback from users that they prefer that pattern. I guess I'm just confused about what the downside of this is? |
The resource module pattern violates the accepted principal that an Ansible module or plugin should follow the unix "philosophy of doing one thing well". See https://docs.ansible.com/ansible/latest/dev_guide/developing_modules.html Based on that principal, there are accepted ways to implement a module or plugin.
|
I added my vote but I didn't elaborate on it, so in addendum. When I look at this with my Ansible Community hat on, Brad and the network folks have a well documented and scoped, prescriptive architecture that is already in widespread adoption. That architecture is guided by a team of trusted engineers who are experts at Ansible development. While "most" modules - except that one exception, ok really two exceptions, and maybe a few over here - are developed in a generally consistent way to each other I'd disagree that there's only one obvious or "correct" way to write modules. I can point to probably half a dozen different ways to write modules in the AWS collections alone that all presumably seemed obvious to the original author(s). We have widespread adoption of the resource modules pattern. It's enabled a bunch of people to create Ansible content, whether we as individual developers like the way that code smells or not. The resource module pattern and the content developed from it seem to be far more consistent than many modules or collections and that's a huge win for our users. I can't see any downsides other than "we didn't use to do it that way". IMO the right thing for the health of the community is to +1 it |
@sivel I see a lot of "shoulds" in those definitions, not "musts". I have spent a lot of time working on network interop problems that often tracked down to interpretation of standards with the words "should" and "must", so I apologize for being pedantic about this. Are we talking about removing all the existing resource modules from the distribution? Are we going down the path of another fork that includes Orthodox Ansible + Networking? Can someone explain the negative impact on the Ansible project or community from adopting this module pattern? Preferably the ELI5 version, because I think I'm still missing something. |
The philosophy of doing one thing well can be interpreted in many ways and can be applied to Ansible module development
While there is consensus that the first pattern is not correct, saying that only your interpretation is correct and the resource module pattern interpretation is a violation doesn't go well with the bulk of Ansible content "community" (Yes I and many others like me who can't be part of IRC meetings due to various reasons consider themselves as very much part of Anisble community). In the end, I believe whichever philosophy and it's interpretation we adhere to the end goal should be to make the life of users/playbook writers easier and not that of module developers :-) |
@sivel already elaborated on this. Next to the link provided by him, https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_best_practices.html#scoping-your-module-s is pretty central as well.
We are not. We also did not remove certain other modules (exceptions) that violated some of these principals after they have been added in the past. "gathered" is not a state
For me, the main negative impact is that resource modules encourage two bad practices:
Resource modules were introduced despite them being in violation to guidelines that were set up a long time ago. It would definitely be better if the parts of resource modules which violate these principles - I think the main issue is
What's a "ELI5 version"? (I assume you do not mean https://eli5.readthedocs.io/en/latest/) |
As you mentioned the guidelines were setup a long time ago and and long time ago Ansible was mainly used to manage server infrastructure. Over time Ansible uses cases has evolved into managing cloud, network, security, edge verticals and maybe it's time to revisit the guidelines. BTW based on the guidelines that @sivel shared
I see number of module following the info pattern violates above guideline and returns information about inventory resource as part of info module response. Maybe that's because the API used to fetch information doesn't understand what is Ansible inventory and non-inventory resource. |
That is definitely true, but there's a big difference between discussing whether to change guidelines before starting to ignore the guidelines large-scale and discussing whether to change guidelines after starting to ignore them large-scale. Instead of being at a point where we can discuss whether it makes sense to adjust the resource module specification to better fit with the existing guidelines (which could also involve adjusting the guidelines), we are now in a situation where we basically have to say either yes to the complete resource module specification, or say no to it. This really sucks and wastes a lot of time for everyone, and is extremely frustrating. |
If the Yes makes resource modules design to co-exist with existing present/absent and facts/info modules, giving the module creator an option to go with either of the two and in turns gives users and community best way possible for configuring objects be it network/security or any other domain which justifies and is suitable for the use-case, I don't understand why there's a need to hard-line the specification and limit the design to follow just one of the two. |
First, apologies for the reddit term, ELI5. "Explain it like I'm five years old". Based on the discussion, I think the ELI5 message is: "Resource modules have implemented a pattern which is contrary to the guidance of the project. This has created two ways of gathering information from resources; the '_info module' pattern, and the 'state: gathered' pattern. The Resource Module pattern was accepted into the project years ago, and is now in widespread use by end-users (mostly in the networking discipline). Removing those modules now would negatively impact those users. Allowing more modules that implement the Resource Module pattern would create ambiguity that could cause confusion for other users." Damned if you do, damned if you don't. I understand the frustration. This discussion leads me to a question (which I'm going to ask even though it may make me unpopular): Is there a reason (other than a historical decision) for having separate _info modules? It feels strange to me that all operations except "get" operations are carried out by one module. Especially since the module that makes changes has to "get" state anyway. Personally, I prefer the Resource Module pattern over the _info or _facts pattern, because I like the idea of having one entity managed by one module, and I find the creation of "read-only" modules creates a lot of noise when I'm looking through docs, trying to find the module I need (eg: the azure modules). |
I may be wrong but I've always looked at it as something like: _info modules perform Read operations and no other. A user should have complete confidence that running an _info module will never perform and Create, Update, Delete (CUD) operations. _info modules are for safely gathering information for the purpose of being used elsewhere (an assertion, another task, etc). A non-info module should be expected to make some CUD action unless run in check_mode and needs extra safety (like using check_mode) when testing or validating playbooks. I can appreciate that 5 years ago the community might not have agreed to the resource module pattern because it violates the principles discussed here. But that ship has sailed at this point. There were years for people to hash out concerns over the resource module pattern. Y'all, we built something and people are using it. Some of them are using it in ways that maybe we didn't anticipate, and maybe even ways we don't always like. But they're using our software to do awesome stuff. It's not like this proposal is to throw away all semblance of caring about what modules or collections look like, it's a proposal to formalize acceptance of an already established and well articulated additional specification. Let's help our users and community keep doing awesome new things with Ansible. |
That is sort of to the point about what I'm getting at. The Network Resource modules have expanded the possible operations on a resource. When Ansible was started, I think the number of potential operations was limited to "read the current state" (_info) and "enforce the desired state". Over time, the number of operations has expanded, and the network resource modules expanded it even more for network-specific operations, for example: 'rendered' to generate the commands required to configure a network device to the desired state, without actually enforcing that state. For me, the module paradigm makes more sense to be: "a module encapsulates the possible operations on an entity". Having separate _info, or _facts modules just for read-only operations feels counter-intuitive, even though I realize it is the established, and documented pattern. |
I don't think there's much gained, from a confidence perspective, that a _facts, or _info module only performs read-only operations, because that is ultimately a result of the access control mechanisms of the target resource. Ultimately, the credential used determines what a playbook can do, not the format of the modules. I think
is just as intuitive as
Also, I know a whole bunch of Cisco IOS 'show' commands that can cause significant performance degradation of the host device. Theoretically, they are read-only operations, but they are far from safe to execute in production. Relying on the module naming paradigm to have confidence you're doing something safe is just false security. All that said, I still don't have confidence in the "right" solution to this problem, because as felixfontein has pointed out, there have been conflicting implementations of the same functionality for years now. My feeling is to let both continue to exist, even though it causes a little confusion. We can address that confusion in the documentation, as this issue suggests. I look at this as evolution of the language, not tainting it. |
My vote is Yes Option 1. |
I am for resource modules for being included. Strategically, this is a path forward that we need to work towards. I am not for these discussions that are binary. There is a lot of merit to this as a type. Let's have a better discussion around how to leverage this than make it a yes or no. |
My vote: Option 1 Developers should be given the choice to follow the resource module pattern based on what users need. I agree with @IPvSean . The feedback we receive from the field and customers is that the potential use-cases based on resource modules are suited to what they need, especially in the security space. Giving users a pre-defined, declarative model to use makes adoption easier, which is aligned with Ansible's principles. https://github.com/ansible/ansible#design-principles |
Just to chime in; I find the resource module approach to cover a wide basis - they seem like they provide a mechanism to develop functionality that is desired by the community. This is very visible in the network space as it was never really feasible to template out a configuration for an entire device because reliably applying it would require rebooting into the configuration. Not exactly desired most of the time nor is it like rebooting a service on a single box at a time. Could it have been accomplished in a different pattern, possibly. I say keep it and if other areas find it a useful way to express things then so be it. Option 1 for me. |
@gundalow @acozine @ssbarnea @jamescassell your votes are still missing in this discussion. |
I am also for Option 1. As a side note, this thread grew so much is hard to follow it. Others remarked the problem of guidelines/practices not all being in a single place. I do agree that we should unify them. For example when I tell people to follow collection repository layout for all their ansible content, they often ask "why? I do not have a collection and we do not have plans to publish one". That is a serious issue and likely more people don't understand that the standard layout is not only for collection. Until we refactor our documentations to clarify these things, we will still face confused users. |
Firstly, I'm really happy that people have taken the time to get involved in the discussion, that's the whole reason we have community-topics. There is lot's of great feedback in here, thank you. I think as various people have said this is a difficult one, and likely not a binary decision. To recap where we are right now:
Personally, I think the resource modules (as they exist today) are good as:
I'd like to suggest that we accept Resource Modules are they are today, and therefore accept Moving forward, I think given the expertise of this group, would it make sense to:
|
I'm abstaining. I agree with the argument for separating read-only operations from change operations. I also agree that resource modules have been in use for a long time and have become an accepted alternative. Personally when I'm writing playbooks I prefer each module to do one thing and do it well, and I prefer more restrictive |
We discussed this issue in the community meeting and came to the following conclusion:
So as the IRC log excerpt says, we accepted the resource module pattern as an official way of writing modules. Thank you all for voicing your opinions! |
Summary
Right now, the development conventions section in collection requirements use general best practices as its base. And while those general rules do offer a good baseline, parts of the Ansible ecosystem developed their own set of best practices when developing Ansible content.
For example, resource modules have their own set of conventions that contradict the general best practices in certain places.
Maybe we should expand the best practices section and add some specialized subsections for things like network collections?
The text was updated successfully, but these errors were encountered: