-
Notifications
You must be signed in to change notification settings - Fork 82
Include cloud-init-vmware-guestinfo in core cloud-init package #2
Comments
Hi Greg, That’s always been my intent, but I just haven’t had the time. Is the process simply to file a ticket via the URL you provided? |
I also agree with @MnrGreg, this is something that should be in cloud-init, I have found this pretty useful as well. |
Hi! Upstream cloud-init here :) We have received a number of requests to get this datasource added to cloud-init and would be happy to see that done. Since this bug was filed, we moved upstream cloud-init to our Github project. In order to get this added to upstream, we would ask you to submit a merge request and sign the Canonical CLA. Further details are in our hacking document. If you have quesions feel free to drop in out #cloud-init on Freenode or join our mailing list. I would also be interested in reaching out directly to VMware for some additional assistance with VMware and cloud-init compatibility. How to use cloud-init with the larger variety of VMware products is a very common question and getting both sides on the same page would be great! Thanks! |
Hi @powersj, Thank you so much for reaching out! I am flattered that so many people have an interest in what started as a personal project. I am akutz at vmware if you would like to e-mail me directly. As I said above, it has always been my intent to add this datasource to cloud-init proper, but it is also the case that it being added after the fact enables greater flexibility and more frequent releases. That said, I am lazy and do not do releases that often :) |
This would be absolutely amazing. |
I'm planning on spending some cycles on this next quarter, will link to the cloud-init PR once I have it ready (~Aug) Some initial changes in progress was removing dependency on |
@janitha awesome, please do feel free to stop by #cloud-init on Freenode and ask questions. |
Hi @janitha, Removing the |
Has there been any movement on this feature? |
@akutz Is this ever going to happen? What can we do to push this along? This module just broke on our machines with the latest Ubuntu 18.04 update.. I don't have time to figure out why so I disabled cloud-init and static configured the network on the important boxes. Hopefully I'll get some cycles to dig deeper on the failure sometime in the next month... Can we please get this moved into the community cloud-init project where it will be able to be maintained and tested by the greater community? It's a very useful data bridge between Vsphere and VM's but when I reboot my machines and the datasource fails completely it creates complete downtime for the box until someone goes in on the console and manually fixes the network. It really doesn't do well to the 'VMware' brand to have this here in the corporate repository as a pet project either.. |
Hi @skyscooby, I understand your concerns, and I sympathize with you. I am not on this project full time -- it's something I created when I needed it, but I should certainly be doing more to try and keep it up to date. If someone can run with it, then I'd be more than happy. The primary impediments are the two external dependencies, and there is no good way around the netifaces one. I was speaking with someone the other day -- what I should do is get CI set up to automatically run this against multiple versions of multiple distros, but again, it's a matter of time that I do not currently have :( |
@akutz Much appreciate the honesty. I see a large part of the breakages happening due to changes in the upstream project being done out of band.. if this module was sitting in their tree I would assume it would be kept up to date when they change the class interface or module import paths. I'm not sure the Datasource interface has been fully designed with third party stability in mind.. that said my first question is really about what is the special case with this datasource that requires it to have dependencies on module that none of the other cloud init data sources seem to require (otherwise the dep would already be available)? Perhaps I am not using the full functionality of this datasource so I may not understand the edge case here.. but all we do is pump some encoded data into the Vpshere API using our automation tooling... which we want passed into cloud-init... cloud-init itself should mostly be handling the system level details to apply the desired configuration. There is a long list of other datasources which just work when passed a cloud-config... |
On mobile, but it boils down to DHCP and instance data. Go look for the commit where i introduced instance data support. EC2, GCE, those don’t rely on DHCP. We do. There’s not metadata service or IPAM native to vSphere, so if we want to avail ourselves of instance data that includes the network config, we have to write that ourselves. And that is discovered with netifaces. I’ll Add more when at a laptop |
@akutz OK that's one area that differs for our use case.. we are passing static network configuration through guestinfo.metadata ... but even with the DHCP case why do we need the instance-data.json to be retroactively updated with actual IP information should it not just contain the metadata as supplied by the user.. (ie that the interface should be configured with DHCP?)... in either case cloud-init would then take that and do whatever distro-specific actions are necessary to configure the interface for DHCP.. and the DCHP server would configure the interface down the line. Are you collecting IP / inventory information by going back and using the cloud-init file? or am I missing something? Is there some functionality in cloud-init that does not work correctly if we don't update instance-data.json with the allocated DHCP IP information? I checked on our failed nodes and when I run the datasource from the command line it hangs.. when I break out I can see it's gotten stuck in the wait_for_network sleep loop (even tho we are not passing a request to do so through the metadata).. so I am guessing something in this logic is what is causing this issue for us... when I manually re-configure the network and run the datasource it works again. |
I would gander that if there was expected cloud-init functionality that instance metadata be dynamically updated with a 'live' view of the systems network state that would and should be implemented in cloud-init core once and wouldn't be provider specific... From all I gather the cloud init metadata is only to be passed in one direction from the "cloud" -> instance. |
The expectation is that a cloud platform is supposed to provide the cloud instances with their network information -- and again, that doesn't exist or happen on vSphere environments with DHCP. This functionality simulates that behavior as it is required for many systems that rely on cloud-init instance metadata queries for discovery of network information. Every other cloud provider is able to indicate in advance what the networking information is and include it in the instance data where it is queryable and you can use Jinja to interpolate your user data with the instance data. Please see https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#using-instance-data for an example. Because vSphere often relies on DHCP, that information is not available in advance. This is not something that cloud-init provides. This is something the data source is supposed to provide. This functionality was introduced to allow Kubernetes images that are bootstrapped with cloud-init and need to know information that is only available after the network is online to still be used with the same templates every other cloud platform uses. |
Thanks very much for the explanation, that last bit helps me understand the goal you had in mind when building it, basically shimming in functionality to support existing templates that might require IP information at boot time.Indeed DHCP is not a typical use case in the cloud (unless it's supported in some VPC implementations.. not sure). I found the exception in the debug output that is causing me troubles.. are you aware of this
|
Disregard I see a Fix in #53... I will have to re-bake all my base images or pin cloud-init from being updated from upstream I guess. |
Actually #54 fixes this... master is still broken on latest ubuntu 18.04 cloud-init version at the moment. |
I'm in the same boat as you until the fix gets merged. I've created a number of images that utilize this and our infrastructure is dependent on those images now. I assumed that this had the full support of VMWare and wasn't one person's side project. Knowing that now makes me uneasy. I might pin cloud-init regardless from here on out. (This is not a knock on you at all @akutz, the work you've done is appreciated and this module has been very useful.) |
I totally understand. We do have coverage for OSs like Photon as this datasource is included with Photon OS, but as @skyscooby mentioned above, this datasource exists specifically to address gaps, such as DHCP on vSphere as opposed to static IPs via IPAM like other cloud providers. This makes it difficult to test with a wide range of operating systems. As I noted above, I've looked into setting up CI pipelines internally to test against cloud-init
Of course. People need software to be reliable. I do not take these concerns personally, and I appreciate everyone's patience. I am totally open to this going upstream, but I do not see a path forward for that without the inclusion of the netifaces dependency. That is the foundation of the feature set used by projects like Cluster API and VMware's own Project Pacific for bootstrapping machines. |
Would it be possible to split this into two different datasources in this case, where one of the datasources just queries for the static configs (like setting static addresses in metadata) found in the vmx file using vmware-rpctool, and a separate datasource that handles the dhcp nature of the above issues. That way, it would be possible to upstream the section that specifically reads from the vmx configs, the portion needed for reading dhcp net can remain here in this datasource. |
Hi @weikinhuang, Before I did something like that I would rather modify the datasource to only import the netifaces module when/if that feature was activated, making activation conditional on some configuration element. |
but wouldn't that still mean it would depend on netifaces, and therefore still make it hard to include upstream? |
Hi @weikinhuang, Python's imports only matter when they are actually loaded, and that can happen:
Python 2 $ python2 - <<EOF
if True:
import netifaces
else:
import netifaces2
print('hi')
EOF
hi Python 3 $ python3 - <<EOF
import os
if True:
import netifaces
else:
import netifaces2
print('hi')
EOF
hi |
oh sorry, I meant with regards to the coding style of the upstream cloud-init project. |
Could you please clarify this? Your statement was:
I took that to mean the dependency as the issue. How does that relate to the coding style? Perhaps you mean project rules and governance, such as may be related to external dependencies? |
yep |
Hi all, I just opened #56 to remove the dependency on |
@akutz was the objection from upstream not related to pulling netifaces from pip vs using it in general? It seems that rhel7 / rhel8 and ubuntu and others for some time has included python-netifaces packages in the distro itself.. https://pkgs.org/search/?q=netifaces. I wouldn't think they would have issues declaring a dependency on software that is part of the distro proper. I also thought they were open to vendoring in the pip dependency as well or was that ruled out? It would mean ensuring the use of netifaces in the module is compatible with all the different versions of netifaces each distro includes tho.. |
Hi @skyscooby, The pip thing is a non-factor. That's just how it's installed with curl bash. I don't really care where netifaces originates. |
@akutz |
First, great to see the conversation here :) Next, this week is the cloud-init summit and it is open to the community: https://discourse.ubuntu.com/t/cloud-init-online-summit-sept-23-24/17867 It would be very, very nice to have someone there to talk about their usage of cloud-init + VMware. Last year we were not able to have anyone from VMware and it is something that many of the other clouds and distro maintainers have requested. Finally, if you do go down this route, make sure it is python3-netifaces. Not saying we would add the dependency, but upstream cloud-init only supports Python 3.4 and newer as of right now. I am sure this will be re-evaluated at the summit. |
Hi @powersj, That’s the dep I use when building the RPM, so that should work. Thanks for reaching out; I’m on mobile, so my response will be brief for now. |
@powersj I am happy to attend the summit as long as remote attendance is acceptable. I could discuss the purpose of this data source and how it helps address a gap with cloud-init bootstrapped systems on vSphere. |
@akutz yep this is 100% remote! |
@powersj Awesome! I reached out to some people a few days ago about this project, and there does appear to be interest in moving ahead with getting this contributed to cloud-init proper. Maybe if I have a chance to discuss the reasoning behind the netifaces dependency at the summit, others will be on board with allowing that. I opened #56 to remove the dependency on deepmerge, so netifaces is the only one left standing. |
@powersj What are the next steps for me to attend the event? I'm akutz at the company for which I work if you would like to e-mail me. Thanks! |
Mail sent! Thanks! |
Hi, has there been anymore progress on this? Seemed to be picking up steam. |
Would love to see this move forward as well... |
+1 on this |
FYI, I'm renaming this |
Merged upstream into cloud-init |
Hi @akutz,
This module works really well. Have you considered submitting a PR for it to be added to the core cloud-init package? https://git.launchpad.net/cloud-init
I'm sure more people would use it if they were aware of it. Many could benefit.
The text was updated successfully, but these errors were encountered: