provider/openstack Compute Network Refactor #1347

Merged
merged 14 commits into from
Apr 2, 2015


jtopjian
Contributor

@jtopjian jtopjian commented Apr 1, 2015

The following are a series of changes that I was working on during the OpenStack provider development. Unfortunately these changes did not make it for the initial merge.

I have a lot of notes and comments scattered across previous PRs, but in an attempt to start fresh, I'll try to lay everything out here.

First, to make sure everyone is on the same page, I want to define some areas:

  • Floating IP: Analogous to an Elastic IP in AWS
  • Neutron: An OpenStack Network service that is decoupled from the Compute service (Nova)
  • nova-network: The original network service that is integrated in Compute (Nova)

Neutron, nova-network, and Nova:

nova-network is sometimes labelled as "legacy". While it's true it was previously deprecated, the deprecation status was removed some time ago. As of last week, there were still no plans to deprecate nova-network. The recent user survey shows that 30% of OpenStack clouds use nova-network.

The reason I'm placing so much emphasis on this is that if Terraform is going to officially support OpenStack, nova-network should be considered a peer to Neutron for the foreseeable future. Calling it "legacy" is the same as saying Neutron is still in development.

Supporting two network services can definitely cause headaches, but fortunately, the areas where Nova and Neutron cross paths are taken care of naturally through the Nova API. A lot of this was already done with this provider, but the one area that was left untouched was Floating IPs.

During the development of this provider, Floating IP support was initially added using the Neutron API. This was because gophercloud did not support Floating IPs through the Nova API. While this worked for Neutron-based clouds, it prevented anyone running nova-network from using Floating IPs.

gophercloud was patched to support floating IPs via the Nova API and I set out to remedy Terraform. Along the way I learned that the Nova API can also handle the allocation, deallocation, association, and disassociation of Neutron-based Floating IPs.

There are areas in OpenStack where it makes sense to strictly use the Neutron API with Floating IPs. This is why two Floating IP resources exist (a networking and compute one). However, when it comes to working solely with Compute, the Compute API can be used regardless of the underlying network service. This holds true for Neutron Security Groups as well, which have always used the Nova API.
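As a rough illustration of the four operations handled through the Compute API, here is a minimal Go sketch that only builds the request paths and JSON bodies. It assumes the os-floating-ips extension paths and the addFloatingIp/removeFloatingIp server actions of the Compute v2 API; the `request` type and the helper names are hypothetical, no HTTP call is made, and this is not the provider's or gophercloud's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a hypothetical container for one Compute API call.
type request struct {
	Method string
	Path   string
	Body   string
}

// mustJSON marshals v; the inputs here are plain maps, so errors are not expected.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// allocateRequest asks Nova for a new floating IP from the given pool.
func allocateRequest(pool string) request {
	return request{"POST", "/os-floating-ips", mustJSON(map[string]string{"pool": pool})}
}

// deallocateRequest releases a floating IP by its ID.
func deallocateRequest(id string) request {
	return request{"DELETE", "/os-floating-ips/" + id, ""}
}

// associateRequest attaches an address to a server through a server action.
func associateRequest(serverID, address string) request {
	body := map[string]map[string]string{"addFloatingIp": {"address": address}}
	return request{"POST", "/servers/" + serverID + "/action", mustJSON(body)}
}

// disassociateRequest detaches an address from a server.
func disassociateRequest(serverID, address string) request {
	body := map[string]map[string]string{"removeFloatingIp": {"address": address}}
	return request{"POST", "/servers/" + serverID + "/action", mustJSON(body)}
}

func main() {
	// Placeholder server ID and documentation-range address, for display only.
	for _, r := range []request{
		allocateRequest("public"),
		associateRequest("SERVER_ID", "203.0.113.10"),
		disassociateRequest("SERVER_ID", "203.0.113.10"),
		deallocateRequest("31"),
	} {
		fmt.Println(r.Method, r.Path, r.Body)
	}
}
```

The same four calls apply whether the cloud runs nova-network or Neutron, which is the point of routing them through Compute.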

References:

So, there's the history.

Now, for the commits included, here's what each one is doing:

  • ea228ad: Simply swaps the Neutron-only APIs out for os-floating-ips. As per the commit message, it works with both network systems but has one drawback (will discuss later). At a minimum, this is the only commit that needs to be merged, and a lot of nova-network users will be very happy. Beyond that, it gets rid of a lot of Neutron code that is already handled by Nova. I'd be happy to simply submit this single commit as a PR.
  • 3ebe55c: Checking for a network called "public" is only beneficial when the cloud actually has a network called "public". If checks like this should be enabled, might as well have other checks for networks called "default", "external", "provider", "internet", and the myriad of other names I've come across.
  • ad1bfce: IPv6 is a real thing!
  • 9a12fb0: The ability to specify anything by something other than a UUID only makes for a better end-user experience.
  • 58af3cb: The network information available to the instance is available from two different locations: the instance itself and the networks the instance is attached to. This commit attempts to make sense of all of that and combine the information where appropriate. You'll see a much more detailed result of terraform show with this patch.
  • c6a81c5: I meant this patch. :)
  • b6f524a: It looks as though MAC addresses aren't available on Nebula-based OpenStack clouds.
  • e605f31 and 2c83073: Typos and cleanup
  • 173a215: updated acceptance test. It looks a little strange because the environment variables aren't actually making it to the acceptance tests. References:

The Drawback

I mentioned a drawback. With Neutron environments, when specifying multiple networks, the network that will receive the floating IP attachment must be the first network specified in the instance resource. #1342 has a similar drawback, but it looks like it would be the last network?

I have a patch into gophercloud to remedy this, and I also have the corresponding terraform code ready, too, but I want to wait until gophercloud is patched.

Other References

So that's my case! I apologize for being overly verbose. It seems a little silly for a single feature of Floating IPs, but since this is a new audience, I wanted to lay everything out.

I'd love to get some feedback -- especially if there are mistakes. I'm in no way saying this is all perfect and without error, but I've been working with this issue, trying different ways of solving it, digging into code, and testing on multiple clouds for six weeks now.

@phinze
Contributor

phinze commented Apr 1, 2015

@jtopjian Ah, reading this I get more context around that other PR. Does this supersede #1342? If so I'd be willing to revert that so we can review this and get it in.

Clearly I chose the wrong order to walk through PRs today. 😀

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze It does, yes. #1342 did help with regard to the existing Neutron-only floating IP support, but this PR would indeed end up removing all of that.

I don't mind rebasing this PR against #1342. #1342 will help a few people during the period of time when this PR is reviewed and possibly (hopefully) merged.

@phinze
Contributor

phinze commented Apr 1, 2015

Ok, let's plan on rebasing this. My DevStack single-VM environment (http://docs.openstack.org/developer/devstack/guides/single-vm.html) just finished bootstrapping, so I'll take this for a spin! 👍

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze rebased and tested in my clouds. Do let me know if you run into issues.

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

aw, crap. commit log got messed up. I'll fix that.

This commit causes the resource to manage floating IPs by way of the
os-floating-ips API.

At the moment, it works with both nova-network and Neutron environments,
but if you use multiple Neutron networks, the network that supports the
floating IP must be listed first.

This is only possible if the OpenStack cloud explicitly has a network
called "public".

This commit allows the user to specify a network by name rather than
just UUID. This is done via the os-tenant-networks API extension.
This works for both Neutron and nova-network.

This commit changes how the network info is read from OpenStack.
It pulls all relevant information from server.Addresses and merges
it with the available information from the networks parameters.
The access_v4, access_v6, and floating IP information is then
determined from the result.

A MAC address parameter is also added since that information is
available in server.Addresses.
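
The merge described in the last commit message above can be pictured with a short Go sketch. It is only an illustration under stated assumptions: the `address` and `network` types, `mergeNetworks`, and `accessIPs` are hypothetical names, the address entries are a simplified stand-in for what server.Addresses carries, and the rule that a floating IP takes precedence for the IPv4 access address is an assumption rather than the provider's confirmed behaviour.

```go
package main

import "fmt"

// address is a simplified, hypothetical view of one entry in a server's
// address list: the IP, its version, whether it is fixed or floating, and
// the MAC address when the cloud reports one.
type address struct {
	Addr    string
	Version int    // 4 or 6
	Type    string // "fixed" or "floating"
	MAC     string
}

// network holds the per-network fields an instance resource could expose.
type network struct {
	Name, UUID, FixedIPv4, FixedIPv6, FloatingIP, MAC string
}

// mergeNetworks combines the user-specified networks with the addresses
// reported by the server, matching on network name.
func mergeNetworks(specified []network, addresses map[string][]address) []network {
	merged := make([]network, 0, len(specified))
	for _, n := range specified {
		for _, a := range addresses[n.Name] {
			switch {
			case a.Type == "floating":
				n.FloatingIP = a.Addr
			case a.Version == 6:
				n.FixedIPv6 = a.Addr
			default:
				n.FixedIPv4 = a.Addr
			}
			if a.MAC != "" {
				n.MAC = a.MAC
			}
		}
		merged = append(merged, n)
	}
	return merged
}

// accessIPs derives the access addresses from the merged result. Assumption:
// the first floating IP, if any, replaces the first fixed IPv4 address.
func accessIPs(nets []network) (v4, v6 string) {
	var floating string
	for _, n := range nets {
		if floating == "" {
			floating = n.FloatingIP
		}
		if v4 == "" {
			v4 = n.FixedIPv4
		}
		if v6 == "" {
			v6 = n.FixedIPv6
		}
	}
	if floating != "" {
		v4 = floating
	}
	return v4, v6
}

func main() {
	// Addresses matching the Nebula test reported in this thread.
	addrs := map[string][]address{
		"nebula": {
			{Addr: "10.0.0.67", Version: 4, Type: "fixed"},
			{Addr: "10.29.92.49", Version: 4, Type: "floating"},
		},
	}
	nets := mergeNetworks([]network{{Name: "nebula"}}, addrs)
	v4, v6 := accessIPs(nets)
	fmt.Printf("%+v\naccess_ip_v4=%q access_ip_v6=%q\n", nets, v4, v6)
}
```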
@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

OK, better.

This allows the obtained network information to be successfully stored
for environments that do not require a network resource to be specified.
@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze just a heads up: one additional commit.

@hartzell
Contributor

hartzell commented Apr 1, 2015

By way of testing, this config file

resource "openstack_compute_floatingip_v2" "test" {
  pool = "nebula"
}

resource "openstack_compute_instance_v2" "test" {
  name = "tf-test"
  image_id = "62896feb-0c93-49bd-93a3-23f597d3f9ec"
  flavor_id = "n1.small"
  key_pair = "alanturing-nebula-keypair"
  floating_ip = "${openstack_compute_floatingip_v2.test.address}"
  security_groups = ["default"]
}

works using "my" nebula cloud with jtopjian's compute-network-refactor branch cloned a couple of minutes ago:

terraform>>git log | head
commit 4244d0947ec6c3d842a1781a5e7c22ffddb0a459

@hartzell
Contributor

hartzell commented Apr 1, 2015

Whoops. Found a rough spot. I'm running against my Nebula system (nova networking) with jtopjian's compute-network-refactor branch cloned earlier this morning.

terraform>>git log | head
commit 4244d0947ec6c3d842a1781a5e7c22ffddb0a459

Given this config file:

resource "openstack_compute_floatingip_v2" "test" {
  pool = "nebula"
}

resource "openstack_compute_instance_v2" "test" {
  name = "tf-test"
  image_id = "62896feb-0c93-49bd-93a3-23f597d3f9ec"
  flavor_id = "n1.small"
  key_pair = "alanturing-nebula-keypair"
  floating_ip = "${openstack_compute_floatingip_v2.test.address}"
  security_groups = ["default"]
}

I can apply and show and things look good in the GUI. Unfortunately, terraform destroy fails trying to make a call against os-tenant-networks. E.g.

(alacrity)[13:27:27]terraform-playground>>terraform apply
openstack_compute_floatingip_v2.test: Creating...
  address:     "" => "<computed>"
  fixed_ip:    "" => "<computed>"
  instance_id: "" => "<computed>"
  pool:        "" => "nebula"
  region:      "" => "RegionOne"
openstack_compute_floatingip_v2.test: Creation complete
openstack_compute_instance_v2.test: Creating...
  access_ip_v4:      "" => "<computed>"
  access_ip_v6:      "" => "<computed>"
  flavor_id:         "" => "n1.small"
  flavor_name:       "" => "<computed>"
  floating_ip:       "" => "10.29.92.49"
  image_id:          "" => "62896feb-0c93-49bd-93a3-23f597d3f9ec"
  image_name:        "" => "<computed>"
  key_pair:          "" => "alanturing-nebula-keypair"
  name:              "" => "tf-test"
  network.#:         "" => "<computed>"
  region:            "" => "RegionOne"
  security_groups.#: "" => "1"
  security_groups.0: "" => "default"
openstack_compute_instance_v2.test: Creation complete

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate
(alacrity)[13:27:55]terraform-playground>>terraform show
openstack_compute_floatingip_v2.test:
  id = 31
  address = 10.29.92.49
  fixed_ip =
  instance_id =
  pool = nebula
  region = RegionOne
openstack_compute_instance_v2.test:
  id = 8fba7fb2-3e58-4333-a483-124fb0d9772a
  access_ip_v4 = 10.29.92.49
  access_ip_v6 =
  flavor_id = n1.small
  flavor_name = n1.small
  floating_ip = 10.29.92.49
  image_id = 62896feb-0c93-49bd-93a3-23f597d3f9ec
  image_name = CentOS 6.6
  key_pair = alanturing-nebula-keypair
  name = tf-test
  network.# = 1
  network.0.fixed_ip_v4 = 10.0.0.67
  network.0.fixed_ip_v6 =
  network.0.mac =
  network.0.name = nebula
  network.0.port =
  network.0.uuid =
  region = RegionOne
  security_groups.# = 1
  security_groups.0 = default

(alacrity)[13:30:16]terraform-playground>>terraform destroy
Do you really want to destroy?
  Terraform will delete all your managed infrastructure.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

openstack_compute_floatingip_v2.test: Refreshing state... (ID: 31)
openstack_compute_instance_v2.test: Refreshing state... (ID: 8fba7fb2-3e58-4333-a483-124fb0d9772a)
Error refreshing state: 1 error(s) occurred:

* 1 error(s) occurred:

* Expected HTTP response code [200 204] when accessing [GET http://proxy.nebula.gene.com:8774/v2/6ba254b787054e23a3abe5030f416b4f/os-tenant-networks], but got 404 instead
404 Not Found

The resource could not be found.


(alacrity)[13:30:22]terraform-playground>>

@hartzell
Contributor

hartzell commented Apr 1, 2015

ps. terraform refresh seems to have the same problem:

(alacrity)[13:30:57]terraform-playground>>terraform refresh
openstack_compute_floatingip_v2.test: Refreshing state... (ID: 31)
openstack_compute_instance_v2.test: Refreshing state... (ID: 8fba7fb2-3e58-4333-a483-124fb0d9772a)
Error refreshing state: 1 error(s) occurred:

* 1 error(s) occurred:

* Expected HTTP response code [200 204] when accessing [GET http://proxy.nebula.gene.com:8774/v2/6ba254b787054e23a3abe5030f416b4f/os-tenant-networks], but got 404 instead
404 Not Found

The resource could not be found.

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@hartzell Ah - I know where that's coming from. It's a Nebula-specific item and I should be able to easily fix it. I'll let you know.

@phinze
Contributor

phinze commented Apr 1, 2015

Okay @jtopjian - I'm making progress!

I get all the tests to run, but I'm seeing most of the floating IP tests fail.

https://gist.github.com/phinze/e2aa7cc286ade0379abc

I'm sure it's probably a config issue with the devstack setup I'm using. Any chance you can tell what I'm missing based on the errors?

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze sure - just finishing up with @hartzell's issue now and I'll look at your log right away.

This commit resolves an issue where the tenant-network api extension
does not exist. The caveat is that the user must either specify no
networks (single network environment) or can only specify UUIDs for
network configurations.
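
The caveat in this commit message can be sketched in Go as follows. This is a minimal sketch under assumptions, not the code in the commit: `errNotFound` stands in for the HTTP 404 seen in the Nebula output, and `tenantNetwork`, `networkSpec`, `lister`, and `resolve` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the 404 returned when the os-tenant-networks
// extension is not available on the cloud.
var errNotFound = errors.New("404: os-tenant-networks not found")

// tenantNetwork is one entry returned by the (hypothetical) listing call.
type tenantNetwork struct{ ID, Name string }

// networkSpec is what the user wrote in the instance's network block.
type networkSpec struct{ UUID, Name string }

// lister is a hypothetical function type that lists the tenant's networks.
type lister func() ([]tenantNetwork, error)

// resolve fills in the missing name or UUID of each spec. When the extension
// is missing, it accepts only an empty list or UUID-only specs.
func resolve(specs []networkSpec, list lister) ([]networkSpec, error) {
	known, err := list()
	if errors.Is(err, errNotFound) {
		for _, s := range specs {
			if s.UUID == "" {
				return nil, fmt.Errorf("network %q needs a UUID: name lookup is unavailable", s.Name)
			}
		}
		return specs, nil
	}
	if err != nil {
		return nil, err
	}
	out := make([]networkSpec, len(specs))
	for i, s := range specs {
		out[i] = s
		for _, k := range known {
			if (s.UUID != "" && s.UUID == k.ID) || (s.UUID == "" && s.Name == k.Name) {
				out[i] = networkSpec{UUID: k.ID, Name: k.Name}
				break
			}
		}
	}
	return out, nil
}

func main() {
	// Simulate a cloud where the extension is disabled.
	disabled := func() ([]tenantNetwork, error) { return nil, errNotFound }
	_, err := resolve([]networkSpec{{Name: "private"}}, disabled)
	fmt.Println("name only:", err)
	specs, err := resolve([]networkSpec{{UUID: "1fea2249-63ad-4a5c-8af0-392a2bff7cb4"}}, disabled)
	fmt.Println("uuid only:", specs, err)
}
```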
@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@hartzell Let me know if that latest commit helps you.

@hartzell
Copy link
Contributor

hartzell commented Apr 1, 2015

@jtopjian, looks good. Thanks!

Changes the test to require a network UUID rather than a name.
@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze Ah! I think two of the failures (the ones related to the floating IP resource) are similar to the discussion on #1342 -- they require a network to be specified.

The failure of TestAccComputeV2Instance_floatingIPAttach is, ironically enough, similar to @hartzell's problem.

Instead of specifying OS_NETWORK_NAME for the test, specify OS_NETWORK_ID with the UUID of the network and let me know if the Instance_floatingIPAttach test now passes. If so, I can work on fixing the other tests.
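
For readers unfamiliar with how acceptance tests depend on environment variables, here is a minimal Go sketch of a pre-check. The helper `missingEnv` is hypothetical, and the list of required variables is an assumption for illustration: only OS_NETWORK_ID comes from this discussion, while OS_AUTH_URL is included as a typical OpenStack credential variable.

```go
package main

import (
	"fmt"
	"os"
)

// missingEnv returns the names in required whose value is unset or empty.
// The lookup function is injected so the check can be exercised without
// touching the real environment.
func missingEnv(required []string, getenv func(string) string) []string {
	var missing []string
	for _, name := range required {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumed list of variables; adjust to whatever the tests actually read.
	required := []string{"OS_AUTH_URL", "OS_NETWORK_ID"}
	if missing := missingEnv(required, os.Getenv); len(missing) > 0 {
		fmt.Println("set these before running the acceptance tests:", missing)
		return
	}
	fmt.Println("environment looks complete")
}
```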

As a side note, I'm curious if you could post:

  • Your instructions on how you set up your devstack environment
  • Whether or not these commands work:
    • nova net-list
    • nova network-list

@phinze
Contributor

phinze commented Apr 1, 2015

Instead of specifying OS_NETWORK_NAME for the test, specify OS_NETWORK_ID with the UUID of the network and let me know if the Instance_floatingIPAttach test now passes.

It does! Though I had to leave OS_NETWORK_NAME set to something or the acceptance test yelled at me that it was required. 😉

I'm running through the rest of the tests with that set to get the full results.

As a side note, I'm curious if you could post (... env, commands)

Sure! I had success getting https://github.com/lorin/devstack-vm set up with the changes from lorin/devstack-vm#6 applied locally. (I also manually bumped the number of vcpus in the Vagrantfile to get a bit more performance out of the VM and get more headroom for QEMU instances.)

Both the commands work - here's what the output looks like:

vagrant@vagrant-ubuntu-trusty-64:~$ nova net-list
+--------------------------------------+---------+------+
| ID                                   | Label   | CIDR |
+--------------------------------------+---------+------+
| 0c340886-87d3-461d-8727-fa0a26946194 | public  | -    |
| 1fea2249-63ad-4a5c-8af0-392a2bff7cb4 | private | -    |
+--------------------------------------+---------+------+
vagrant@vagrant-ubuntu-trusty-64:~$ nova network-list
+--------------------------------------+---------+------+
| ID                                   | Label   | Cidr |
+--------------------------------------+---------+------+
| 0c340886-87d3-461d-8727-fa0a26946194 | public  | -    |
| 1fea2249-63ad-4a5c-8af0-392a2bff7cb4 | private | -    |
+--------------------------------------+---------+------+

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

All great news! And I'm glad the output of the commands worked... I'm curious why the tests did not work with OS_NETWORK_NAME -- I will research that.

It's also odd that you were still required to set OS_NETWORK_NAME. The string no longer exists in the code.

I will work on fixing up the other two acceptance tests. Do let me know if you run into any more errors.

The errors you're seeing about Firewall and such are most likely due to them not being set up in the devstack environment.

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze Also, if you have time, can you do a terraform apply with something similar to the following?

resource "openstack_compute_floatingip_v2" "testfp" {
  pool = "public" # might have to replace based on the output of "nova floating-ip-pool-list"
}

resource "openstack_compute_instance_v2" "test" {
  name = "tf-test"
  image_name = "CirrOS" # replace as needed
  flavor_name = "m1.tiny"  # replace as needed
  security_groups = ["default"] # replace as needed
  floating_ip = "${openstack_compute_floatingip_v2.testfp.address}"

  network {
    name = "private"   # please try only this first
    uuid = "1fea2249-63ad-4a5c-8af0-392a2bff7cb4" # then try only this
  }

}

Then do a terraform show and let me know if both the network UUID and name appear even though you only specified one?

@phinze
Contributor

phinze commented Apr 1, 2015

Okay here's latest test run: https://gist.github.com/phinze/d7e7db4bd8f91709c196

I realized that I hadn't pulled latest earlier, which probably accounts for the OS_NETWORK_NAME network variable issue.

I'll try that config now! 👍

@phinze
Contributor

phinze commented Apr 1, 2015

The errors you're seeing about Firewall and such are most likely due to them not being set up in the devstack environment.

FWIW here's the devstack config I got by default from that project:

[[local|localrc]]
# Default passwords
ADMIN_PASSWORD=password
MYSQL_PASSWORD=password
RABBIT_PASSWORD=password
SERVICE_PASSWORD=password
SERVICE_TOKEN=password


SCREEN_LOGDIR=/opt/stack/logs


HOST_IP=192.168.27.100

#
# Enable Neutron
#
# https://wiki.openstack.org/wiki/NeutronDevstack
disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service neutron

# Enable Swift
enable_service s-proxy
enable_service s-object
enable_service s-container
enable_service s-account


# Disable security groups entirely
Q_USE_SECGROUP=False
LIBVIRT_FIREWALL_DRIVER=nova.virt.firewall.NoopFirewallDriver

disable_service tempest

What environment are you using for testing, @jtopjian?

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

I'm running my tests against a few production clouds I maintain. One is nova-network and one is Neutron, but the Neutron cloud does not have the FWaaS and LBaaS services enabled, either.

Just to clarify with your above config: LIBVIRT_FIREWALL_DRIVER is related to Security Groups and not the Firewall resources (FWaaS service).

@phinze
Contributor

phinze commented Apr 1, 2015

I'm running my tests against a few production clouds I maintain.

You big show off! 😛

Okay I tested your config, and it worked with either the name or the ID set. I did a terraform destroy in between, and both times the instance seems to come up with the IP attached just fine.

[Screenshot: Instances page of the OpenStack dashboard]

@jtopjian
Contributor Author

jtopjian commented Apr 1, 2015

@phinze lol - thanks

OK, definitely great news that you can define networks in instances by either UUID or name and that Terraform obtains the omitted one. This confirms my theory that the os-tenant-networks API extension is enabled by default and must be explicitly disabled. So far I've only seen that happen in Nebula-based clouds (@hartzell's issue).

What's strange is that the acc tests were not working until you used an ID. I can look into that in more detail... but at least it looks like it's local to the acc tests.

I just pushed some changes to the floating IP acc tests. I think some further work needs to be done on the networking_floatingip resource (as mentioned in the comments), but that perhaps can wait? I've been trying to focus on the compute side here.

@phinze
Contributor

phinze commented Apr 1, 2015

I think some further work needs done to the networking_floatingip resource (as mentioned in the comments), but that perhaps can wait?

We release 0.4 tomorrow, and it seems like it'd be a shame if these improvements had to wait until 0.5 - so it seems like a good plan to limit the scope so we can land it sooner.

Given that - how do you feel about the current state of this? Do all OpenStack acceptance tests pass on your side? If so, maybe we merge as is and continue to make improvements from 0.4 -> 0.5.

A change was made to account for clouds with multiple networks.
@jtopjian
Contributor Author

jtopjian commented Apr 2, 2015

I agree it'd be a shame. Thank you for giving this so much attention today -- I really do appreciate it.

I just ran through the Compute acceptance tests and fixed an issue that prevented the Instance test from passing on a Neutron cloud. They all now work for me in two different clouds.

One thing to keep in mind is that only the Compute Instance resource has been modified. Of course, that's the biggest part of this whole provider, but any issues that are apparent in other resources (such as the networking_floatingip timing issue) exist outside of these changes.

At least with these changes, things like networking_floatingip have a good workaround for when mixing with Compute: for example, just use compute_floatingip instead. You've seen yourself that this works with the test tf config I gave :)

I never want to say something is 100% perfect, but I have confidence in these changes. I'm also very interested to see the initial reception of this provider, and if there are any issues, especially if directly related to these changes, I'll want to find a fix.

@phinze
Contributor

phinze commented Apr 2, 2015

Great! This all sounds good - I'll tentatively plan on getting this merged first thing tomorrow.

@jtopjian
Contributor Author

jtopjian commented Apr 2, 2015

Awesome 😄

@hartzell
Contributor

hartzell commented Apr 2, 2015

As an eager consumer: Awesome^2. Love to see it in 0.4.

Thanks to you both for all the work!

@jtopjian
Contributor Author

jtopjian commented Apr 2, 2015

Here's something I threw together that uses Terraform to deploy DevStack and run the Compute acceptance tests:

https://gist.github.com/jtopjian/4ffc82bfcbbcc78d07e4

All passed. :)

If I had more time tonight, I'd create a few variations that test against different versions of OpenStack and maybe run some of the DevStack "exercises".

Also, perhaps a minor note, but I just wanted to clarify: In Neutron environments, when creating a "private" and "public" (floating IP) network, this still counts as only a single-network environment. It's not until the tenant creates more than one "private" network that multi-network issues come in. This escaped me today and I apologize. The good news is that the acceptance tests will run just fine in either type of environment (which I have tested), given an OS_NETWORK_ID environment variable.
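
The single-network versus multi-network distinction above can be expressed as a small Go sketch. It is illustrative only: the `neutronNetwork` type, its `External` flag (standing in for however a cloud marks its floating IP network), and `isMultiNetwork` are hypothetical and not part of the provider.

```go
package main

import "fmt"

// neutronNetwork is a hypothetical, simplified view of a tenant-visible network.
type neutronNetwork struct {
	Name     string
	External bool // true for the "public" network that provides floating IPs
}

// isMultiNetwork reports whether the tenant has more than one private
// (non-external) network, which is when network ordering starts to matter.
func isMultiNetwork(nets []neutronNetwork) bool {
	private := 0
	for _, n := range nets {
		if !n.External {
			private++
		}
	}
	return private > 1
}

func main() {
	single := []neutronNetwork{{Name: "public", External: true}, {Name: "private"}}
	multi := append(single, neutronNetwork{Name: "private2"})
	fmt.Println(isMultiNetwork(single), isMultiNetwork(multi))
}
```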

@phinze
Contributor

phinze commented Apr 2, 2015

Amazing work @jtopjian. Major thanks! 🙇

phinze added a commit that referenced this pull request Apr 2, 2015
provider/openstack Compute Network Refactor
@phinze phinze merged commit e0cdadf into hashicorp:master Apr 2, 2015
@jtopjian
Contributor Author

jtopjian commented Apr 2, 2015

@phinze Yay! Thank you!

@jrperritt
Contributor

Due to the nuances of nova-network and Neutron, this was probably the hardest and most time-consuming aspect of the OpenStack provider to implement. Great work, @jtopjian.

@ghost

ghost commented May 3, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators May 3, 2020