Skip to content

Conversation

@derekhiggins
Copy link
Collaborator

No description provided.

Copy link

@etingof etingof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a handful of nits, some may not be relevant to this change because it moves code from one place to the other.


# Workaround so that the dracut network module does dhcp on eth0 & eth1
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
pushd $IRONIC_DATA_DIR/html/images
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could $IRONIC_DATA_DIR/html/images directory be missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its created above in get_images.sh

# Workaround so that the dracut network module does dhcp on eth0 & eth1
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
pushd $IRONIC_DATA_DIR/html/images
qemu-img convert redhat-coreos-maipo-47.284-openstack.qcow2 redhat-coreos-maipo-47.284-openstack.raw
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: will qemu-img fail if redhat-coreos-maipo-47.284-openstack.raw accidentally exists from some unsuccessful previous run?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, its ok if the file already exists

mkdir -p /tmp/mnt
sudo kpartx -a /dev/$LOOPBACK
sudo mount /dev/mapper/${LOOPBACK}p1 /tmp/mnt
sudo sed -i -e 's/ip=eth0:dhcp/ip=eth0:dhcp ip=eth1:dhcp/g' /tmp/mnt/grub2/grub.cfg
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it possible that ip=eth1:dhcp is already present in the config? Or ip=eth0:dhcp is absent?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safest thing would probably be to use augeas cli for this purpose

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make this a little more robust in a later patch, I'm trying as much as possible to just move around the existing code to keep the patch as simple as possible.

#fi

# Workaround so that the dracut network module does dhcp on eth0 & eth1
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it make sense to move redhat-coreos-maipo-47.284 name to a variable to ease further maintenance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I'll add another patch to do this, the current patch is just moving code around



for name in ironic ironic-inspector ; do
sudo podman ps | grep "$name$" && sudo podman kill $name
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would grep -w harden the matching a bit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do, thanks

README.md Outdated
- `./04_build_ocp_installer.sh`
- `./04_setup_ironic.sh`

This we setup Ironic on the host server and download the resources it requires
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "will" perhaps

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

#fi

# Workaround so that the dracut network module does dhcp on eth0 & eth1
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
if [ ! -e "${IRONIC_DATA_DIR}/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2" ] ; then


# Workaround so that the dracut network module does dhcp on eth0 & eth1
if [ ! -e $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 ] ; then
pushd $IRONIC_DATA_DIR/html/images
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pushd $IRONIC_DATA_DIR/html/images
pushd "${IRONIC_DATA_DIR}/html/images"

mkdir -p /tmp/mnt
sudo kpartx -a /dev/$LOOPBACK
sudo mount /dev/mapper/${LOOPBACK}p1 /tmp/mnt
sudo sed -i -e 's/ip=eth0:dhcp/ip=eth0:dhcp ip=eth1:dhcp/g' /tmp/mnt/grub2/grub.cfg
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safest thing would probably be to use augeas cli for this purpose

done

# Start Ironic and inspector
sudo podman run -d --net host --privileged --name ironic -v $IRONIC_DATA_DIR/dnsmasq.conf:/etc/dnsmasq.conf -v $IRONIC_DATA_DIR/html/images:/var/www/html/images -v $IRONIC_DATA_DIR/html/dualboot.ipxe:/var/www/html/dualboot.ipxe -v $IRONIC_DATA_DIR/html/inspector.ipxe:/var/www/html/inspector.ipxe ${IRONIC_IMAGE}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sudo podman run -d --net host --privileged --name ironic -v $IRONIC_DATA_DIR/dnsmasq.conf:/etc/dnsmasq.conf -v $IRONIC_DATA_DIR/html/images:/var/www/html/images -v $IRONIC_DATA_DIR/html/dualboot.ipxe:/var/www/html/dualboot.ipxe -v $IRONIC_DATA_DIR/html/inspector.ipxe:/var/www/html/inspector.ipxe ${IRONIC_IMAGE}
sudo podman run -d --net host --privileged --name ironic \
-v "${IRONIC_DATA_DIR}/dnsmasq.conf:/etc/dnsmasq.conf" \
-v "${IRONIC_DATA_DIR}/html/images:/var/www/html/images" \
-v "${IRONIC_DATA_DIR}/html/dualboot.ipxe:/var/www/html/dualboot.ipxe" \
-v "${IRONIC_DATA_DIR}/html/inspector.ipxe:/var/www/html/inspector.ipxe" \
"$IRONIC_IMAGE"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd use the bridge cni plugin to run it with an interface on the brovc bridge so it doesn't need net host. That would probably even let it run non-privileged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the bridge cni plugin, is it something we could swap out later?


# FIXME(shardy) we should parameterize the image
openstack baremetal node set $node --instance-info image_source=http://172.22.0.1/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 --instance-info image_checksum=$(md5sum images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 | awk '{print $1}') --instance-info root_gb=25 --property root_device="{\"name\": \"$ROOT_DISK\"}"
openstack baremetal node set $node --instance-info image_source=http://172.22.0.1/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 --instance-info image_checksum=$(md5sum $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 | awk '{print $1}') --instance-info root_gb=25 --property root_device="{\"name\": \"$ROOT_DISK\"}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
openstack baremetal node set $node --instance-info image_source=http://172.22.0.1/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 --instance-info image_checksum=$(md5sum $IRONIC_DATA_DIR/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 | awk '{print $1}') --instance-info root_gb=25 --property root_device="{\"name\": \"$ROOT_DISK\"}"
openstack baremetal node set $node \
--instance-info image_source=http://172.22.0.1/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2 \
--instance-info image_checksum=$(md5sum "${IRONIC_DATA_DIR}/html/images/redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2" | awk '{print $1}') \
--instance-info root_gb=25 --property root_device="{\"name\": \"$ROOT_DISK\"}"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna leave this as is for the moment as I havn't touched the file, just renamed it

common.sh Outdated
# Ironic vars
export IRONIC_IMAGE=${IRONIC_IMAGE:-"quay.io/metalkube/metalkube-ironic"}
export IRONIC_INSPECTOR_IMAGE=${IRONIC_INSPECTOR_IMAGE:-"quay.io/metalkube/metalkube-ironic-inspector"}
export IRONIC_DATA_DIR=$WORKING_DIR/ironic
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export IRONIC_DATA_DIR=$WORKING_DIR/ironic
export IRONIC_DATA_DIR="${WORKING_DIR}/ironic"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

get_images.sh Outdated

mkdir -p images
pushd images
mkdir -p $IRONIC_DATA_DIR/html/images
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mkdir -p $IRONIC_DATA_DIR/html/images
mkdir -p "${IRONIC_DATA_DIR}/html/images"

get_images.sh Outdated
mkdir -p images
pushd images
mkdir -p $IRONIC_DATA_DIR/html/images
pushd $IRONIC_DATA_DIR/html/images
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pushd $IRONIC_DATA_DIR/html/images
pushd "${IRONIC_DATA_DIR}/html/images"

@derekhiggins derekhiggins force-pushed the move-ironic branch 4 times, most recently from 1ad9162 to d3f1048 Compare February 19, 2019 12:56
@derekhiggins derekhiggins changed the title WIP: Move ironic to the host Move ironic to the virt host Feb 19, 2019
ocp_run:
./05_deploy_bootstrap_vm.sh
./06_deploy_masters.sh
./06_deploy_bootstrap_vm.sh
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either here or in the default target we need to call 04_setup_ironic.sh?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opps, will put it into default


mkdir -p images
pushd images
mkdir -p "$IRONIC_DATA_DIR/html/images"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to chown my $WORKING_DIR for this to work, but that could be because I've still got it pointed to /home/stack which was world readable but not writable by my local shardy user..

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing which would be nice is to move any existing images directory to $IRONIC_DATA_DIR, then any existing users can avoid downloading all-the-things again? :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@hardys
Copy link

hardys commented Feb 19, 2019

This looks really close modulo nits already mentioned, but is anyone else having issues accessing the new image location via http?

[shardy@tripleodev2 html]$ wget http://172.22.0.1/
--2019-02-19 15:50:08--  http://172.22.0.1/
Connecting to 172.22.0.1:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-02-19 15:50:08 ERROR 403: Forbidden.

[shardy@tripleodev2 html]$ wget http://172.22.0.1/images
--2019-02-19 15:50:09--  http://172.22.0.1/images
Connecting to 172.22.0.1:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-02-19 15:50:09 ERROR 404: Not Found.

This causes the ironic deploy to fail, but may be specific to my env (permissions look OK but I am using a non-default WORKING_DIR pointing at /home/stack)

@derekhiggins
Copy link
Collaborator Author

This causes the ironic deploy to fail, but may be specific to my env (permissions look OK but I am using a non-default WORKING_DIR pointing at /home/stack)

hmm, can you check if the images directory got mounted ok?

# podman exec -it ironic ls /var/www/html/images/
ironic-python-agent.initramfs  ironic-python-agent.kernel  redhat-coreos-maipo-47.284-dualdhcp.qcow2  redhat-coreos-maipo-47.284-openstack.qcow2  redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2

Also check the logs for the podman run command to make sure the correct directories were selected for the volumes.

@derekhiggins derekhiggins force-pushed the move-ironic branch 2 times, most recently from eed56bc to b5ccabf Compare February 20, 2019 09:47
@hardys
Copy link

hardys commented Feb 20, 2019

Looks good, sorry I think I caused another conflict though by merging #85 (rebased cleanly for me locally though).

One question - where should we wire in cleanup for the ironic containers?

I guess we could create a new script and ensure it's called on make clean, or add it to the existing libvirt_cleanup (renamed to host_cleanup.sh perhaps?)

@derekhiggins
Copy link
Collaborator Author

Looks good, sorry I think I caused another conflict though by merging #85 (rebased cleanly for me locally though).

No prob, I'll rebase now

One question - where should we wire in cleanup for the ironic containers?

I guess we could create a new script and ensure it's called on make clean, or add it to the existing libvirt_cleanup (renamed to host_cleanup.sh perhaps?)

I'll add it to libvirt_cleanup and rename

export OS_URL=http://localhost:6385/

wait_for_json ironic \
"${OS_URL}/v1/nodes" \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not specific to this patch but I noticed the wait_for_ironic thing doesn't work.

I noticed this because the default makefile target is still missing ironic, so the containers weren't ruinning:

[shardy@tripleodev2 dev-scripts]$ curl -g -X GET http://localhost:6385//v1/nodes '-H Accept: application/json -H Content-Type: application/json -H User-Agent: wait-for-json -H X-Auth-Token: fake-token'
curl: (7) Failed connect to localhost:6385; Connection refused
[shardy@tripleodev2 dev-scripts]$ curl -g -X GET http://localhost:6385//v1/nodes '-H Accept: application/json -H Content-Type: application/json -H User-Agent: wait-for-json -H X-Auth-Token: fake-token' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed connect to localhost:6385; Connection refused
[shardy@tripleodev2 dev-scripts]$ echo $?
0
Waiting for ironic to respond++ date +%s
+ start_time=1550658918
+ curl -g -X GET http://localhost:6385//v1/nodes '-H Accept: application/json -H Content-Type: application/json -H User-Agent: wait-for-json -H X-Auth-Token: fake-token'
+ jq .
+ echo ' Success!'
 Success!
+ return 0

Copy link
Collaborator Author

@derekhiggins derekhiggins Feb 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not specific to this patch but I noticed the wait_for_ironic thing doesn't work.

Ya, I also noticed this wasn't working, I havn't look into why,

I noticed this because the default makefile target is still missing ironic, so the containers weren't ruinning:

Opps I thought I added it, will fix now.

Just a placeholder script for the moment, so the next
patches are easier to follow.
@hardys
Copy link

hardys commented Feb 20, 2019

Also check the logs for the podman run command to make sure the correct directories were selected for the volumes.

Yeah this looks Ok so not yet sure why it's 404ing:

+ openstack baremetal node deploy --config-drive configdrive openshift-master-0
Failed to validate deploy or power info for node b683113a-7ce1-41d6-88c1-dd626de1c124. Error: Validation of image href http://172.22.0.1/images/redhat-coreos-maipo-47.284-dualdhcp.qcow2 failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request. (HTTP 400)

[shardy@tripleodev2 dev-scripts]$ sudo podman exec -it ironic ls /var/www/html/images
ironic-python-agent.initramfs  redhat-coreos-maipo-47.284-dualdhcp.qcow2   redhat-coreos-maipo-47.284-openstack_dualdhcp.qcow2
ironic-python-agent.kernel     redhat-coreos-maipo-47.284-openstack.qcow2

[shardy@tripleodev2 dev-scripts]$ curl http://172.22.0.1/images/redhat-coreos-maipo-47.284-dualdhcp.qcow2
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /images/redhat-coreos-maipo-47.284-dualdhcp.qcow2 was not found on this server.</p>
</body></html>

Move the ironic and inspector containers from the bootstrap node
to the virt host. This will reduce delays between booting the
bootstrap node and having it available to do work.
These shouldn't have been hardcoded in the first place.
We don't need this at the moment as ironic is now on the host.
We will need it soon for ironic on the openshift masters but one
rule will do.
Add a hook to build the ironic containers while they
are being beveloped locally. Also move some vars common.sh
so they can be set in config_$USER.sh if needed.
@hardys
Copy link

hardys commented Feb 20, 2019

Edit the 404 was because I had httpd running on the host, and the container starts even though there's a port conflict. I can push a follow-up to check for that.

@hardys
Copy link

hardys commented Feb 20, 2019

I'm still having issues in that http://172.22.0.1/dualboot.ipxe is accessible via the host, but the openshift_master_* nodes can't reach it on boot. It seems to be using the right device, getting a DHCP lease and the next-server, but then failing with Connnection Timed out attempting to download the ipxe config.

Anyone else seeing similar? Possibly something specific to my env as this isn't a freshly installed box.

@derekhiggins
Copy link
Collaborator Author

I'm still having issues in that http://172.22.0.1/dualboot.ipxe is accessible via the host, but the openshift_master_* nodes can't reach it on boot. It seems to be using the right device, getting a DHCP lease and the next-server, but then failing with Connnection Timed out attempting to download the ipxe config.

Anyone else seeing similar? Possibly something specific to my env as this isn't a freshly installed box.

I havn't seen this but 2 things worth checking that spring to mind

  1. Make sure you have only one interface with 172.22.0.1 , (also check the bootstrap node)
  2. If you have previously added a route on your host to 172.22.0.1 , you should probably delete it (I didn't have one set so mightn't have seen the potential issue)

@hardys
Copy link

hardys commented Feb 20, 2019

2. If you have previously added a route on your host to 172.22.0.1 , you should probably delete it (I didn't have one set so mightn't have seen the potential issue)

Yeah thanks this was it, I had a stale route sending traffic via virbr0 (doh!), everything now works fine for me, so I'm good to merge if nobody else has issues :)

@hardys hardys self-requested a review February 20, 2019 16:37
@hardys hardys removed their assignment Feb 20, 2019
@hardys
Copy link

hardys commented Feb 20, 2019

Ok for the benefit of anyone else who may hit similar issues, I had two problems when testing this which were related to my environment - httpd was running on the host which slilently conflicts with the webserver in the ironic container, and there was a stale route for 172.22.0.0/24 which needed to be removed.

Other than that everything seems to work well so lets merge and iterate on remaining issues.

@hardys hardys merged commit 6613c13 into openshift-metal3:master Feb 20, 2019
@hardys hardys mentioned this pull request Feb 20, 2019
@derekhiggins derekhiggins deleted the move-ironic branch April 24, 2019 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants