Skip to content

Conversation

@Prashanth684
Copy link
Contributor

For s390x in pXE mode, the network device has the bootindex 1. Add a disk
without specifying it as a primary disk with bootindex.

For s390x in pXE mode, the network device has the bootindex 1. Add a disk
without specifying it as a primary disk with bootindex.
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Prashanth684
To complete the pull request process, please assign lorbuschris
You can assign the PR to them by writing /assign @lorbuschris in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Prashanth684
Copy link
Contributor Author

s390x does not have the boot-once option and so the bootindex is used to determine the device to boot from.

@cgwalters
Copy link
Member

This will (textually) conflict with #1330

OK so this problem presumably came in with #1312 ...it feels to me like it'd be better if we tried to adddress this at the qemu.go level and not testiso.go because (among other reasons) we eventually want kola spawn --iso and kola spawn --iso-install which would run through an install, and those share the metal.go and qemu.go.

@cgwalters
Copy link
Member

I don't understand though how s390x PXE works without this though...how does the firmware know to boot from the disk and not the network?

@Prashanth684
Copy link
Contributor Author

This will (textually) conflict with #1330

OK so this problem presumably came in with #1312 ...it feels to me like it'd be better if we tried to adddress this at the qemu.go level and not testiso.go because (among other reasons) we eventually want kola spawn --iso and kola spawn --iso-install which would run through an install, and those share the metal.go and qemu.go.

right ..i was wondering whether to do it in qemu.go but this is very specific to the pXE scenario and netbooting which is why i thought it should go to testiso.

@Prashanth684
Copy link
Contributor Author

I don't understand though how s390x PXE works without this though...how does the firmware know to boot from the disk and not the network?

Yes. I remember there was some quirks about the bootindex in s390x and zipl updating it. I will have to check with some of the s390x experts and find out.

@cgwalters
Copy link
Member

A different way to do this would be for us to use the QMP protocol to dynamically tell qemu to change the bootorder, triggered by a request from the guest. This closely matches how projects like Foreman do it for real bare metal PXE setups.

We'd need to do some plumbing to add QMP support to qemu.go but I could take a crack at that.

@cgwalters
Copy link
Member

We'd need to do some plumbing to add QMP support to qemu.go but I could take a crack at that.

I started to write that then discovered
https://github.com/digitalocean/go-qemu
and...it looks sane, need to investigate more though. Dumping this code here

diff --git a/mantle/platform/qemu.go b/mantle/platform/qemu.go
index 421923e2..f22540a7 100644
--- a/mantle/platform/qemu.go
+++ b/mantle/platform/qemu.go
@@ -22,6 +22,7 @@ import (
 	"io/ioutil"
 	"math"
 	"math/rand"
+	"net"
 	"os"
 	"path/filepath"
 	"regexp"
@@ -64,11 +65,12 @@ type Disk struct {
 type QemuInstance struct {
 	qemu       exec.Cmd
 	tmpConfig  string
-	swtpmTmpd  string
+	tempdir    string
 	swtpm      exec.Cmd
 	tmpFiles   []string
 	nbdServers []exec.Cmd
 
+	qmpChan     *net.Conn
 	journalPipe *os.File
 }
 
@@ -214,14 +216,13 @@ func (inst *QemuInstance) Destroy() {
 		}
 	}
 	inst.qemu = nil
-	if inst.swtpmTmpd != "" {
-		if inst.swtpm != nil {
-			inst.swtpm.Kill() // Ignore errors
-		}
+	if inst.swtpm != nil {
+		inst.swtpm.Kill() // Ignore errors
 		inst.swtpm = nil
-		// And ensure it's cleaned up
-		if err := os.RemoveAll(inst.swtpmTmpd); err != nil {
-			plog.Errorf("Error removing swtpm dir: %v", err)
+	}
+	if inst.tempdir != "" {
+		if err := os.RemoveAll(inst.tempdir); err != nil {
+			plog.Errorf("Error removing qemu tempdir: %v", err)
 		}
 	}
 	for _, nbdServ := range inst.nbdServers {
@@ -853,6 +854,32 @@ func (builder *QemuBuilder) VirtioChannelRead(name string) (*os.File, error) {
 	return r, nil
 }
 
+// func allocateSocketPair() (*net.Conn, *os.File, error) {
+// 	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	hostfd := os.NewFile(uintptr(fds[0]), "sockpair1")
+// 	defer hostfd.Close()
+
+// 	guestfd := os.NewFile(uintptr(fds[1]), "sockpair2")
+// 	deallocGuestfd := true
+// 	defer func() {
+// 		if deallocGuestfd {
+// 			guestfd.Close()
+// 		}
+// 	}()
+
+// 	hostSock, err := net.FileConn(hostfd)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	deallocGuestfd = false
+// 	return &hostSock, guestfd, nil
+// }
+
 func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	builder.finalize()
 	var err error
@@ -899,6 +926,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	// We never want a popup window
 	argv = append(argv, "-nographic")
 
+	inst.tempdir, err = ioutil.TempDir("", "mantle-qemu-tmp")
+	if err != nil {
+		return nil, err
+	}
+
 	// Handle Ignition
 	if builder.ConfigFile != "" && !builder.configInjected {
 		if builder.supportsFwCfg() {
@@ -925,16 +957,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 
 	// Handle Software TPM
 	if builder.Swtpm && builder.supportsSwtpm() {
-		inst.swtpmTmpd, err = ioutil.TempDir("", "kola-swtpm")
-		if err != nil {
-			return nil, err
-		}
-
-		swtpmSock := filepath.Join(inst.swtpmTmpd, "swtpm-sock")
+		swtpmSock := filepath.Join(inst.tempdir, "swtpm-sock")
 
 		inst.swtpm = exec.Command("swtpm", "socket", "--tpm2",
 			"--ctrl", fmt.Sprintf("type=unixio,path=%s", swtpmSock),
-			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.swtpmTmpd))
+			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.tempdir))
 		cmd := inst.swtpm.(*exec.ExecCmd)
 		// For now silence the swtpm stderr as it prints errors when
 		// disconnected, but that's normal.
@@ -956,9 +983,13 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		case "ppc64le":
 			argv = append(argv, "-device", "tpm-spapr,tpmdev=tpm0")
 		}
-
 	}
 
+	qmpPath := filepath.Join(inst.tempdir, "qmp.sock")
+	qmpId := "mantle-qmp"
+	builder.Append("-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait", qmpId, qmpPath))
+	builder.Append("-mon", fmt.Sprintf("chardev=%s,mode=control", qmpId))
+
 	// Set up the virtio channel to get Ignition failures by default
 	journalPipeR, err := builder.VirtioChannelRead("com.coreos.ignition.journal")
 	inst.journalPipe = journalPipeR
@@ -1000,6 +1031,25 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		return nil, err
 	}
 
+	// retry polling its state
+	if err := util.Retry(10, 1*time.Second, func() error {
+		qmpChan, err := net.Dial("unix", qmpPath)
+		if err != nil {
+			return err
+		}
+		inst.qmpChan = &qmpChan
+		return nil
+	}); err != nil {
+		return nil, errors.Wrapf(err, "Connecting to qemu monitor")
+	}
+
+	go func() {
+		r := bufio.NewReader(*inst.qmpChan)
+		loop {
+			
+		}
+	}()
+
 	return &inst, nil
 }
 

@cgwalters
Copy link
Member

(I really wanted to use a socketpair() but for some reason qemu was saying Permission denied)

@Prashanth684
Copy link
Contributor Author

Prashanth684 commented Apr 13, 2020

We'd need to do some plumbing to add QMP support to qemu.go but I could take a crack at that.

I started to write that then discovered
https://github.com/digitalocean/go-qemu
and...it looks sane, need to investigate more though. Dumping this code here

diff --git a/mantle/platform/qemu.go b/mantle/platform/qemu.go
index 421923e2..f22540a7 100644
--- a/mantle/platform/qemu.go
+++ b/mantle/platform/qemu.go
@@ -22,6 +22,7 @@ import (
 	"io/ioutil"
 	"math"
 	"math/rand"
+	"net"
 	"os"
 	"path/filepath"
 	"regexp"
@@ -64,11 +65,12 @@ type Disk struct {
 type QemuInstance struct {
 	qemu       exec.Cmd
 	tmpConfig  string
-	swtpmTmpd  string
+	tempdir    string
 	swtpm      exec.Cmd
 	tmpFiles   []string
 	nbdServers []exec.Cmd
 
+	qmpChan     *net.Conn
 	journalPipe *os.File
 }
 
@@ -214,14 +216,13 @@ func (inst *QemuInstance) Destroy() {
 		}
 	}
 	inst.qemu = nil
-	if inst.swtpmTmpd != "" {
-		if inst.swtpm != nil {
-			inst.swtpm.Kill() // Ignore errors
-		}
+	if inst.swtpm != nil {
+		inst.swtpm.Kill() // Ignore errors
 		inst.swtpm = nil
-		// And ensure it's cleaned up
-		if err := os.RemoveAll(inst.swtpmTmpd); err != nil {
-			plog.Errorf("Error removing swtpm dir: %v", err)
+	}
+	if inst.tempdir != "" {
+		if err := os.RemoveAll(inst.tempdir); err != nil {
+			plog.Errorf("Error removing qemu tempdir: %v", err)
 		}
 	}
 	for _, nbdServ := range inst.nbdServers {
@@ -853,6 +854,32 @@ func (builder *QemuBuilder) VirtioChannelRead(name string) (*os.File, error) {
 	return r, nil
 }
 
+// func allocateSocketPair() (*net.Conn, *os.File, error) {
+// 	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	hostfd := os.NewFile(uintptr(fds[0]), "sockpair1")
+// 	defer hostfd.Close()
+
+// 	guestfd := os.NewFile(uintptr(fds[1]), "sockpair2")
+// 	deallocGuestfd := true
+// 	defer func() {
+// 		if deallocGuestfd {
+// 			guestfd.Close()
+// 		}
+// 	}()
+
+// 	hostSock, err := net.FileConn(hostfd)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	deallocGuestfd = false
+// 	return &hostSock, guestfd, nil
+// }
+
 func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	builder.finalize()
 	var err error
@@ -899,6 +926,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	// We never want a popup window
 	argv = append(argv, "-nographic")
 
+	inst.tempdir, err = ioutil.TempDir("", "mantle-qemu-tmp")
+	if err != nil {
+		return nil, err
+	}
+
 	// Handle Ignition
 	if builder.ConfigFile != "" && !builder.configInjected {
 		if builder.supportsFwCfg() {
@@ -925,16 +957,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 
 	// Handle Software TPM
 	if builder.Swtpm && builder.supportsSwtpm() {
-		inst.swtpmTmpd, err = ioutil.TempDir("", "kola-swtpm")
-		if err != nil {
-			return nil, err
-		}
-
-		swtpmSock := filepath.Join(inst.swtpmTmpd, "swtpm-sock")
+		swtpmSock := filepath.Join(inst.tempdir, "swtpm-sock")
 
 		inst.swtpm = exec.Command("swtpm", "socket", "--tpm2",
 			"--ctrl", fmt.Sprintf("type=unixio,path=%s", swtpmSock),
-			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.swtpmTmpd))
+			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.tempdir))
 		cmd := inst.swtpm.(*exec.ExecCmd)
 		// For now silence the swtpm stderr as it prints errors when
 		// disconnected, but that's normal.
@@ -956,9 +983,13 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		case "ppc64le":
 			argv = append(argv, "-device", "tpm-spapr,tpmdev=tpm0")
 		}
-
 	}
 
+	qmpPath := filepath.Join(inst.tempdir, "qmp.sock")
+	qmpId := "mantle-qmp"
+	builder.Append("-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait", qmpId, qmpPath))
+	builder.Append("-mon", fmt.Sprintf("chardev=%s,mode=control", qmpId))
+
 	// Set up the virtio channel to get Ignition failures by default
 	journalPipeR, err := builder.VirtioChannelRead("com.coreos.ignition.journal")
 	inst.journalPipe = journalPipeR
@@ -1000,6 +1031,25 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		return nil, err
 	}
 
+	// retry polling its state
+	if err := util.Retry(10, 1*time.Second, func() error {
+		qmpChan, err := net.Dial("unix", qmpPath)
+		if err != nil {
+			return err
+		}
+		inst.qmpChan = &qmpChan
+		return nil
+	}); err != nil {
+		return nil, errors.Wrapf(err, "Connecting to qemu monitor")
+	}
+
+	go func() {
+		r := bufio.NewReader(*inst.qmpChan)
+		loop {
+			
+		}
+	}()
+
 	return &inst, nil
 }
 

I did try with the digital ocean library - one issue was - where do we call the connect to the socket and send commands? I see the qmp.sock created only after the VM gets started. The place above where you try to connect to the socket , I see that it has not been created yet.

Also, just for my understanding - how do we want to use this? send qmp commands to control the device to boot from and leave the bootindex as is ?

@Prashanth684
Copy link
Contributor Author

We'd need to do some plumbing to add QMP support to qemu.go but I could take a crack at that.

I started to write that then discovered
https://github.com/digitalocean/go-qemu
and...it looks sane, need to investigate more though. Dumping this code here

diff --git a/mantle/platform/qemu.go b/mantle/platform/qemu.go
index 421923e2..f22540a7 100644
--- a/mantle/platform/qemu.go
+++ b/mantle/platform/qemu.go
@@ -22,6 +22,7 @@ import (
 	"io/ioutil"
 	"math"
 	"math/rand"
+	"net"
 	"os"
 	"path/filepath"
 	"regexp"
@@ -64,11 +65,12 @@ type Disk struct {
 type QemuInstance struct {
 	qemu       exec.Cmd
 	tmpConfig  string
-	swtpmTmpd  string
+	tempdir    string
 	swtpm      exec.Cmd
 	tmpFiles   []string
 	nbdServers []exec.Cmd
 
+	qmpChan     *net.Conn
 	journalPipe *os.File
 }
 
@@ -214,14 +216,13 @@ func (inst *QemuInstance) Destroy() {
 		}
 	}
 	inst.qemu = nil
-	if inst.swtpmTmpd != "" {
-		if inst.swtpm != nil {
-			inst.swtpm.Kill() // Ignore errors
-		}
+	if inst.swtpm != nil {
+		inst.swtpm.Kill() // Ignore errors
 		inst.swtpm = nil
-		// And ensure it's cleaned up
-		if err := os.RemoveAll(inst.swtpmTmpd); err != nil {
-			plog.Errorf("Error removing swtpm dir: %v", err)
+	}
+	if inst.tempdir != "" {
+		if err := os.RemoveAll(inst.tempdir); err != nil {
+			plog.Errorf("Error removing qemu tempdir: %v", err)
 		}
 	}
 	for _, nbdServ := range inst.nbdServers {
@@ -853,6 +854,32 @@ func (builder *QemuBuilder) VirtioChannelRead(name string) (*os.File, error) {
 	return r, nil
 }
 
+// func allocateSocketPair() (*net.Conn, *os.File, error) {
+// 	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	hostfd := os.NewFile(uintptr(fds[0]), "sockpair1")
+// 	defer hostfd.Close()
+
+// 	guestfd := os.NewFile(uintptr(fds[1]), "sockpair2")
+// 	deallocGuestfd := true
+// 	defer func() {
+// 		if deallocGuestfd {
+// 			guestfd.Close()
+// 		}
+// 	}()
+
+// 	hostSock, err := net.FileConn(hostfd)
+// 	if err != nil {
+// 		return nil, nil, err
+// 	}
+
+// 	deallocGuestfd = false
+// 	return &hostSock, guestfd, nil
+// }
+
 func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	builder.finalize()
 	var err error
@@ -899,6 +926,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 	// We never want a popup window
 	argv = append(argv, "-nographic")
 
+	inst.tempdir, err = ioutil.TempDir("", "mantle-qemu-tmp")
+	if err != nil {
+		return nil, err
+	}
+
 	// Handle Ignition
 	if builder.ConfigFile != "" && !builder.configInjected {
 		if builder.supportsFwCfg() {
@@ -925,16 +957,11 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 
 	// Handle Software TPM
 	if builder.Swtpm && builder.supportsSwtpm() {
-		inst.swtpmTmpd, err = ioutil.TempDir("", "kola-swtpm")
-		if err != nil {
-			return nil, err
-		}
-
-		swtpmSock := filepath.Join(inst.swtpmTmpd, "swtpm-sock")
+		swtpmSock := filepath.Join(inst.tempdir, "swtpm-sock")
 
 		inst.swtpm = exec.Command("swtpm", "socket", "--tpm2",
 			"--ctrl", fmt.Sprintf("type=unixio,path=%s", swtpmSock),
-			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.swtpmTmpd))
+			"--terminate", "--tpmstate", fmt.Sprintf("dir=%s", inst.tempdir))
 		cmd := inst.swtpm.(*exec.ExecCmd)
 		// For now silence the swtpm stderr as it prints errors when
 		// disconnected, but that's normal.
@@ -956,9 +983,13 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		case "ppc64le":
 			argv = append(argv, "-device", "tpm-spapr,tpmdev=tpm0")
 		}
-
 	}
 
+	qmpPath := filepath.Join(inst.tempdir, "qmp.sock")
+	qmpId := "mantle-qmp"
+	builder.Append("-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait", qmpId, qmpPath))
+	builder.Append("-mon", fmt.Sprintf("chardev=%s,mode=control", qmpId))
+
 	// Set up the virtio channel to get Ignition failures by default
 	journalPipeR, err := builder.VirtioChannelRead("com.coreos.ignition.journal")
 	inst.journalPipe = journalPipeR
@@ -1000,6 +1031,25 @@ func (builder *QemuBuilder) Exec() (*QemuInstance, error) {
 		return nil, err
 	}
 
+	// retry polling its state
+	if err := util.Retry(10, 1*time.Second, func() error {
+		qmpChan, err := net.Dial("unix", qmpPath)
+		if err != nil {
+			return err
+		}
+		inst.qmpChan = &qmpChan
+		return nil
+	}); err != nil {
+		return nil, errors.Wrapf(err, "Connecting to qemu monitor")
+	}
+
+	go func() {
+		r := bufio.NewReader(*inst.qmpChan)
+		loop {
+			
+		}
+	}()
+
 	return &inst, nil
 }
 

I did try with the digital ocean library - one issue was - where do we call the connect to the socket and send commands? I see the qmp.sock created only after the VM gets started. The place above where you try to connect to the socket , I see that it has not been created yet.

Also, just for my understanding - how do we want to use this? send qmp commands to control the device to boot from and leave the bootindex as is ?

Ah..never mind..i missed the retries...After adding retries it works. Let me poke around the commands a little bit Not very familiar with QMP.

@cgwalters
Copy link
Member

.After adding retries it works. Let me poke around the commands a little bit Not very familiar with QMP.

Oh wow, I hadn't meant for you to take on using QMP in cosa but that's really awesome that you are!

To be clear...I am not opposed to merging this PR as is either. But I do think we can do this in a cleaner way (QMP would be cleanest but failing that, it could work out OK to push some of the "here's a CDROM and disk, set things up to boot" into qemu.go too.)

@cgwalters
Copy link
Member

OK #1330 merged - feel free to rebase this one and we can do QMP as a followup if you prefer.

@openshift-ci-robot
Copy link

@Prashanth684: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Prashanth684
Copy link
Contributor Author

closed in favor of #1348

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants