handle_writesame_check: Illegal Data-Out: iov_cnt 2 length: 63 error #476

Open
keith-holder opened this issue Sep 11, 2018 · 5 comments
@keith-holder

keith-holder commented Sep 11, 2018

I have written a tcmu-runner plugin and I occasionally see this error message in the tcmu-runner error log:
handle_writesame_check: ... Illegal Data-Out: iov_cnt 2 length: 63
I think the test in the function handle_writesame_check may be broken:
...
if (cmd->iov_cnt != 1 || cmd->iovec->iov_len != block_size) {

I have added some extra debug output, and although iov_cnt is 2, the iov_len values sum to 512 (63 + 449), even if the split is inefficient. I don't know why the data for the SCSI command has been split like this, but surely the iovec is valid?
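
As a rough sketch of what a more tolerant check might look like, the test could sum iov_len across all segments instead of requiring a single segment. The helper names iov_total_len and writesame_data_out_ok are hypothetical and not part of tcmu-runner, and the example assumes a 512-byte block size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Total number of bytes described by an iovec array. */
size_t iov_total_len(const struct iovec *iov, size_t iov_cnt)
{
        size_t total = 0;

        for (size_t i = 0; i < iov_cnt; i++)
                total += iov[i].iov_len;
        return total;
}

/*
 * Hypothetical relaxed check: accept any split of the Data-Out buffer
 * as long as the segments add up to exactly one block.
 */
bool writesame_data_out_ok(const struct iovec *iov, size_t iov_cnt,
                           size_t block_size)
{
        assert(iov != NULL || iov_cnt == 0);
        return iov_cnt > 0 && iov_total_len(iov, iov_cnt) == block_size;
}
```

With this check, the 63 + 449 split would be accepted for a 512-byte block, while a lone 63-byte segment would still be rejected.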

@mikechristie
Collaborator

I think we just never saw a command like that, thought it was not possible and did not code handle_writesame to handle it.

It seems valid. What LIO fabric driver are you using?

Are you using the passthrough ws code path or the emulated in tcmur_cmd_handler.c:handle_writesame() code path? For the latter it looks like the loop and related code needs to be fixed up:

        write_lbas = length / block_size;
        for (i = 0; i < write_lbas; i++)
                memcpy(write_same->iov_base + i * block_size,
                       cmd->iovec->iov_base, block_size);
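
One possible direction for that fix-up, as a sketch only: gather the block from however many iovec segments it arrived in, then replicate it. The names gather_iovec and fill_write_same are hypothetical, and this is not the actual tcmu-runner patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to `len` bytes from a scattered iovec array into `dst`.
 * Returns the number of bytes actually copied.
 */
size_t gather_iovec(void *dst, size_t len,
                    const struct iovec *iov, size_t iov_cnt)
{
        size_t copied = 0;

        assert(dst != NULL && (iov != NULL || iov_cnt == 0));
        for (size_t i = 0; i < iov_cnt && copied < len; i++) {
                size_t n = iov[i].iov_len;

                if (n > len - copied)
                        n = len - copied;
                if (n == 0)
                        continue;
                memcpy((char *)dst + copied, iov[i].iov_base, n);
                copied += n;
        }
        return copied;
}

/*
 * Fill `out` (write_lbas * block_size bytes) with repeated copies of the
 * single block described by the iovec array. Returns 0 on success, or -1
 * if the iovec does not hold a full block.
 */
int fill_write_same(void *out, size_t write_lbas, size_t block_size,
                    const struct iovec *iov, size_t iov_cnt)
{
        char *p = out;

        if (write_lbas == 0)
                return 0;
        /* Assemble the first block from all segments. */
        if (gather_iovec(p, block_size, iov, iov_cnt) != block_size)
                return -1;
        /* Replicate the assembled block into the remaining LBAs. */
        for (size_t i = 1; i < write_lbas; i++)
                memcpy(p + i * block_size, p, block_size);
        return 0;
}
```

Gathering once and then copying from the assembled block keeps the inner loop a single memcpy per LBA, regardless of how the source data was split.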

@keith-holder
Author

> It seems valid. What LIO fabric driver are you using?

I am using the LIO loopback fabric, with an ext4 file system mounted on the block device with a DB back store.

> Are you using the passthrough ws code path or the emulated in tcmur_cmd_handler.c:handle_writesame() code path? For the latter it looks like the loop and related code needs to be fixed up:

I believe it is coming in via the tcmur_cmd_handler() call to handle_writesame() on a WRITE_SAME SCSI command.

@keith-holder
Author

A possible reason for iov_cnt being 2 is that the I/O transfer crosses a page boundary, and some code that maps kernel virtual addresses to user virtual addresses has decided to split the transfer. But that's a guess. When I dump them out, iov_base + iov_len always seems to coincide with a page boundary.
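
To illustrate the arithmetic behind that guess, here is a small sketch that splits an address range at page boundaries. It is not the kernel or LIO mapping code; the struct segment type and split_at_page_boundaries function are hypothetical, and a 4096-byte page size is assumed in the example that follows:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct segment {
        uintptr_t start;
        size_t len;
};

/*
 * Split [addr, addr + len) at every page boundary. Writes at most `max`
 * segments to `segs` and returns how many were written.
 */
size_t split_at_page_boundaries(uintptr_t addr, size_t len, size_t page_size,
                                struct segment *segs, size_t max)
{
        size_t n = 0;

        assert(page_size != 0);
        while (len > 0 && n < max) {
                /* Bytes left before the next page boundary. */
                size_t room = page_size - (size_t)(addr % page_size);
                size_t chunk = len < room ? len : room;

                segs[n].start = addr;
                segs[n].len = chunk;
                n++;
                addr += chunk;
                len -= chunk;
        }
        return n;
}
```

A 512-byte buffer that starts 63 bytes before a 4096-byte page boundary comes out as two segments of 63 and 449 bytes, and the first segment ends exactly on the boundary.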

@mikechristie
Collaborator

Do you need me to work on a patch or were you going to?

I am busy at work this week. I will look into it next week.

@keith-holder
Author

> Do you need me to work on a patch or were you going to?
>
> I am busy at work this week. I will look into it next week.

If you can provide a patch, that would be welcome. I am not in an immediate hurry, but I will try to work around the issue if I can.
