Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: Handle zero-len packets from ADB to avoid BSOD #602

Merged
merged 1 commit into from
May 24, 2023

Conversation

itsenkov
Copy link
Contributor

This PR addresses BSOD issue while using ADB, but also other with tools as reported in #248 (comment)
The code change is small but diagnosing the issue was a substantial effort.
Here is the analysis:

0. Setup

Hyper-V VM with Debian 11 (unstable) guest
Host: Windows 10 (Version 10.0.19045 Build 19045)
ADB for Android Nougat
Custom ADB device

1. ADB

ADB tool that I’ve used to reproduce the issue was an older version e.g. from Android Nougat
In this version of the tool data is sent in chunks of size 256k.
Dimitar Stankov provided me with custom build of this ADB version with reduced chunk size of 16k,
because I’ve suspected that chunk size goes beyond VboxUSB limits. Chunk size turned out not to be the case eventually.
Dimitar supported me also with enabling ADB trace logging and also added some prints in the custom-built version.
In the logs I’ve noticed a pattern

--- adb starting (pid 61) ---
adb I 04-28 17:56:11    61    61 main.cpp:62] Android Debug Bridge version 1.0.41
adb I 04-28 17:56:11    61    61 main.cpp:62] Version 29.0.6-eng.root.20230428.135514
...
adb D 04-28 17:56:12    61    68 usb_linux.cpp:425] ++ usb_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:307] ++ usb_bulk_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308]   DEBUG: urb length = 24
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308] 
adb D 04-28 17:56:12    61    67 usb_linux.cpp:392] [ urb @0x7fcd140109a8 status = 0, actual = 24 ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:404] [ reap urb - OUT compelete ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:368] [ reap urb - wait ]
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339]   DEBUG: urb actual_length = 24
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339] 
adb D 04-28 17:56:12    61    68 usb_linux.cpp:447] -- usb_write --
adb D 04-28 17:56:12    61    68 usb_linux.cpp:425] ++ usb_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:307] ++ usb_bulk_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308]   DEBUG: urb length = 16384
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308] 
adb D 04-28 17:56:12    61    67 usb_linux.cpp:392] [ urb @0x7fcd140109a8 status = 0, actual = 16384 ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:404] [ reap urb - OUT compelete ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:368] [ reap urb - wait ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:392] [ urb @0x7fcd14010970 status = 0, actual = 24 ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:395] [ reap urb - IN complete ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:463] [ usb read 512 ] = 24, path=/dev/bus/usb/001/002
adb D 04-28 17:56:12    61    67 usb_linux.cpp:478] -- usb_read --
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339]   DEBUG: urb actual_length = 16384
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339] 
adb D 04-28 17:56:12    61    68 usb_linux.cpp:307] ++ usb_bulk_write ++
adb D 04-28 17:56:12    61    67 transport_usb.cpp:45] UsbReadMessage
adb D 04-28 17:56:12    61    67 usb_linux.cpp:456] ++ usb_read ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308]   DEBUG: urb length = 0
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308] 
adb D 04-28 17:56:12    61    67 usb_linux.cpp:461] [ usb read 512 fd = 14], path=/dev/bus/usb/001/002
adb D 04-28 17:56:12    61    67 usb_linux.cpp:347] ++ usb_bulk_read ++
adb D 04-28 17:56:12    61    67 usb_linux.cpp:368] [ reap urb - wait ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:392] [ urb @0x7fcd140109a8 status = 0, actual = 0 ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:404] [ reap urb - OUT compelete ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:368] [ reap urb - wait ]
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339]   DEBUG: urb actual_length = 0
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339] 
adb D 04-28 17:56:12    61    68 usb_linux.cpp:425] ++ usb_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:307] ++ usb_bulk_write ++
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308]   DEBUG: urb length = 24
adb D 04-28 17:56:12    61    68 usb_linux.cpp:308] 
adb D 04-28 17:56:12    61    67 usb_linux.cpp:392] [ urb @0x7fcd140109a8 status = 0, actual = 24 ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:404] [ reap urb - OUT compelete ]
adb D 04-28 17:56:12    61    67 usb_linux.cpp:368] [ reap urb - wait ]
adb D 04-28 17:56:12    61    68 usb_linux.cpp:339]   DEBUG: urb actual_length = 24

So here it’s observable that after every chunk of 16k data, there is a chunk with 0 len of data.
Looking at ADB source code I found the following:
https://android.googlesource.com/platform/system/core/+/refs/heads/nougat-release/adb/usb_linux.cpp

int usb_write(usb_handle *h, const void *_data, int len)
{
    D("++ usb_write ++");
    unsigned char *data = (unsigned char*) _data;
    int n = usb_bulk_write(h, data, len);
    if (n != len) {
        D("ERROR: n = %d, errno = %d (%s)", n, errno, strerror(errno));
        return -1;
    }
    if (h->zero_mask && !(len & h->zero_mask)) {
        // If we need 0-markers and our transfer is an even multiple of the packet size,
        // then send a zero marker.
        return usb_bulk_write(h, _data, 0);
    }
    D("-- usb_write --");
    return 0;
}

Here the interesting part is sending “zero-markers”. It turned out that for older version(s) of adb tool and adb protocol
these zero-markers are required to signify the end of a bulk transfer, so the recipient device to know. In my case
these makers are sent, as I’ve observed in the adb traces. And here is where the problem stars.

2. usbipd-win

In usbipd-win implementation when a zero-length packet is received, a zero length buffer (var buf) is created.
Also later on a valid gcHandle pointer is created and set to urb.buf property. After that this URB is passed to
VboxUSB driver.

 async Task HandleSubmitAsync(UsbIpHeaderBasic basic, UsbIpHeaderCmdSubmit submit, CancellationToken cancellationToken)
    {
        ......

        var urb = new UsbSupUrb()
        {
            ...
            len = submit.transfer_buffer_length,
            ...
        };

	...

        var bytes = new byte[Marshal.SizeOf<UsbSupUrb>()];
        var buf = new byte[urb.len]; // Here an array of 0 elements is created. This is a valid object

        ...

        if (basic.direction == UsbIpDir.USBIP_DIR_OUT)
        {
            await Stream.ReadMessageAsync(buf.AsMemory()[payloadOffset..], cancellationToken); // Here nothing is read because there is sanity for isEmpty
        }

        ...

        else
        {
            // To support UNLINK, we must be able to abort the pipe that is used for this URB.
            // We need the raw USB endpoint number, i.e. including the high bit for input pipes.
            if (!PendingSubmits.TryAdd(basic.seqnum, basic.RawEndpoint()))
            {
                throw new ProtocolViolationException($"duplicate sequence number {basic.seqnum}");
            }
            pending = true;

            // Input or output, exceptions or not, this buffer must be locked until after the ioctl has completed.
            var gcHandle = GCHandle.Alloc(buf, GCHandleType.Pinned);
            try
            {
                urb.buf = gcHandle.AddrOfPinnedObject(); // Here a valid pointer to valid object is created and set. However array length is 0.
                StructToBytes(urb, bytes);
                ioctl = Device.IoControlAsync(SUPUSB_IOCTL.SEND_URB, bytes, bytes); // URB struct with urb.len == 0 and urb.buf == non-null pointer is passed to Windows' ioctl
            }
            catch
            {
                gcHandle.Free();
                throw;
            }
            _ = ioctl.ContinueWith((task) =>
            {
                gcHandle.Free();
            }, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        }

       ...
    }
3. VboxUSB driver 

In VboxUSB during initialization of internal structures a urb.buf (non-null) with length 0 is passed to IoAllocateMdl.
In file src/VBox/HostDrivers/VBoxUSB/win/dev/VBoxUsbRt.cpp

static NTSTATUS vboxUsbRtUrbSend(PVBOXUSBDEV_EXT pDevExt, PIRP pIrp, PUSBSUP_URB pUrbInfo)
{
    NTSTATUS Status = STATUS_SUCCESS;
    PVBOXUSB_URB_CONTEXT pContext = NULL;
    PMDL pMdlBuf = NULL;
    ULONG cbUrb;

    ...

    do
    {
        ...

        pMdlBuf = IoAllocateMdl(pUrbInfo->buf, (ULONG)pUrbInfo->len, FALSE, FALSE, NULL);
        if (!pMdlBuf)
        {
            AssertMsgFailed((__FUNCTION__": IoAllocateMdl failed for buffer (0x%p) length (%d)\n", pUrbInfo->buf, pUrbInfo->len));
            Status = STATUS_INSUFFICIENT_RESOURCES;
            break;
        }

        __try
        {
            MmProbeAndLockPages(pMdlBuf, KernelMode, IoModifyAccess);
        }
        __except(EXCEPTION_EXECUTE_HANDLER)
        {
            Status = GetExceptionCode();
            IoFreeMdl(pMdlBuf);
            pMdlBuf = NULL;
            AssertMsgFailed((__FUNCTION__": Exception Code (0x%x)\n", Status));
            break;
        }
	...

     }

     ...
}

Unfortunately IoAllocateMdl returns non-null pointer in this case, which is kind of undefined behavior.
Everything after this call passes as if it was a valid URB passed to VboxUSB.
OnCompletion follows after some time and then BSOD happens during freeing pointer returned by IoAllocateMdl.
Actually similar issue was reported some time ago here: https://learn.microsoft.com/en-us/answers/questions/1167623/ioallocatemdl-bug
Google search the function IoAllocateMdl also resulted in reports of similar BSOD cases with IoAllocateMdl from year 2005.
At the end fixing this undesirable code path is just by setting null pointer to urb.buf in C# code.
This is the long story short. Acknowledgements to Dimitar Stankov [email protected] for his support during investigation of the issue.

Related issues: #461 #410 #248

Older versions of ADB send zero-makers, but other tools also do so.
Handling these properly.

Related issues: dorssel#461 dorssel#410 dorssel#248

Co-authored-by: Dimitar Stankov <[email protected]>
Signed-off-by: Ivaylo Tsenkov <[email protected]>
@d0n13
Copy link

d0n13 commented May 24, 2023

Great work and much appreciated. Let’s hope we can get this merged in and a test build out. I’m really interested in testing this.
🙏

@codecov
Copy link

codecov bot commented May 24, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.07 ⚠️

Comparison is base (81e89fb) 42.66% compared to head (60d205d) 42.59%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #602      +/-   ##
==========================================
- Coverage   42.66%   42.59%   -0.07%     
==========================================
  Files          31       31              
  Lines        2712     2716       +4     
==========================================
  Hits         1157     1157              
- Misses       1555     1559       +4     
Impacted Files Coverage Δ
Usbipd/AttachedClient.cs 0.00% <0.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@dorssel
Copy link
Owner

dorssel commented May 24, 2023

Absolutely awesome!

Thanks for tracking this one down. We knew it had to do something with the mapping/unmapping of MDL pages. But I never thought of zero-length buffers... I agree with the comments in https://learn.microsoft.com/en-us/answers/questions/1167623/ioallocatemdl-bug: this is handled really awkward by Windows. And nothing in the DDK docs about it either. Well done!

@dorssel dorssel merged commit 2aa1573 into dorssel:master May 24, 2023
@yodamaster
Copy link

yodamaster commented Aug 14, 2023

This PR addresses BSOD issue while using ADB, but also other with tools as reported in #248 (comment) The code change is small but diagnosing the issue was a substantial effort. Here is the analysis:

...

Related issues: #461 #410 #248

well done! thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants