usb: host: fix HS device enumeration and interface probing#106802
usb: host: fix HS device enumeration and interface probing#106802Girinandha-M wants to merge 2 commits intozephyrproject-rtos:mainfrom
Conversation
|
@josuah @carlescufi @MarkWangChinese ping for review. |
|
Note: The bMaxPacketSize0 fix (8b0bba9) was identified by @santhosh-c-c in the CDC ECM PR #99097 as well. |
7123840 to
fc15953
Compare
| * HS devices always have mps0 of 64 (USB 2.0 section 5.5.3). | ||
| * For FS, start with 8 until the device descriptor is read. | ||
| */ | ||
| udev->dev_desc.bMaxPacketSize0 = 8; | ||
| if (udev->speed == USB_SPEED_SPEED_HS) { | ||
| udev->dev_desc.bMaxPacketSize0 = 64; | ||
| } else { | ||
| udev->dev_desc.bMaxPacketSize0 = 8; | ||
| } | ||
|
|
There was a problem hiding this comment.
No, that is absolute pointless to set it conditional here.
There was a problem hiding this comment.
Tested three configurations on two HS devices:
Test 1 - MPS0=8 (current code):
ECM dongle: First read FAILED err=-5
Modem: First read FAILED err=-116 (timeout)
Test 2 - MPS0=64 unconditionally:
ECM dongle: OK, enumerated, bNumInterfaces=1
Modem: OK, enumerated, bNumInterfaces=6
Test 3 - MPS0 conditional on speed (this commit):
ECM dongle: OK, enumerated, bNumInterfaces=1
Modem: OK, enumerated, bNumInterfaces=6
The EHCI controller configures the EP0 pipe width from
bMaxPacketSize0 before sending the SETUP token. With
MPS0=8, the hardware rejects HS packets because HS EP0
always uses 64-byte packets on the wire, regardless of
the transfer length requested.
Without this fix, no HS device can enumerate on the
Zephyr USB host stack.
If you prefer setting MPS0=64 unconditionally instead
of the speed conditional, I can make that change.
I do not have a FS device to verify whether MPS0=64
causes issues for FS enumeration.
There was a problem hiding this comment.
I cannot foresee any problem with setting bMaxPacketSize0 to 64 here. We are requesting just 8 bytes so valid transfer will always be just one packet regardless of actual bMaxPacketSize0.
| err = usbh_device_set_address(udev, new_addr); | ||
| if (err) { | ||
| goto error; | ||
| } | ||
|
|
||
| err = alloc_device_address(udev, &new_addr); | ||
| LOG_INF("New device with address %u state %u", udev->addr, udev->state); | ||
|
|
||
| err = usbh_req_desc_dev(udev, sizeof(udev->dev_desc), &udev->dev_desc); |
There was a problem hiding this comment.
The current sequence reads the full 18-byte device descriptor
at address 0 before assigning a device address. This causes
enumeration failure on the Quectel EG916Q-GL which returns
bNumConfigurations=0 on the second GET_DESCRIPTOR at address 0.
What does it return on third and fourth GET_DESCRIPTOR request?
reorder to assign the address after the initial 8-byte read
and before the full descriptor read.
There will be another non-compliant USB device that fails with new request order tomorrow. What will we change it to then? This absolutely does not seem like the right solution to me.
Both orderings are
permitted by the USB 2.0 spec, but the new order is strictly
more compatible and matches the enumeration sequence used by
Linux (hub_port_init) and xHCI host stacks.
More compatible to what?
There was a problem hiding this comment.
Tested repeated GET_DESCRIPTOR at address 0 as requested:
Quectel EG916Q-GL (modem):
GET_DESC #1 addr 0: err=0 bNumConf=0
GET_DESC #2 addr 0: err=0 bNumConf=0
GET_DESC #3 addr 0: err=0 bNumConf=0
GET_DESC #4 addr 0: err=0 bNumConf=0
ECM dongle:
GET_DESC #1 addr 0: err=0 bNumConf=2
GET_DESC #2 addr 0: err=0 bNumConf=2
GET_DESC #3 addr 0: err=0 bNumConf=2
GET_DESC #4 addr 0: err=0 bNumConf=2
The modem transfer completes with no error but consistently
returns bNumConf=0 at address 0. Retry does not help.
Verified the reorder does not break the ECM dongle:
ECM dongle (with reorder):
New device with address 1 state 2
Configuration 1 bNumInterfaces 1
Quectel modem (with reorder):
New device with address 1 state 2
Configuration 1 bNumInterfaces 6
Both devices enumerate correctly with the reorder.
SET_ADDRESS after the initial 8-byte read is a valid
state transition.
There was a problem hiding this comment.
Both orderings are
permitted by the USB 2.0 spec, but the new order is strictly
more compatible and matches the enumeration sequence used by
Linux (hub_port_init) and xHCI host stacks.
Again, strictly more compatible to what? How I see it, it does not really "matches the enumeration sequence used by Linux". So what does it mean "more compatible"?
There was a problem hiding this comment.
Updated the commit message. Removed the "Linux (hub_port_init) and xHCI" comparison and the "strictly more compatible" framing. The reorder addresses an enumeration failure on the Quectel EG916Q-GL (returns bNumConfigurations=0 on the 18-byte GET_DESCRIPTOR at address 0); ECM dongle continues to enumerate correctly with the new order.
| } | ||
|
|
||
| /* Skip the interface if the head is interface */ | ||
| if (usbh_desc_is_valid_interface(head)) { |
There was a problem hiding this comment.
usbh_desc_get_next_function() incorrectly sets skip_num=1
when starting from an interface descriptor.
Why incorrectly? I think that is what usbh_desc_get_next_function() should do. @josuah ?
There was a problem hiding this comment.
Boot log WITH skip_num=1 (current code):
Claimed 2c7c:6007 iface 0
Claimed 2c7c:6007 iface 2
not matching interface 4
Interfaces 1, 3 never appear - completely skipped.
Boot log WITHOUT skip_num=1 (this fix):
Claimed 2c7c:6007 iface 0
Claimed 2c7c:6007 iface 1
Claimed 2c7c:6007 iface 2
Claimed 2c7c:6007 iface 3
not matching interface 4
All vendor-class interfaces probed.
The issue is that usbh_desc_get_next() already advances
past the current descriptor before the while loop runs.
When called with interface 0:
- skip_num set to 1
- usbh_desc_get_next() advances past interface 0
- Loop hits interface 1
- skip_num=1, decrement, SKIP interface 1
- Loop hits interface 2, skip_num=0, return
Interface 1 is skipped because skip_num consumes the
NEXT interface instead of the current one. For IADs this
logic is correct since bInterfaceCount covers grouped
interfaces. For plain interfaces there is nothing to skip.
There was a problem hiding this comment.
Sorry for the long wait.
I see how this is a bug and how it was not visible until a class with only one descriptor per interface triggers it.
fc15953 to
ed5a651
Compare
|
@MarkWangChinese @carlescufi ping for review. Thanks. |
|
@jfischer-no ping for review. Thanks. |
| * HS EP0 always uses 64-byte packets per USB 2.0 spec (5.5.3). | ||
| * For FS the actual mps0 is unknown until the device descriptor | ||
| * is read, but 64 is a safe upper bound for the initial 8-byte | ||
| * read since wLength fits in a single packet at any FS mps0. |
There was a problem hiding this comment.
Please use high-speed/full-speed devices instead of HS/FS.
/*
* High-speed devices control endpoint always uses 64 bytes MPS0 per
* USB 2.0 spec (5.5.3). For full-speed devices, the actual MPS0 is
* unknown until the device descriptor is read, but 64 bytes is a safe
* upper bound for the initial 8-byte control request, since data fits
* in a single packet at any full-speed device.
*/Please also fix it in the commit message.
The commit message also states:
...which rejects the 64-byte response from any HS device as a babble
error.
Does it mean that your USB device responses with 64 bytes data stage despite the requested length is 8 bytes? Btw, you mentioned EHCI controller, what platform are you using? @AidenHu are you aware of it?
There was a problem hiding this comment.
Hi @jfischer-no
In my previous host video test case, for the first 8 bytes device desc request, device responses 8 bytes correctly.
I do not see a device responses more data than the length host requests.
There was a problem hiding this comment.
@Girinandha-M There are still open questions.
You mentioned EHCI controller, what platform/driver are you using?
There was a problem hiding this comment.
Platform: mimxrt1064_evk. Driver: uhc_mcux (drivers/usb/uhc/uhc_mcux_ehci.c), which is the EHCI variant of the NXP MCUX USB host driver.
There was a problem hiding this comment.
Platform: mimxrt1064_evk. Driver: uhc_mcux (drivers/usb/uhc/uhc_mcux_ehci.c), which is the EHCI variant of the NXP MCUX USB host driver.
In that case, I doubt your first commit is fixing anything. You can remove it from the PR to unblock it.
There was a problem hiding this comment.
Platform: mimxrt1064_evk. Driver: uhc_mcux (drivers/usb/uhc/uhc_mcux_ehci.c), which is the EHCI variant of the NXP MCUX USB host driver.
In that case, I doubt your first commit is fixing anything. You can remove it from the PR to unblock it.
@jfischer-no I want to make sure I understand correctly. Are you saying the first commit should be removed entirely from this PR? Or split into a separate PR?
On my platform (mimxrt1064_evk, uhc_mcux_ehci), without this commit the first 8-byte GET_DESCRIPTOR fails with err=-116 (timeout). No data is received at all.
@AidenHu which platform/UHC driver did you test on? The behavior may be controller-specific.
There was a problem hiding this comment.
@Girinandha-M and @jfischer-no
I used rd_rw612_bga board with uhc_mcux_ehci driver.
There was a problem hiding this comment.
On my platform (mimxrt1064_evk, uhc_mcux_ehci), without this commit the first 8-byte GET_DESCRIPTOR fails with err=-116 (timeout). No data is received at all.
This contradicts your previous statement in the commit message 9198b3d, and I asked you about it in #106802 (comment).
There was a problem hiding this comment.
@jfischer-no You are right. The commit message said "babble error" but the actual failure is a timeout at -116 with no data delivered - two different mechanisms. Root cause is the net_buf size inflation in usb_pool_data_alloc that MarkWang fixed in #108620, verified on mimxrt1064_evk. Dropping 9198b3d from this PR.
| err = usbh_device_set_address(udev, new_addr); | ||
| if (err) { | ||
| goto error; | ||
| } | ||
|
|
||
| err = alloc_device_address(udev, &new_addr); | ||
| LOG_INF("New device with address %u state %u", udev->addr, udev->state); | ||
|
|
||
| err = usbh_req_desc_dev(udev, sizeof(udev->dev_desc), &udev->dev_desc); |
There was a problem hiding this comment.
Both orderings are
permitted by the USB 2.0 spec, but the new order is strictly
more compatible and matches the enumeration sequence used by
Linux (hub_port_init) and xHCI host stacks.
Again, strictly more compatible to what? How I see it, it does not really "matches the enumeration sequence used by Linux". So what does it mean "more compatible"?
ed5a651 to
54ad112
Compare
|
@jfischer-no @josuah @tmon-nordic ping for review. Thanks |
| * in a single packet at any full-speed device. | ||
| */ | ||
| udev->dev_desc.bMaxPacketSize0 = 8; | ||
| udev->dev_desc.bMaxPacketSize0 = 64; |
There was a problem hiding this comment.
High-speed device control endpoints are required to declare
bMaxPacketSize0 of 64 per USB 2.0 spec (5.5.3). The host
controller programs the EP0 pipe from this field, so the
current default of 8 leaves EP0 misconfigured for high-speed
and enumeration fails.
Could you help to explain why the enumeration fails? From my understand, the first control transfer only get 8 bytes, the mps = 8 should be OK. If you don't know the root cause. since you are using mimxrt1064_evk, how can I re-produce the issue easily? Thanks.
There was a problem hiding this comment.
I can re-produce the issue that enumeration fail, the root case is: The mps is 8, and the ch9 request 8 bytes, but the net_buf alloc 32 bytes because of USB_BUF_ROUND_UP. So the nxp uhc ehci use 32 (net_buf_tailroom(buf) is 32) to receive data, and the controller keeps to receive more data after receiving the first 8 bytes.
I do one fix in #108620, @Girinandha-M please help to verify, I verified the enumeration in my side.
There was a problem hiding this comment.
@MarkWangChinese Verified on mimxrt1064_evk with a Quectel EG916Q-GL modem (HS). With #108620 applied and bMaxPacketSize0 kept at 8, GET_DESCRIPTOR(DEVICE, 8) completes correctly and full enumeration proceeds. Your analysis of the net_buf size inflation is the actual root cause. Dropping the bMaxPacketSize0 = 64 commit from this PR.
Move alloc_device_address() and usbh_device_set_address() before reading the full device descriptor. the current sequence reads the full 18-byte device descriptor at address 0 before assigning a device address. this causes enumeration failure on the Quectel EG916Q-GL, which returns bNumConfigurations=0 on the second GET_DESCRIPTOR request at address 0. reorder the sequence to assign the address after the initial 8-byte read, and fetch the full 18-byte descriptor using the newly assigned address. the new order works around devices that fail the second GET_DESCRIPTOR at address 0, while remaining functional on devices that enumerate correctly with the original order (verified on ECM dongle). Signed-off-by: Girinandha Manivelpandiyan <girinandha@linumiz.com>
usbh_desc_get_next_function() sets skip_num=1 when starting from a plain interface descriptor. usbh_desc_get_next() already advances past the input, so an additional skip_num=1 makes the loop consume one extra interface beyond the current one. Devices with consecutive plain interfaces (e.g. Quectel EG916Q-GL with 4 vendor interfaces) only get ifaces 0 and 2 probed; 1 and 3 are silently skipped. Drop the assignment. The IAD branch is unaffected since its skip_num comes from bInterfaceCount, which counts interfaces grouped inside the IAD, not the IAD descriptor itself. Signed-off-by: Girinandha Manivelpandiyan <girinandha@linumiz.com>
54ad112 to
f78700d
Compare
|



Three bugs in the USB host stack prevent HS multi-interface devices from enumerating and probing correctly.
Found during development of the vendor serial class driver (PR #99173) while testing with Quectel EG916Q-GL on mimxrt1064_evk.
Commit 1: bMaxPacketSize0 hardcoded to 8 for all devices.
USB 2.0 section 5.5.3 requires 64 for HS. Causes control transfer timeout on any HS device.
Commit 2: Full device descriptor read happens at address 0 before address assignment.
Some HS devices do not respond to a second descriptor read at address 0, returning bNumConfigurations=0.
Commit 3: usbh_desc_get_next_function() skips every other interface due to incorrect skip_num=1 on plain interface
descriptors. A 6-interface device only gets 0, 2, 4 probed.
Tested on mimxrt1064_evk + Quectel EG916Q-GL (USB 2.0 HS,
6 interfaces). All interfaces now enumerate and probe correctly.