-
Notifications
You must be signed in to change notification settings - Fork 8.1k
drivers: can: mcux: flexcan: fix MB overrun when CAN FD enabled #95097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
drivers: can: mcux: flexcan: fix MB overrun when CAN FD enabled #95097
Conversation
|
Hello @william-tang914, and thank you very much for your first pull request to the Zephyr project! |
From the MCUX_FLEXCAN_MAX_MB implementation, seems zephyr/drivers/can/can_mcux_flexcan.c Lines 37 to 41 in 5eeddab
|
|
For the SOCs which have different MB number by FlexCAN instance, user should configure I insert a guard clause for zephyr/drivers/can/can_mcux_flexcan.c Lines 1188 to 1191 in 533681f
Another guard clause for zephyr/drivers/can/can_mcux_flexcan.c Lines 1196 to 1199 in 533681f
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henrikbrixandersen can you provide a review. This LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't test it on hardware, but I think I needs a few changes to be able to run on hw where instances have a different number of message buffers.
drivers/can/can_mcux_flexcan.c
Outdated
| data->tx_mb_number = max_mb - MCUX_FLEXCAN_MAX_RX; | ||
| data->rx_mb_number = MCUX_FLEXCAN_MAX_RX; | ||
|
|
||
| if (data->tx_mb_number > MCUX_FLEXCAN_MAX_TX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As tx_mb_number is scaled for canfd I think we need to adopt the MCUX_FLEXCAN_MAX_TX as well don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that MCUX_FLEXCAN_MAX_TX is static, unless update MCUX_FLEXCAN_MAX_MB by Kconfig.
drivers/can/can_mcux_flexcan.c
Outdated
| return -ENODEV; | ||
| } | ||
|
|
||
| data->tx_mb_number = max_mb - MCUX_FLEXCAN_MAX_RX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here we will always use all physically available mb of the CAN instance. Doesn't that make the CONFIG_CAN_MAX_MB kind of unneeded, as it will also have to be set to the max value to avoid getting an error in the following tx_mb_number check. And consequently I think it makes it impossible to use two CAN instances of differing max mb number, as we always assign the same value to flexcan_config.maxMbNum(reg MCR.MAXMB).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation, CONFIG_CAN_MAX_MB is used to statically define the size of tx_cbs and tx_allocs via MCUX_FLEXCAN_MAX_TX. If we remove this macro, we can no longer rely on compile-time constants for these arrays, and would need to switch to dynamic initialization instead.
While dynamic allocation is technically feasible, I'm not sure if it's appropriate to introduce such a change within this pull request, as it affects memory layout and initialization logic. It might be better to keep the static allocation for now, or handle the transition to dynamic allocation in a separate PR to keep changes scoped and easier to review.
zephyr/drivers/can/can_mcux_flexcan.c
Lines 119 to 126 in 533681f
| ATOMIC_DEFINE(rx_allocs, MCUX_FLEXCAN_MAX_RX); | |
| struct k_mutex rx_mutex; | |
| struct mcux_flexcan_rx_callback rx_cbs[MCUX_FLEXCAN_MAX_RX]; | |
| ATOMIC_DEFINE(tx_allocs, MCUX_FLEXCAN_MAX_TX); | |
| struct k_sem tx_allocs_sem; | |
| struct k_mutex tx_mutex; | |
| struct mcux_flexcan_tx_callback tx_cbs[MCUX_FLEXCAN_MAX_TX]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update flexcan_config.maxMbNum = MCUX_FLEXCAN_MAX_MB to flexcan_config.maxMbNum = max_mb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want go toward dynamic allocation either.
My concern is just that in case FSL_FEATURE_FLEXCAN_HAS_MESSAGE_BUFFER_MAX_NUMBERn(x) and MCUX_FLEXCAN_MAX_MB differ we might use more Buffers than configured in MCR.MAXMB.
And the following tx_mb_number check will not be triggered in FD mode.
I'll have to check this in more detail and will try to clarify what I meant, when I have more time, hopefully in the evening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @str4t0m , let me introduce intent of the Pull Request.
The maximum number of FlexCAN message buffers (mailboxes) is determined by the following factors:
- Whether CAN FD is enabled.
- If CAN FD is enabled, the amount of memory allocated per mailbox (in this case, 64 bytes).
- The number of memory blocks available to each FlexCAN instance. Each instance's memory blocks is different on some SoCs (e.g., the S32K3 series).
Based on these three factors, I calculate the actual maximum number of mailboxes for each FlexCAN instance. Then, using the number of receive mailboxes configured via Kconfig, I compute the remaining available transmit mailboxes. Finally, I assign the actual maximum mailbox count to MCR[MAXMB].(will update)
The check max_mb < MCUX_FLEXCAN_MAX_RX ensures that the user does not configure more receive mailboxes than the hardware can support.
The check data->tx_mb_number > MCUX_FLEXCAN_MAX_TX prevents out-of-bounds access when iterating over the tx_cbs array. Expanding this condition leads to max_mb > MCUX_FLEXCAN_MAX_MB, which can only occur if:
- The number of memory blocks differs between FlexCAN instances, and
- The user incorrectly sets MCUX_FLEXCAN_MAX_MB via Kconfig.
This PR aims to make the mailbox allocation more robust and adaptable to different SoC configurations, while also preventing misconfiguration-related issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fully resolve this issue, the driver would need to dynamically calculate the actual maximum number of mailboxes for each FlexCAN instance, or alternatively, this information could be provided via the device tree.
I think each SoC should disclose the amount of mailboxes for each FlexCAN instance in the SoC device tree. Similar how it's done for the FIFO sizes for LPSPI.
https://docs.zephyrproject.org/latest/build/dts/api/bindings/spi/nxp%2Clpspi.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been meaning to introduce dts properties for this for quite some time (it's also needed by the Zephyr-native FlexCAN driver, I am working on), but I have not gotten around to it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @henrikbrixandersen , I'm glad you are trying to add mailboxes property for Zephyr-native CAN driver. Would you mind sharing me your plan or timeline to fix this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @henrikbrixandersen , shall we apply a workaround before introduce mailbox to device tree?
On many platforms, there are only 7 mailboxes for CAN FD. So this pull request is necessary for those platfroms. Thanks a lot.
#91138
Fix message buffer (MB) overrun when CAN FD enabled by dynamically calculate the number of MB based on payload size and CAN instance capability. Maximum number of MB is less than classical CAN when CAN FD enabled. Add new data structure member 'tx_mb_number' and 'rx_mb_number', which reflect real TX MB and RX MB per FlexCAN instance can accommodate. Use 'tx_mb_number' and 'rx_mb_number' to define maximum semaphore count and traverse atomic bitmap, ensure safe concurrent access to shared resources on different FlexCAN instance. This commit assumes 'tx_mb_number' less than or equal to 'MCUX_FLEXCAN_MAX_TX'. For S32 platform, it requires suitable Kconfig symbol 'CAN_MAX_MB' value to ensure above relation. MCUX platform can ignore this limit. Fixes zephyrproject-rtos#92798 Signed-off-by: William Tang <[email protected]>
533681f to
16ba31a
Compare
|
|
What is the plan for this PR |




Fix message buffer (MB) overrun when CAN FD enabled by dynamically calculate the number of MB based on payload size and CAN instance capability. Maximum number of MB is less than classical CAN when CAN FD enabled.
Add new data structure member 'tx_mb_number' and 'rx_mb_number', which reflect real TX MB and RX MB per FlexCAN instance can accommodate.
Use 'tx_mb_number' and 'rx_mb_number' to define maximum semaphore count and traverse atomic bitmap, ensure safe concurrent access to shared resources on different FlexCAN instance.
This commit assumes 'tx_mb_number' less than or equal to 'MCUX_FLEXCAN_MAX_TX'. For S32 platform, it requires suitable Kconfig symbol 'CAN_MAX_MB' value to ensure above relation. MCUX platform can ignore this limit.
Fixes #92798