RFC: A layered approach to netdev #7736

miri64 · 2017-10-13T19:40:50Z

@haukepetersen and I talked a bit today about the problems with netdev_t and how to dynamically add features independently of a specific device. Examples of this are:

common link-layer specific operations (e.g. IEEE 802.15.4 options, currently implemented in netdev_ieee802154_t, or deduplication of received packets, currently implemented in gnrc_netif_ieee802154: drop duplicate broadcast packets (optionally) #7577)
link-layer address filtering (currently implemented in the link-layer specific sub-layer)
packet-counting / collecting network statistics (currently implemented on a per-device basis)
device-independent retransmissions (see gnrc_netdev2: link-layer retransmissions outside the transceiver driver #4795)

This is what we came up with:

In addition to actual device drivers, there are additional netdev_driver_t implementations that each have another netdev_driver_t associated with them (I called this construction netdev_layered_t here, but I'm not stuck on the name). This allows a network stack to use those features as though they were a normal network device, and the whole thing would be very transparent to the network stack (so I wouldn't count this as a major API change ;-)). The benefits of this should be clear:

no code duplication for link-layer specific operations (like packet counting, link-layer address filtering, option setting/getting)
simplification of the netdev_ieee802154_set()/netdev_ieee802154_get() situation that seems to confuse people
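The construction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `ex_netdev_t`, `ex_driver_t` and `ex_stats_layer_t` are hypothetical, reduced stand-ins (only `send` and `get`), not RIOT's actual `netdev_t` or `netdev_driver_t`, and the packet-counting layer is just one possible example of a layer.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for netdev_t and netdev_driver_t (hypothetical,
 * reduced to two functions; not RIOT's actual definitions). */
typedef struct ex_netdev ex_netdev_t;

typedef struct {
    int (*send)(ex_netdev_t *dev, const void *data, size_t len);
    int (*get)(ex_netdev_t *dev, int opt, void *value, size_t max_len);
} ex_driver_t;

struct ex_netdev {
    const ex_driver_t *driver;
};

/* A layer is a netdev that additionally points to the netdev below it. */
typedef struct {
    ex_netdev_t netdev;   /* first member: a layer pointer is a valid netdev pointer */
    ex_netdev_t *lower;   /* next layer or the actual device driver */
    unsigned tx_count;    /* the feature this layer adds */
} ex_stats_layer_t;

static int stats_send(ex_netdev_t *dev, const void *data, size_t len)
{
    ex_stats_layer_t *layer = (ex_stats_layer_t *)dev;
    int res = layer->lower->driver->send(layer->lower, data, len);
    if (res >= 0) {
        layer->tx_count++;   /* count only successful transmissions */
    }
    return res;
}

/* get() adds nothing in this layer: it is a pure call-down. */
static int stats_get(ex_netdev_t *dev, int opt, void *value, size_t max_len)
{
    ex_stats_layer_t *layer = (ex_stats_layer_t *)dev;
    return layer->lower->driver->get(layer->lower, opt, value, max_len);
}

const ex_driver_t ex_stats_driver = { .send = stats_send, .get = stats_get };

void ex_stats_layer_init(ex_stats_layer_t *layer, ex_netdev_t *lower)
{
    assert(layer != NULL && lower != NULL);
    layer->netdev.driver = &ex_stats_driver;
    layer->lower = lower;
    layer->tx_count = 0;
}

/* Dummy device driver at the bottom of the stack. */
static int radio_send(ex_netdev_t *dev, const void *data, size_t len)
{
    (void)dev;
    (void)data;
    return (int)len;   /* pretend all bytes were sent */
}

static int radio_get(ex_netdev_t *dev, int opt, void *value, size_t max_len)
{
    (void)dev; (void)opt; (void)value; (void)max_len;
    return -1;         /* no options supported in this sketch */
}

const ex_driver_t ex_radio_driver = { .send = radio_send, .get = radio_get };
```

The network stack would only ever hold a pointer to the topmost `ex_netdev_t` and call through its driver table, so it would not need to know how many layers sit underneath.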

The questions that are still open are:

Where to store netdev_layered_t instances?
How to initialize such a "netdev_layered_t" stack while also keeping it configurable?

miri64 · 2017-10-13T19:45:31Z

Another thing to consider is that this could waste a few bytes of memory, since any of these layers might just call the next one for certain operations without adding any new functionality (e.g. a netstats counter wouldn't be involved in set() or isr(), wasting basically 8 bytes of RAM and a bit of ROM for the call-down).

miri64 · 2017-10-13T19:45:52Z

But that might be premature optimization ;-)

jnohlgard · 2017-10-14T10:14:57Z

Since many of the layers will only hook into one or two of the functions, another approach could be to change the functions themselves to linked lists, where the next function in the list is called if the argument is not handled by the current function. That would reduce memory usage when the layer implements 3 or fewer of the 6 API functions, and it would reduce the latency for the functions which have few hooks (netdev_t::isr for example).
I don't have any solution to where and how to allocate these list items though.

Another important consideration: How do we decide in which order the layers should be linked? Do we need a kind of sorting key, or do we just add them in the order they are initialized during application start?

Edit: The memory reduction assumes that a function pointer and a pointer to a struct uses the same number of bytes in memory.
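The per-function linked list could look roughly like the sketch below, shown for a `get` function only. All names (`ex_get_hook_t`, `EX_NOT_HANDLED`, the option constants) are hypothetical, and the ordering rule used here (hooks added later are asked first) is just one possible answer to the ordering question above.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NOT_HANDLED (-1)   /* hypothetical "ask the next hook" result */

enum { EX_OPT_CHANNEL, EX_OPT_TX_COUNT, EX_OPT_OTHER };

/* One list item per hooked function: a next pointer and a function pointer. */
typedef struct ex_get_hook ex_get_hook_t;
struct ex_get_hook {
    ex_get_hook_t *next;
    int (*get)(ex_get_hook_t *self, int opt, int *value);
};

/* Prepend a hook, so hooks added later are consulted first. */
void ex_get_hook_add(ex_get_hook_t **head, ex_get_hook_t *hook)
{
    assert(head != NULL && hook != NULL);
    hook->next = *head;
    *head = hook;
}

/* Walk the list until one hook handles the option. */
int ex_get(ex_get_hook_t *head, int opt, int *value)
{
    for (ex_get_hook_t *hook = head; hook != NULL; hook = hook->next) {
        int res = hook->get(hook, opt, value);
        if (res != EX_NOT_HANDLED) {
            return res;
        }
    }
    return EX_NOT_HANDLED;
}

/* Device driver: only knows the channel. */
typedef struct {
    ex_get_hook_t hook;   /* first member, so the hook pointer can be cast back */
    int channel;
} ex_radio_t;

static int radio_get(ex_get_hook_t *self, int opt, int *value)
{
    ex_radio_t *radio = (ex_radio_t *)self;
    if (opt != EX_OPT_CHANNEL) {
        return EX_NOT_HANDLED;
    }
    *value = radio->channel;
    return 0;
}

/* Statistics layer: only knows the packet counter. */
typedef struct {
    ex_get_hook_t hook;
    int tx_count;
} ex_stats_t;

static int stats_get(ex_get_hook_t *self, int opt, int *value)
{
    ex_stats_t *stats = (ex_stats_t *)self;
    if (opt != EX_OPT_TX_COUNT) {
        return EX_NOT_HANDLED;
    }
    *value = stats->tx_count;
    return 0;
}

void ex_radio_init(ex_radio_t *radio, int channel)
{
    radio->hook.next = NULL;
    radio->hook.get = radio_get;
    radio->channel = channel;
}

void ex_stats_init(ex_stats_t *stats)
{
    stats->hook.next = NULL;
    stats->hook.get = stats_get;
    stats->tx_count = 0;
}
```

A function with no extra hooks (isr, for example) would have a list containing only the device driver, so a call to it would not pass through any layer.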

jnohlgard · 2017-10-14T10:40:55Z

I do like the ideas here, it would certainly be possible to simplify MAC layer implementations if the netstats stuff could be broken out into its own layer, just counting the number of packets passing back and forth. And though it would be working the same way as today, I think it would make netdev_ieee802154_send/recv more visible. The monkey patching of send and recv being done by netdev_ieee802154_init in the current implementation was not at all obvious to me until I saw the call chain in a backtrace in the debugger from inside the send of my netdev driver, but maybe I was just being blind.

Finding a solution for the allocation issue would mean that we also could potentially move the extra members in gnrc_netdev_t added by preprocessor conditions on MODULE_GNRC_MAC (https://github.com/RIOT-OS/RIOT/blob/master/sys/include/net/gnrc/netdev.h#L119-L163). It breaks ABI compatibility between builds to have public API structs change members depending on the included modules.

kaspar030 · 2017-10-14T19:55:00Z

I don't see news here. Just bad documentation and its effects.

Netdev2 has been designed to be stackable from the beginning. Just add a "parent" pointer to any netdev_t and add some logic to pass send/recv/get/set calls on to that parent.

Was anyone listening when I said "implement mac layers stacked on top of the device drivers using netdev"? edit sorry, that came out a lot more rude than intended.

jnohlgard · 2017-10-15T06:41:37Z

@kaspar030 thank you, I think using netdev is a good solution for a MAC layer, but there's still the issue of allocating it somewhere for each network device. Do you have any suggestions on how to tackle that?

kaspar030 · 2017-10-23T21:56:34Z

> there's still the issue of allocating it somewhere for each network device. Do you have any suggestions on how to tackle that?

Not really. If it is not possible to do that statically (in a not completely ugly way), we could consider allowing some kind of oneway-malloc that can be used only in auto-init.

jnohlgard · 2017-10-24T04:30:12Z

I think 1-way malloc is fine on init during boot. Maybe it would be an idea to add a 1-way malloc which works during boot, but after a certain function is called (malloc_finalize or something), any further call to that malloc causes a panic, to prevent inadvertent malloc use and running out of memory. That would suit systems which don't need dynamic allocation and want to be able to ensure stability.
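A minimal sketch of such a boot-only allocator is shown below. Everything here is an assumption for illustration: the names `ex_oneway_malloc()` and `ex_oneway_malloc_finalize()`, the fixed pool size, and `ex_panic()`, which only counts calls where a real system would halt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_POOL_SIZE 256   /* hypothetical pool size */

static _Alignas(max_align_t) uint8_t ex_pool[EX_POOL_SIZE];
static size_t ex_pool_used;
static bool ex_pool_finalized;
unsigned ex_panic_count;

/* Stand-in for a kernel panic; a real system would stop here. */
static void ex_panic(void)
{
    ex_panic_count++;
}

/* Bump allocator without free(): hands out memory until finalized. */
void *ex_oneway_malloc(size_t size)
{
    if (ex_pool_finalized) {
        ex_panic();          /* allocation after boot is a bug */
        return NULL;
    }
    /* Round up so every returned pointer keeps maximum alignment. */
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded < size || rounded > EX_POOL_SIZE - ex_pool_used) {
        return NULL;         /* overflow or pool exhausted */
    }
    void *ptr = &ex_pool[ex_pool_used];
    ex_pool_used += rounded;
    return ptr;
}

/* Called once at the end of auto-init: no allocations from here on. */
void ex_oneway_malloc_finalize(void)
{
    ex_pool_finalized = true;
}
```

Because nothing is ever freed, the pool usage after boot is fixed and can be sized once per application.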

miri64 · 2017-11-15T18:00:25Z

Added the idea of #4795 to the list above to not lose track of it.

bergzand · 2017-11-27T11:53:21Z

I've been thinking a bit about this the last few days since I have a few network features in mind which could greatly benefit from "dynamic" allocation per network device. Most notably at the moment #6873 where ETX tracking doesn't make a lot of sense on wired interfaces.

> Not really. If it is not possible to do that statically (in a not completely ugly way), we could consider allowing some kind of oneway-malloc that can be used only in auto-init.

Would it be possible to reuse the memory space of a thread for this? Stack starting on one end, "one way heap" on the other end. Maybe even use a thread flag to indicate that this one way malloc is allowed for the thread, as to have a way to enforce restrictions on this. As said before, the malloc could be really simple when assumed that a free() is not necessary.

miri64 · 2017-11-27T12:15:40Z

> Would it be possible to reuse the memory space of a thread for this? Stack starting on one end, "one way heap" on the other end. Maybe even use a thread flag to indicate that this one way malloc is allowed for the thread, as to have a way to enforce restrictions on this. As said before, the malloc could be really simple when assumed that a free() is not necessary.

There are no threads in netdev and not supposed to be, so no.

bergzand · 2017-11-28T09:40:26Z

For now, I'm trying to ~~solve~~ get some discussion going on the issue of where to store the netdev_layered_t data if they are not statically allocated at compile time.

> There are no threads in netdev and not supposed to be, so no.

Just so that I understand what you're saying here, the netdev radio drivers are not thread aware, so no calls to threading functions right? Somewhere up in the stack there has to be some kind of thread running an event loop controlling the drivers somehow right (gnrc_netif or gnrc_netdev)?

miri64 · 2017-11-28T11:05:27Z

> Just so that I understand what you're saying here, the netdev radio drivers are not thread aware, so no calls to threading functions right? Somewhere up in the stack there has to be some kind of thread running an event loop controlling the drivers somehow right (gnrc_netif or gnrc_netdev)?

Yes, the threading, if required, is provided by the network stack, but netdev's event management (the isr()+event_callback() bootstrap) allows for totally thread-less environments as well.
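The isr()+event_callback() bootstrap can be sketched without any thread, as below. This is a reduced model under assumptions: `ex_netdev_t`, the two event values and the simulated interrupt `ex_hw_irq()` are hypothetical stand-ins, and the "stack" is just a flag polled from a main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins; RIOT's netdev_t carries more fields. */
typedef enum { EX_EVENT_ISR, EX_EVENT_RX_COMPLETE } ex_event_t;

typedef struct ex_netdev ex_netdev_t;
struct ex_netdev {
    void (*isr)(ex_netdev_t *dev);
    void (*event_callback)(ex_netdev_t *dev, ex_event_t event);
    bool hw_rx_pending;   /* simulated hardware state */
};

/* Stack-side state for a thread-less setup: a flag polled from the main loop. */
static volatile bool ex_isr_requested;
unsigned ex_rx_frames;

static void ex_stack_event_cb(ex_netdev_t *dev, ex_event_t event)
{
    (void)dev;
    if (event == EX_EVENT_ISR) {
        ex_isr_requested = true;   /* interrupt context: only take a note */
    }
    else if (event == EX_EVENT_RX_COMPLETE) {
        ex_rx_frames++;            /* normal context: process the frame */
    }
}

/* Driver isr(): called by the stack outside interrupt context. */
static void ex_driver_isr(ex_netdev_t *dev)
{
    if (dev->hw_rx_pending) {
        dev->hw_rx_pending = false;
        dev->event_callback(dev, EX_EVENT_RX_COMPLETE);
    }
}

void ex_netdev_setup(ex_netdev_t *dev)
{
    assert(dev != NULL);
    dev->isr = ex_driver_isr;
    dev->event_callback = ex_stack_event_cb;
    dev->hw_rx_pending = false;
}

/* Simulated hardware interrupt: the driver only signals upwards. */
void ex_hw_irq(ex_netdev_t *dev)
{
    dev->hw_rx_pending = true;
    dev->event_callback(dev, EX_EVENT_ISR);
}

/* One iteration of a thread-less main loop. */
void ex_poll(ex_netdev_t *dev)
{
    if (ex_isr_requested) {
        ex_isr_requested = false;
        dev->isr(dev);
    }
}
```

A threaded stack would replace the flag with a message or event sent to its own thread; the driver side stays the same.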

bergzand · 2017-11-28T12:56:19Z

> Yes, the threading, if required, is provided by the network stack, but netdev's event management (the isr()+event_callback() bootstrap) allows for totally thread-less environments as well.

Makes sense.

This whole idea of (one-time dynamic) allocation of the netdev_layered_t structs doesn't necessarily have to happen inside netdev, right? What I'm thinking of is a structure where the process/functions controlling a netdev_t instance (gnrc_netif or a different threadless design) also configure it with the required layered structs. How this allocation then happens is the problem of the functions above netdev, and netdev doesn't have to care whether the allocation happens dynamically or statically, as long as it receives a valid pointer to a netdev_layered_t struct.
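A rough sketch of that split of responsibilities: the netdev side only links a layer it is handed, while the code above it owns the storage. The names `ex_netdev_push_layer()`, `ex_netif_t` and the three example layers are hypothetical, and static members are used here only as one of the possible allocation strategies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced layer type: only the link to the layer below. */
typedef struct ex_netdev ex_netdev_t;
struct ex_netdev {
    ex_netdev_t *lower;
};

/* netdev side: link a caller-provided layer on top of the current stack.
 * No allocation happens here; the pointer just has to be valid. */
ex_netdev_t *ex_netdev_push_layer(ex_netdev_t *top, ex_netdev_t *layer)
{
    assert(layer != NULL);
    layer->lower = top;
    return layer;
}

/* Count the layers from the top down to the device driver. */
size_t ex_netdev_depth(const ex_netdev_t *top)
{
    size_t depth = 0;
    for (; top != NULL; top = top->lower) {
        depth++;
    }
    return depth;
}

/* Interface side: owns the storage. Static members are used here, but the
 * pointers could equally come from a boot-time allocator. */
typedef struct {
    ex_netdev_t radio;        /* the actual device driver */
    ex_netdev_t ieee802154;   /* link-layer specific layer */
    ex_netdev_t stats;        /* packet counting layer */
    ex_netdev_t *top;         /* what the network stack talks to */
} ex_netif_t;

void ex_netif_init(ex_netif_t *netif)
{
    assert(netif != NULL);
    netif->top = ex_netdev_push_layer(NULL, &netif->radio);
    netif->top = ex_netdev_push_layer(netif->top, &netif->ieee802154);
    netif->top = ex_netdev_push_layer(netif->top, &netif->stats);
}
```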

miri64 · 2017-11-28T12:58:19Z

Sounds like a nice idea. Would you maybe be willing to provide a proof of concept? I would do it myself, but I currently don't have enough time at hand for that. :-/

bergzand · 2017-11-28T13:03:17Z

Sure, not going to promise anything as my time is limited too, but it sounds like a fun challenge :)

bergzand · 2017-12-05T14:46:40Z

While implementing this for netdev, I was wondering whether this layered approach is possible for netif too. Maybe MAC layer protocols such as LWMAC and GoMacH could benefit from this approach.

miri64 · 2017-12-05T15:01:50Z

Could be.

miri64 · 2018-07-16T13:48:10Z

In general this work aims to reduce code complexity. The layers this issue talks about already exist in part and existed in the referenced paper as well (see modules like netdev_ieee802154 and netdev_ethernet). However, the calling hierarchy is quite messed up at the moment, and private member fields are touched in places where they are not supposed to be touched (and I admit that this is to 98% my fault ;-)). One example: when a radio specific option is retrieved for a network interface via gnrc_netapi, the interface's thread first asks the radio driver. If the radio driver doesn't have it, it asks the IEEE 802.15.4 netdev layer. This is supposed to be the other way around. However, to isolate IEEE 802.15.4 specific options (such as addresses, PAN ID, header flags) and to reduce code duplication, the comparably complicated construction of the IEEE 802.15.4 header (which has variable length) is also moved to that module in #9417. That's all that is happening, and it shouldn't have much influence on performance (which should be proven of course).
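The intended lookup order (IEEE 802.15.4 layer first, radio driver only for what the layer does not know) could be sketched like this. The types, option names and functions are hypothetical simplifications; only the use of negative errno values such as -ENOTSUP follows the convention netdev drivers use for unsupported options.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option identifiers for this sketch. */
typedef enum { EX_OPT_PAN_ID, EX_OPT_TX_POWER, EX_OPT_UNKNOWN } ex_opt_t;

typedef struct {
    int16_t tx_power;     /* radio specific option */
} ex_radio_t;

typedef struct {
    ex_radio_t *radio;    /* the layer below */
    uint16_t pan_id;      /* IEEE 802.15.4 specific option */
} ex_ieee802154_t;

/* Bottom of the stack: the radio only answers radio specific options. */
int ex_radio_get(const ex_radio_t *radio, ex_opt_t opt, void *value, size_t max_len)
{
    assert(radio != NULL && value != NULL);
    switch (opt) {
        case EX_OPT_TX_POWER:
            if (max_len < sizeof(radio->tx_power)) {
                return -EOVERFLOW;
            }
            memcpy(value, &radio->tx_power, sizeof(radio->tx_power));
            return (int)sizeof(radio->tx_power);
        default:
            return -ENOTSUP;
    }
}

/* The 802.15.4 layer is asked first and forwards what it does not own. */
int ex_ieee802154_get(const ex_ieee802154_t *layer, ex_opt_t opt, void *value,
                      size_t max_len)
{
    assert(layer != NULL && value != NULL);
    switch (opt) {
        case EX_OPT_PAN_ID:
            if (max_len < sizeof(layer->pan_id)) {
                return -EOVERFLOW;
            }
            memcpy(value, &layer->pan_id, sizeof(layer->pan_id));
            return (int)sizeof(layer->pan_id);
        default:
            return ex_radio_get(layer->radio, opt, value, max_len);
    }
}
```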

bergzand · 2018-11-12T15:48:12Z

Okay, I'd like to have some opinions again.

I'm currently looking at the way link layer addresses are initialized. My main issue with the current architecture is that the link layer address generated by the device driver is passed up and down the stack.

Problem

The device driver generates an address based on the luid module. This is written to the netdev_t struct by the device driver. gnrc_netif requests the address from the device driver, where it is then also stored.

My problem here is that (1) data is written directly across "layers", by the device driver into the netdev(_ieee802154)_t struct, and (2) in a setup where multiple layers require knowledge of the link layer address, there is no way to guarantee that they all have it.

Solution

My current solution would be to have the higher layer (gnrc_netif or the netdev glue layers) generate the link layer address and set it in the device driver. This way it has to traverse all netdev layers, giving all layers explicit knowledge of the new link layer address.

This implementation would need to check whether the netdev stack uses a link layer address at all, and whether the device driver provides a link layer address of its own, in which case that address is used.

miri64 · 2018-11-12T15:51:45Z

Didn't we "solve" the luid problem already by using luid_custom() with the netdev's pointer instead of luid_get() (see #9656 (comment))?

bergzand · 2018-11-12T15:54:05Z

> Didn't we "solve" the luid problem already by using luid_custom() with the netdev's pointer instead of luid_get() (see #9656 (comment))?

That only solves changes to the link layer address when the device driver is reset, but that doesn't solve my problem number 1 and number 2 here (right?) :)

miri64 · 2018-11-12T15:55:02Z

Well, problems 1 and 2 won't arise when the reset is idempotent ;-).

miri64 · 2018-11-12T15:56:24Z

Or am I misunderstanding them? :-/

bergzand · 2018-11-12T16:04:58Z

As a practical (or less hypothetical) example, the nrf52840 doesn't have hardware filtering. The software filter could be implemented as a netdev layer. This extra filter layer would then require knowledge of both the generated link layer address and the PAN ID. IMHO the easiest way for this layer to get the link layer address would be if it could grab the information from a netdev::set call.

> Or am I misunderstanding them? :-/

That just means I didn't explain them well enough :)

I'm trying to remove this write by the device driver to the netdev_ieee802154_t struct. At the same time I'm trying to get to a solution that is usable when in the future multiple layers require knowledge of the link layer address.

A different solution for the second issue might be to do a netdev::get call for the link layer address in the init function of the layer and also listen for any netdev::set call that changes the link layer address. Not sure yet whether this always works though.
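The software filter case could be sketched as a layer that picks up the address and PAN ID from set calls passing through it. This is an illustration under assumptions: the types and option names are hypothetical, only 16-bit short addresses are handled, and 0xffff is treated as the IEEE 802.15.4 broadcast value for both PAN ID and short address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_BROADCAST 0xffffU   /* 802.15.4 broadcast PAN ID / short address */

/* Hypothetical option identifiers for this sketch. */
typedef enum { EX_OPT_SHORT_ADDR, EX_OPT_PAN_ID, EX_OPT_CHANNEL } ex_opt_t;

/* Radio without hardware address filtering: it only cares about the channel. */
typedef struct {
    uint16_t channel;
} ex_radio_t;

/* Filter layer: keeps its own copy of what it needs for filtering. */
typedef struct {
    ex_radio_t *lower;
    uint16_t short_addr;
    uint16_t pan_id;
} ex_filter_t;

int ex_radio_set(ex_radio_t *radio, ex_opt_t opt, uint16_t value)
{
    assert(radio != NULL);
    if (opt == EX_OPT_CHANNEL) {
        radio->channel = value;
    }
    return 0;
}

/* set() passes through the layer, so the layer can learn from it. */
int ex_filter_set(ex_filter_t *filter, ex_opt_t opt, uint16_t value)
{
    assert(filter != NULL);
    if (opt == EX_OPT_SHORT_ADDR) {
        filter->short_addr = value;
    }
    else if (opt == EX_OPT_PAN_ID) {
        filter->pan_id = value;
    }
    return ex_radio_set(filter->lower, opt, value);   /* always forward */
}

/* Decide whether a received frame is meant for this node. */
bool ex_filter_accept(const ex_filter_t *filter, uint16_t dst_pan, uint16_t dst_addr)
{
    bool pan_ok = (dst_pan == filter->pan_id) || (dst_pan == EX_BROADCAST);
    bool addr_ok = (dst_addr == filter->short_addr) || (dst_addr == EX_BROADCAST);
    return pan_ok && addr_ok;
}
```

With this arrangement, whoever generates the address only has to issue one set call at the top of the stack; no layer needs to read another layer's struct.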

bergzand · 2018-11-13T13:53:49Z

These "blockers" are all cases where data is directly read from a netdev struct at a position where, in a layered module, the content of this struct can't be guaranteed. The list might be expanded in the future.

Blockers:

Link layer address generation:
As done here (called in the reset), a lot of drivers generate the address and directly set it in the netdev_ieee802154_t struct. Complicated to simply move to the netdev_ieee802154 code because some radios (socket_zep) generate the link layer address in a driver specific mode.
PAN id generation:
Similar to the LL address, the PAN id is "generated" in the reset function and written to the netdev_ieee802154_t struct in the pan_id setter. Cleaned and refactored in netdev_ieee802154/radios: refactor PAN ID reset to generic ieee802154 reset #10384
Channel
Another one comparable to the list above. Set again in the init of the driver and passed to the netdev_ieee802154_t struct in the channel setter after verification.
netstats:
Resolved with net stats: move layer 2 netstats from netdev driver to gnrc_netif #9793
numerous flags in the device drivers:
A number of device flags that are enabled/disabled are propagated to the netdev_ieee802154_t flags. Happens at least in the mrf24j40ma and at the kw2xrf driver. The at86rf2xx had this issue resolved with at86rf2xx: Move flags from netdev to radio #9581 and the mrf24j40 has mrf24j40: Move flags from netdev to radio #9583.
802.15.4 sequence number in gnrc_netif_ieee802154.c
The gnrc_netif::state is a pointer to the netdev_ieee802154_t struct. Direct access is done to grab and increment the sequence number. Has netdev_ieee802154: Use intermediate layer for mac header #9417 as a possible solution.
802.15.4 flags in gnrc_netif_ieee802154.c
Similar to the sequence number, flags are also directly read by the netif module. Has netdev_ieee802154: Use intermediate layer for mac header #9417 as a possible solution.
GOMACH and LWMAC
At least GOMACH has a number of places where the sequence number is manipulated. These should probably be refactored into a separate netdev_layer_t module.
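For the sequence number blockers, one conceivable way to avoid direct struct access is a small accessor owned by the IEEE 802.15.4 layer, as sketched below. The type and function names are hypothetical, and this is not necessarily how #9417 solves it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer state; the sequence number is private to the layer. */
typedef struct {
    uint8_t seq;
} ex_ieee802154_t;

/* Return the sequence number for the next frame and advance it.
 * The 8-bit counter wraps from 255 back to 0. */
uint8_t ex_ieee802154_next_seq(ex_ieee802154_t *layer)
{
    assert(layer != NULL);
    return layer->seq++;
}
```

Callers such as a netif or MAC module would then ask for the next sequence number instead of incrementing the field themselves.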

bergzand · 2018-11-15T14:18:24Z

Link layer address generation:

At the moment I think that the easiest way is to remove the link layer address from the netdev_ieee802154_t struct. This way the behaviour becomes identical to the behaviour of the ethernet drivers.

The main issue is that, for now, the address generated by the device driver has to be synced somehow with the netdev_ieee802154_t struct member. It is not easily possible to have the netdev_ieee802154_t layer generate the address, since some radios (socket_zep) have a built-in address which should be used instead.

The only places where the link layer address is requested are with the ifconfig shell command, when netif is requesting the l2 address to initialize its own copy, and with a SLAAC failure (after setting the address).

miri64 · 2018-11-15T14:22:00Z

> with a SLAAC failure (after setting the address).

DAD failure ;-).

bergzand · 2018-11-15T14:23:44Z

> DAD failure ;-).

That's what I was thinking, but not what I was writing :)

stale · 2019-08-10T03:08:59Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

miri64 assigned jnohlgard, haukepetersen, miri64, OlegHahm and kaspar030 Oct 13, 2017

miri64 mentioned this issue Nov 15, 2017

gnrc_netdev2: link-layer retransmissions outside the transceiver driver #4795

Closed

This was referenced Nov 30, 2017

[RFC] net/netstats L1/L2 per neighbor transfer statistics #6873

Closed

netdev: Initial implementation of a more layered approach to netdev #8198

Merged

This was referenced Jul 17, 2018

at86rf2xx: Move flags from netdev to radio #9581

Merged

mrf24j40: Move flags from netdev to radio #9583

Closed

bergzand mentioned this issue Nov 13, 2018

netdev_ieee802154/radios: refactor PAN ID reset to generic ieee802154 reset #10384

Merged

6 tasks

miri64 mentioned this issue Nov 15, 2018

at86rf2xx: Don't use netdev_ieee802154_t for link layer address #10401

Closed

This was referenced Nov 15, 2018

mrf24j40: Don't use netdev_ieee802154_t for link layer address #10402

Merged

cc2538_rf: Don't use netdev_ieee802154_t for link layer address #10425

Merged

cc2538_rf: Don't use netdev_ieee802154_t for channel #10426

Merged

miri64 mentioned this issue Nov 28, 2018

[RFC] gnrc_netif: Group multiple devices/connections into single thread? #10496

Open

This was referenced Dec 1, 2018

kw2xrf: Don't use netdev_ieee802154_t for link layer address #10534

Merged

kw2xrf: Don't use netdev_ieee802154_t for channel #10535

Merged

jnohlgard mentioned this issue Dec 10, 2018

drivers: Add support for KW41Z builtin transceiver #7107

Closed

stale bot added the State: stale State: The issue / PR has no activity for >185 days label Aug 10, 2019

miri64 added State: don't stale State: Tell state-bot to ignore this issue and removed State: stale State: The issue / PR has no activity for >185 days labels Aug 10, 2019

jia200x mentioned this issue Nov 13, 2019

RFC: lower network stack rework #12688

Closed

14 tasks

jia200x mentioned this issue Nov 21, 2019

Extensebility of netdev_driver_t API #12469

Closed

MrKevinWeiss added this to the Release 2021.07 milestone Jun 21, 2021

MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
