
RFC: Reduce the number of gnrc_netif threads #13762

Closed
jia200x opened this issue Mar 31, 2020 · 16 comments
Labels
Area: network (Networking)
State: duplicate (The issue/PR is a duplicate of another issue/PR)
State: stale (The issue/PR has no activity for >185 days)
Type: enhancement (The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation)

Comments

@jia200x
Member

jia200x commented Mar 31, 2020

Description

Currently GNRC netif allocates one thread per interface. Since all of these threads run at the same priority, the extra threads add no advantage; in practice they just allocate much more memory (stack space) than needed.

So, I'm opening this issue to decide how to proceed.
I see two alternatives:

  1. Make one global gnrc_netif thread that demuxes to the right gnrc_netif_xxx implementation:
    We could have only one gnrc_netif thread (just as there is one gnrc_ipv6 thread and one gnrc_udp thread).
    Instead of sending a packet to a given PID, we would send all packets to the same thread. That thread would drive the specific link layer (via gnrc_netif_ops).

  2. Make gnrc_netif agnostic to the source of events (like netdev):
    With this approach, gnrc_netif doesn't own any thread and requires someone else to process its events. For instance, we could reuse the event_thread module to handle interface events (send, recv, get, set).

Some open questions:

  • Regardless of 1. or 2., how do we pass the interface ID to gnrc_netif? With both approaches the interfaces need an ID independent of the PID (see netif: add functions to get and get by identifier  #12738). For gnrc_netif_send it's possible to use the netif header in the pktbuf, but how do we achieve the same for gnrc_netif_get and gnrc_netif_set?
  • Which approach makes more sense? I tend to think 2. could perform better, considering that MAC layer timers might be implemented with event_thread too. But 1. can also do the trick.

All feedback is more than welcome

Related PRs

@jia200x jia200x added the Area: network Area: Networking label Mar 31, 2020
@jia200x
Member Author

jia200x commented Mar 31, 2020

@miri64 @benpicco

@jia200x jia200x changed the title RFC: Reduce the number of gnrc_netif RFC: Reduce the number of gnrc_netif threads Mar 31, 2020
@bergzand
Member

This doesn't add any advantages since all threads have the same priority

The only advantage (as far as I know) shows up when multiple interfaces use DMA-based drivers that allow a context switch during the transfer. Multiple interfaces could then block on I/O-bound operations at the same time. That's not possible with a single thread under the current SPI API. For the other threads (ipv6 and udp) this is not an issue because they are CPU-intensive, not I/O-intensive.

This is one of the reasons why I'm very curious how performance is affected with this PR versus the RAM saved in a multi-interface scenario. I'm not sure if we actually have SPI driver implementations that allow for a thread switch during the transfer.

I don't see much of a difference between options 1 and 2, except that they are different names for the same thing. I prefer an explicit thread for gnrc: it allows for more flexible IPC (messages, events, flags) and I think it makes debugging easier, though that last point is more a gut feeling than an actual argument.

@PeterKietzmann
Member

The only advantage (as far as I know) shows up when multiple interfaces are using DMA based interfaces

Is this a use case, or, do we have that kind of interface in RIOT nowadays?

I'm not sure if we actually have SPI driver implementation that allow for a thread switch during the transfer.

Then, for SPI-based interfaces there would be no difference between the single- and multi-threaded solutions, do I understand you correctly?

@benpicco
Contributor

benpicco commented Apr 3, 2020

Is this a use case, or, do we have that kind of interface in RIOT nowadays?

IMHO polling based SPI is a kludge until someone provides a proper implementation.
We shouldn't make design decisions that rule out benefits from DMA or Interrupt based SPI transfers.

@bergzand
Member

bergzand commented Apr 3, 2020

The only advantage (as far as I know) shows up when multiple interfaces are using DMA based interfaces

Is this a use case, or, do we have that kind of interface in RIOT nowadays?

I've done this for the nrf52 devices in a branch (still have to clean it up and PR it). There, the thread initiating the transfer blocks on a mutex while the transfer is busy and is released as soon as the transfer is done. This way the SPI API does not have to be changed, but the CPU is free to handle different threads (or sleep) during the transfer.

I'm not sure if we actually have SPI driver implementation that allow for a thread switch during the transfer.

Then, for SPI based interfaces there would be no difference between the single- vs multi-threaded solution, do I get you correctly?

What I was trying to say here is that currently none of the SPI implementations block on a mutex or another thread-sync mechanism; they spin while the SPI peripheral is busy. However, this is purely an implementation limitation, not an API limitation.

For now I don't expect a difference in performance and I'm aware that right now this is a non-issue due to the SPI periph driver implementation. However as this is still in a design phase, judging by the RFC label, I'd say this is the point in time to look ahead and explore possible limitations of the approach. Of course I don't mean to block this issue on this, but I want to have this mentioned so that we can actively make a decision on whether to take this future limitation into account or not.

@jia200x
Member Author

jia200x commented Apr 3, 2020

IMHO polling based SPI is a kludge until someone provides a proper implementation.
We shouldn't make design decisions that rule out benefits from DMA or Interrupt based SPI transfers.

Option 2 (gnrc_netif agnostic to the source of events) doesn't rule out the use of DMA- or interrupt-based SPI transfers. The implementor of the Bottom-Half processor would deal with this.

I'm aware this pattern could be done with multiple threads, but IMHO it's overkill to have one full thread per network interface.

@bergzand
Member

bergzand commented Apr 3, 2020

Bottom-Half processor

I'm not familiar with this term, care to elaborate?

I'm aware this pattern could be done with multiple thread, but IMHO it's overkill to have one full thread per network interface.

That's what we're discussing here, right? Whether it makes sense to decrease the number of threads and what the trade-offs are. That's why I mentioned the possible (future) performance impact before refactoring to a different approach.

@kaspar030
Contributor

IMHO polling based SPI is a kludge until someone provides a proper implementation.
We shouldn't make design decisions that rule out benefits from DMA or Interrupt based SPI transfers.

Polling based SPI implementation might be "a kludge until someone provides a proper implementation", but having a blocking API is IMO a sound decision. We don't have any language support for asynchronous APIs, so they would get messy. And providing a blocking implementation is easy, providing an asynchronous one is more difficult (and thus less likely to be available for any given hardware).

I'd prefer fully async-capable periph APIs, especially as it is trivial to wrap a blocking API on top. But I don't think building this is trivial (meaning, this will not happen for a long time).

@jia200x
Member Author

jia200x commented Apr 3, 2020

I'm not familiar with this term, care to elaborate?

Bottom-Half processing is basically the ISR offloading mechanism: it's the component that calls the device's IRQ handler from thread context rather than from the interrupt itself. We currently use the gnrc_netif thread as the Bottom-Half processor.

That's what we're discussing here right? Whether it makes sense to decrease the number of threads and what the trade offs are. That's why I mentioned the possible (future) performance impact, before refactoring to a different approach.

Just to clarify, the performance impact (which I recognize can occur) is not related to gnrc_netif itself but to the fact that we hardcode the Bottom-Half processor to the gnrc_netif thread.

If we make gnrc_netif agnostic to the source of events, we don't assume a particular input architecture (e.g. a queue-based architecture vs. a single-thread processor). Both cases require the same "input" to the network stack but a different Bottom-Half processor.

Beyond that, all network stacks in master could reuse the same Bottom-Half processor to subscribe to network device events.

@miri64 miri64 added the Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation label Jul 6, 2020
@miri64 miri64 added this to the Release 2020.07 milestone Jul 6, 2020
@jia200x
Member Author

jia200x commented Sep 18, 2020

During the summit's PHY/MAC rework breakout session, there was rough consensus that we probably don't want to change the way GNRC defines interfaces, since it was designed to be flexible. Having one thread for all interfaces could affect flexibility.

If there are no more comments here, I would simply close this issue.

@miri64
Member

miri64 commented Sep 18, 2020

Was there a consensus for that? IMHO the consensus was to not enforce this, but I think there is still value (especially memory-wise) to allow for grouping GNRC's netifs to dedicated threads.

@benpicco
Contributor

As far as I understand devices with multiple interfaces are pretty rare, so there is probably not much to gain here in optimizing for that.

@jia200x
Member Author

jia200x commented Feb 2, 2021

As far as I understand devices with multiple interfaces are pretty rare, so there is probably not much to gain here in optimizing for that.

Doesn't the at86rf215 expose 2 interfaces? Or is it only one but 2 threads for the radio?

@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Mar 2, 2022
@stale stale bot closed this as completed Apr 18, 2022
@miri64
Member

miri64 commented Apr 18, 2022

How is this different from #10496?

@jia200x
Member Author

jia200x commented Apr 19, 2022

Yes, it's basically the same.

@miri64 miri64 added the State: duplicate State: The issue/PR is a duplicate of another issue/PR label Apr 19, 2022