Add support for "critical" subscriptions (message must be sent or will return error) #926
Comments
Not to add another option - I just realized you can also use the "Generic counter" feature of ES to do a similar backpressure concept. Consider this:
If the TO app dies or is killed then it automatically back pressures, as nothing will increment the count. It also has the advantage of being lockless and not requiring any new features of ES or SB. That being said, I do think if the "critical" subscription type/qos is easy enough to implement, it makes sense and seems generic enough to be useful for other items too, so I'm fine with it.
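The counter idea in the comment above can be sketched roughly as follows. The details of the original proposal are not shown in the thread, so this is only one possible reading: the receiver increments a shared counter for each message it ingests, and the sender compares that counter against its own private sent count. All names here (`FlowCounter_t`, `FlowSender_t`, `Flow_TrySend`, `Flow_OnReceive`, the `Window` limit) are hypothetical stand-ins and not part of the ES or SB API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an ES generic counter: one word that only
 * the receiver writes and the sender reads. */
typedef struct
{
    volatile uint32_t Consumed; /* incremented once per message ingested */
} FlowCounter_t;

/* Sender-private bookkeeping (assumed, not from the thread). */
typedef struct
{
    uint32_t Sent;   /* messages sent so far */
    uint32_t Window; /* maximum messages allowed in flight */
} FlowSender_t;

/* Receiver side: acknowledge one ingested message. */
void Flow_OnReceive(FlowCounter_t *Counter)
{
    Counter->Consumed++;
}

/* Sender side: send only while fewer than Window messages are outstanding.
 * Unsigned subtraction keeps the comparison correct across wraparound. */
bool Flow_TrySend(FlowSender_t *Sender, const FlowCounter_t *Counter)
{
    uint32_t InFlight = Sender->Sent - Counter->Consumed;

    if (InFlight >= Sender->Window)
    {
        return false; /* back pressure: receiver has not caught up */
    }

    /* A real sender would transmit the message here. */
    Sender->Sent++;
    return true;
}
```

If the receiver stops running, `Consumed` stops advancing and `Flow_TrySend` starts returning false once the window is used up, which matches the automatic back pressure described above. Each variable has exactly one writer, so no lock is needed in this sketch.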
Would it be reasonable to have a mechanism so that if a critical message is sent, the task for the receiver is prioritized (or even a task switch is initiated)? (I am an RTOS newb, so apologies if this is a bad idea...)
I'll probably change the name of this flag to something like "error if not sent to all", since "critical" is ambiguous. I'd think for your use case it'd be better to have a dedicated pipe that the task pends on, and have the task set as a high priority (and have it just do the high priority work).
Yes, in this case "critical" is not the best word - really just means "provide backpressure to sender based on my ability to ingest this data" ... Not the type of thing one would change priority over. Rather than speeding up the receiver, it will slow down the sender to match what the receiver is capable of. Dynamic reassignment of priority is (almost) never desirable - normally the paradigm should be for the system engineer to set priorities based on how they want the tasks to perform based on their system requirements. If they want to avoid data delays, create a dedicated pipe plus child task with higher priority to handle it.
For the use cases targeted for this change and to keep it simple:
CFE_SB_PassMsgWithPipeReceipt(CFE_MSG_Message_t *MsgPtr) (or whatever you want to call it):
Returns the positive number of buffers available after this message is allocated.
With many subscribers to the same topic, no message is sent to any destination if any destination is full, and the returned buffer count is the lowest one.
Typical use case is one to one, where the publisher generates the message once and then holds it until a pipe buffer is available at the next cycle/iteration.
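A minimal model of the return-value semantics proposed above, assuming each destination pipe is reduced to a depth and a limit. `Receipt_Pipe_t`, `Receipt_Send`, and the two negative status values are hypothetical names for illustration and are not cFE definitions. The sketch also assumes a return of zero when the message takes the last free buffer, which the comment does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; real cFE status values differ. */
#define RECEIPT_NO_SUBSCRIBERS (-1)
#define RECEIPT_QUEUE_FULL     (-2)

/* Simplified destination pipe: current depth and capacity. */
typedef struct
{
    uint32_t Depth;
    uint32_t Limit;
} Receipt_Pipe_t;

/* Deliver one message to every pipe, or to none if any pipe is full.
 * On success, returns the lowest free-buffer count across all pipes
 * after delivery. Assumes limits fit in a signed 32-bit value. */
int32_t Receipt_Send(Receipt_Pipe_t *Pipes, size_t Count)
{
    size_t   i;
    uint32_t Lowest = UINT32_MAX;

    if (Count == 0)
    {
        return RECEIPT_NO_SUBSCRIBERS;
    }

    /* Pass 1: refuse the send if any destination has no room. */
    for (i = 0; i < Count; i++)
    {
        if (Pipes[i].Depth >= Pipes[i].Limit)
        {
            return RECEIPT_QUEUE_FULL;
        }
    }

    /* Pass 2: deliver to all and track the tightest remaining margin. */
    for (i = 0; i < Count; i++)
    {
        uint32_t Free;

        Pipes[i].Depth++;
        Free = Pipes[i].Limit - Pipes[i].Depth;
        if (Free < Lowest)
        {
            Lowest = Free;
        }
    }

    return (int32_t)Lowest;
}
```

A publisher could treat any negative return as "hold the message and retry next cycle" and use the non-negative count to decide how much more to generate.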
Ok, we now just got another option that is different than the original ticket that came from the last discussion (now adding a new API). Do we really need to redesign this again? I thought all we were going to do is set the QOS to indicate it's critical and use the same APIs.
Sorry, I didn't read the whole thread. I'm fine with the QoS approach with return codes added for "No Subscribers" and "Queue full". Is that what was intended? The positive number of pipe buffers left was just a nice to have.
Just had a thought that a "Queue Full" error code is useful especially in the CF case. If TO stops working and CF can't send any PDUs, it can detect this and freeze the channel. (same behavior as loss of comms) Just a thought.
You might see "Queue Full" for a few iterations but it should transition to "No Subscribers" if TO exits. If TO has no downlink does it keep reading the CF pipe? It should not. We should maybe talk about this. Can we assume that SC will freeze CF channel(s) before the end of the contact?
I'm hoping you are only asking for "no subscribers" be returned for a route with the QoS set as critical? Otherwise that's a change in the traditional behavior of send (no subscribers isn't an error for the sending app)... I'd prefer to avoid impacting other apps w/ this change.
Yes. The additional return codes only apply if QoS is set as critical. I too want to avoid impacts to existing code. Your comment brought up an issue. What if one subscriber has QoS set as critical but others don't? What should the behavior be?
I feel like we are repeating discussions :) "Queue full" return was the original proposal, send all or none. QoS needs to be associated with the route to return "No Subscribers", otherwise if the destination specifies it and goes away the route would be non-critical and you'd never get "no subscribers".
Has anyone considered my previous comment about using counters to do this? We could even (potentially) add a feature to the counter such that a task can wait for an increment. I think you can get all the backpressure you need (more effectively, actually) without even changing SB and risking breaking other stuff.
Although I think it is a more elegant solution, I got the impression the request was to reduce coupling with TO. Removing the semaphore in favor of a counter doesn't seem to meet the intent. Although even the initial suggestion requires TO to subscribe with the appropriate QoS, so there is still a dependency...
Current solution to CF flow control/throttling does not require this change. Leaving as a possible enhancement for now if requirements and resources are identified.
Is your feature request related to a problem? Please describe.
Software bus currently returns success even if a message isn't sent to the subscribers (queue full or over message limit). This causes the message to be dropped with no notification for the sender.
This spawned from the CF use case where notification is required to be able to eliminate the semaphore that is currently used for flow control.
Describe the solution you'd like
Add support for a subscription to be "critical". On send, check that all critical destinations have room for the message; if not, don't send to any destinations and return an error. If every critical destination has room, send to all destinations. All done within the SB lock.
For the CF use case, typically the receiver would dedicate a pipe with just that subscription and the individual msg limit check is sufficient (as long as it's smaller than or equal to the queue limit).
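The check-then-send rule described above could look something like the following two-pass model. This is a sketch under assumptions, not the SB implementation: `ModelDest_t`, `Model_Send`, and the status values are hypothetical, each destination is reduced to a depth and a limit, and a non-critical destination that is full is assumed to drop the message silently as it does today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this model only. */
#define MODEL_SUCCESS    0
#define MODEL_QUEUE_FULL (-1)

/* Simplified destination: depth, capacity, and the proposed flag. */
typedef struct
{
    uint32_t Depth;
    uint32_t Limit;
    bool     Critical;
} ModelDest_t;

/* In a real SB both passes would run while holding the SB lock, so no
 * destination can change state between the check and the delivery. */
int32_t Model_Send(ModelDest_t *Dests, size_t Count)
{
    size_t i;

    /* Pass 1: every critical destination must have room. */
    for (i = 0; i < Count; i++)
    {
        if (Dests[i].Critical && Dests[i].Depth >= Dests[i].Limit)
        {
            return MODEL_QUEUE_FULL; /* nothing has been delivered */
        }
    }

    /* Pass 2: deliver. A full non-critical destination misses the
     * message without affecting the return code (assumed behavior). */
    for (i = 0; i < Count; i++)
    {
        if (Dests[i].Depth < Dests[i].Limit)
        {
            Dests[i].Depth++;
        }
    }

    return MODEL_SUCCESS;
}
```

Splitting the work into a check pass and a delivery pass is what makes the send all-or-none for critical destinations: the function never has to undo a partial delivery.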
May make sense to transition QOS to a bitfield (currently an enum), supporting the subscription critical option.
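As a rough illustration of the bitfield idea, QoS options could be expressed as independent flag bits so that "critical" can be combined with other options. The type and flag names below are invented for this sketch and do not reflect the existing cFE QoS definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical QoS flag word; each option occupies its own bit. */
typedef uint8_t ModelQosFlags_t;

#define MODEL_QOS_NONE      ((ModelQosFlags_t)0x00u)
#define MODEL_QOS_CRITICAL  ((ModelQosFlags_t)0x01u) /* fail send if this destination is full */
#define MODEL_QOS_RESERVED1 ((ModelQosFlags_t)0x02u) /* placeholder for a future option */

/* True when the subscription asked for the critical behavior,
 * regardless of which other bits are set. */
bool ModelQos_IsCritical(ModelQosFlags_t Flags)
{
    return (Flags & MODEL_QOS_CRITICAL) != 0u;
}
```

Unlike an enum with mutually exclusive values, flag bits let later options be added without redefining existing ones.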
Describe alternatives you've considered
See #918, #920
Additional context
Discussed that CF should cap work per cycle (avoid free-run if unsubscribed, or no subscribers). Also generate the message once, and retain to send next cycle if there is no room.
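The sender-side behavior discussed above (cap work per cycle, generate each message once, retain it if there is no room) might be structured like this. It is a sketch only: `Cycle_State_t`, `Cycle_Run`, and `Cycle_DemoSink` are hypothetical names, a PDU is reduced to a sequence number, and the send callback stands in for whatever SB call reports that the message was not accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Callback that tries to send one PDU; returns false if it was refused. */
typedef bool (*Cycle_SendFn_t)(uint32_t Pdu, void *Arg);

typedef struct
{
    uint32_t NextPdu;    /* next sequence number to generate */
    uint32_t Pending;    /* PDU generated but not yet accepted */
    bool     HasPending; /* true while Pending is waiting for room */
} Cycle_State_t;

/* Run one cycle: send at most MaxPerCycle PDUs. A refused PDU is kept
 * and retried first on the next cycle instead of being regenerated. */
uint32_t Cycle_Run(Cycle_State_t *St, uint32_t MaxPerCycle, Cycle_SendFn_t Send, void *Arg)
{
    uint32_t Sent = 0;

    while (Sent < MaxPerCycle)
    {
        if (!St->HasPending)
        {
            St->Pending    = St->NextPdu++; /* generate exactly once */
            St->HasPending = true;
        }

        if (!Send(St->Pending, Arg))
        {
            break; /* no room: retain the PDU and stop for this cycle */
        }

        St->HasPending = false;
        Sent++;
    }

    return Sent;
}

/* Example sink for experimentation: Arg points to a count of free slots. */
bool Cycle_DemoSink(uint32_t Pdu, void *Arg)
{
    uint32_t *Room = (uint32_t *)Arg;

    (void)Pdu;
    if (*Room == 0)
    {
        return false;
    }
    (*Room)--;
    return true;
}
```

The `MaxPerCycle` cap bounds the work even when every send succeeds, which addresses the free-run concern when nothing downstream pushes back.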
Requester Info
Jacob Hageman - NASA/GSFC (spawned from splinter on #920)