Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider Node to Participant mapping #250

Merged
merged 11 commits into from
Jun 11, 2020

Conversation

ivanpauno
Copy link
Member

Adding a document analyzing the drawbacks of enforcing a one-to-one mapping between ROS Nodes and DDS Participants and proposing alternative approaches.

@ivanpauno ivanpauno added the in review Waiting for review (Kanban column) label Aug 9, 2019
@ivanpauno ivanpauno self-assigned this Aug 9, 2019
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
Copy link
Member

@kyrofa kyrofa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts on the security front.

articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
@wjwwood wjwwood mentioned this pull request Aug 16, 2019
34 tasks
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
@ivanpauno
Copy link
Member Author

ivanpauno commented Aug 26, 2019

We had a meeting with @artivis @hidmic @kyrofa @dirk-thomas @ruffsl @wjwwood and @drastx about how DDS Security API would look like after changing Node to Participant mapping to one Participant per Context.

Here are the meeting notes. Please, correct me if something is wrong or missing.

Items

  • DDS security can be configured at Participant level
    • How to pass a key to each Participant?
    • How to specify control access policies?

Notes

  • We intend to keep harnessing DDS security support.
  • If the node to participant mapping becomes many to one, control policies apply to a set of nodes.
    • Are these on the same process? Yes, a participant cannot expand process boundaries.
  • If context to participant becomes one to one, multiple participants may coexist in the same process. These may not share the same access control policies BUT do share the address space. It also means that full intra-process communication, which currently occurs on a per context basis, may be precluded by having multiple participants.
  • If control policies are specified on a per node basis and combined as fragments of a single control policy for a given process that composes several nodes:
    • How are collisions handled?
      • An error is shown when the file of a Node tries to add a permit and another is denying it.
    • How are policies combined? A preprocessor may help.
  • Participant control policies cannot be changed dynamically, only upon construction/configuration. It may be possible to teardown participants and bring them back up to allow for it.
    • Is it even currently possible? Node lifecycles could aid on this regard.

Proposals

  • Restriction: SROS2 and dynamic node composition DO NOT mesh together (dynamic change of the control access policies is not allowed, it would be possible to dynamically load nodes if the initial access control policies are permissive enough).
  • Introduce the concept of a context in security profile files.
    • What is context in paths to nodes’ security governance files?
      • Context can get a name, and fallback to “_default” or “” if none is provided.
  • As a first step, we could only accept one access control policy file per Context. Combining policies of different nodes could be done in a second step.

@drastx
Copy link

drastx commented Aug 26, 2019

@ivanpauno, my github id is drastx

@mikaelarguedas
Copy link
Member

That looks like a great plan!

I have a few questions to help me picture how this will all fit together:

If context to participant becomes one to one, multiple participants may coexist in the same process. These may not share the same access control policies BUT do share the address space. It also means that full intra-process communication, which currently occurs on a per context basis, may be precluded by having multiple participants.

Is it possible that context would span multiple participants in the future?
If so a way to set and refer to participant names instead of context name would be valuable.

Introduce the concept of a context in security governance files.

Could you clarify what this would mean in terms of governance file's content?

Context can get a name, and fallback to “_default” or “” if none is provided.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

Then I assume that rcl would retrieve the context name, the corresponding security directory and pass it down to rmw to set up the participant's security plugins, is that right?

As a first step, we could only accept one access control policy file per Context. Combining policies of different nodes could be done in a second step.

This would be a great next step 👍. As @kyrofa pointed out in the past, the node author is the most knowledgeable person to list all the resources the node needs access to, and offloading to each integrator to figure out the permissions of each node in the system can quickly result in a lot of extra work for each user.

@ruffsl
Copy link
Member

ruffsl commented Aug 28, 2019

Could you clarify what this would mean in terms of governance file's content?

In the notes, I think @ivanpauno is confusing policy/permissions/governances that we discussed. We worked towards an idea where we'll add an grouping of multiple profiles by say including a <context> tag just above or perhaps bellow the <profiles> tag, so that during templating multiple profiles from nodes that share the same context will be combined/flattened into a single corresponding permission/governance file for the one participant.

Is it possible that context would span multiple participants in the future?
If so a way to set and refer to participant names instead of context name would be valuable.

I haven't formed an opinion on that, but it seems to me that if context would span multiple participants, the participant could just load the same permissions/governance. I'm not sure what boundary context would have outside of DDS rmw, but I think we'd like to use the process boundary as the line, given the implications of secure memory access.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

I think the launch file level, as that is where the composition of nodes is configured/declared, yes?

Then I assume that rcl would retrieve the context name, the corresponding security directory and pass it down to rmw to set up the participant's security plugins, is that right?

On the money 💵 . I think we'll have to expand upon the keystore file tree structure, similar to the policy document tree I mentioned with the tags above, so that paths to contexts are resolvable. I'm not sure how that all should be nested, say in conjunction with nodes not composed, but we should open a separate issues for these details. Perhaps it makes sense to always declare a context for consistency, but I'm not sure how that should fold with node name uniqueness?

As a first step, we could only accept one access control policy file per Context. Combining policies of different nodes could be done in a second step.

Again, I think this per context is applicable to access control permission/governace files, but not (SROS2) policy files, as we'd likely want a document tree that can reflect intersection between multiple contexts. As composed node interactions are likely not going to be limited to only nodes of a common context, we linkly want to verify the correctness/completeness such orchestrations.

@ivanpauno
Copy link
Member Author

In the notes, I think @ivanpauno is confusing policy/permissions/governances that we discussed. We worked towards an idea where we'll add an grouping of multiple profiles by say including a tag just above or perhaps bellow the tag, so that during templating multiple profiles from nodes that share the same context will be combined/flattened into a single corresponding permission/governance file for the one participant.

Sorry for the error. I've edited the original comment.

Is it possible that context would span multiple participants in the future?
If so a way to set and refer to participant names instead of context name would be valuable.

I think we don't want to introduce the concept of Participant to ROS, which is something DDS specific. So, each Context will map one to one with a Participant.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

IMO, in code and with the possibility of remapping it.
Usually, you don't need to explicitly create a Context. In that case, the default name would be used, and you could remap it from the launch file (or when executing it from the command line).

Again, I think this per context is applicable to access control permission/governace files, but not (SROS2) policy files, as we'd likely want a document tree that can reflect intersection between multiple contexts. As composed node interactions are likely not going to be limited to only nodes of a common context, we linkly want to verify the correctness/completeness such orchestrations.

Explicitly or implicitly constructed, a Context is always there. With the ability of remapping the Context name (regardless if it was implicitly or explicitly constructed), I think the orchestration can be verified.

@mikaelarguedas
Copy link
Member

Great thanks for the insighful answers.

I didn't mean to bring implementation details here, I just wanted to make sure there will be an easy way at the launchfile level to remap and group nodes under a specific context name to be able to set permissions without having to touch / compile any code. And that seem to be the case 👍

Each `Participant` could store in its user data, the list of node names that it owns.
When this data is changed, each `ParticipantListener` will be notified.
This is not a good option, as `UserData` is just a sequence of bytes.
Organizing a complex message in it won't be easy nor performant.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, would it not be possible to reuse the proposed IDL for encoding the ROS discovery info to serialize it into the UserData paload of the discovery packet, so that users wouldn't have seperate discovery layers that act outside the middlewares own QoS/security settings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's definitely an option.
The only problem, is that that will require the user data to be sent again when modified. Though that's supported in DDS, I don't see support of that in some implementations (particularly, I don't think FastRTPS support that).

There was another way of communicating this information, using userData from the Participant, DataWriter and DataReader, that didn't require them to change (the document mentions using Publisher/Subscriber groupData, but that can be replaced with DataWriter/DataReader userData). Again, the limitation in this case was the lack of support in FastRTPS for DataWriter/DataReader userData (or Publisher/Subscribe GroupData).

The solution that I like the most, is the one combining Participant/DataWriter/DataReader userData, because information doesn't need to be updated and is just provided once. In this case, we could also take advantage of using a ROS message and serializing it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the limitation in this case was the lack of support in FastRTPS for DataWriter/DataReader userData (or Publisher/Subscribe GroupData).

Is there an open issue on this we could track, or do we need to open a new one?

The solution that I like the most, is the one combining Participant/DataWriter/DataReader userData, because information doesn't need to be updated and is just provided once.

Just to clarify, for every new DDS DataReader/DataWriter pair that a new node would add to a DDS participant, that pair could broadcast UserData that would include the ROS level discovery info for that respective node? If a node subscribes to the same topic as another node in the same context, is a new DataReader still added to the participant? Similarly for publishing ad DataWriters as well?

I guess if the two nodes have different QoS settings for the same topic, then such might need to be the case. But if the QoS setting are the same, would separate DataReader/DataWriters still be necessary for other reasons like ownership of messages in the message queue history for callbacks.

I'm just trying to figure out if the UserData from DDS DataWriter/DataReader would be unique to a node, or from a collection of nodes that share that DataWriter/DataReader instance by virtue of being in the same DDS participant or ROS context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an open issue on this we could track, or do we need to open a new one?

It's commented in the code that there's no support for DataWriter/DataReader userData/groupData. You can open a ticket to ask for it to be added.
In the case of Participant userData there's no API allowing to modify it after the Participant gets created. A ticket could be opened too.

Just to clarify, for every new DDS DataReader/DataWriter pair that a new node would add to a DDS participant, that pair could broadcast UserData that would include the ROS level discovery info for that respective node?

Yes.

If a node subscribes to the same topic as another node in the same context, is a new DataReader still added to the participant? Similarly for publishing ad DataWriters as well?

Yes.

I'm just trying to figure out if the UserData from DDS DataWriter/DataReader would be unique to a node, or from a collection of nodes that share that DataWriter/DataReader instance by virtue of being in the same DDS participant or ROS context.

Currently, most implementations are creating a DDS Publisher/Subscriber for each ROS Publisher/Subscription. Some are just creating a DDS Publisher/Subscriber, and creating one DataWriter/DataReader per ROS Publisher/Subscription. The last solution is probably the ideal one.


I will correct I previous mistake I made.
The idea to combining Participant/Endpoint userData doesn't avoid it to be needed to be updated. Particularly, Participant userData would still be needed to be updated:

  • Participant userData would be a list of node names/namespace + context name: Updated when a new node is created/destroyed.
  • DataWriter/DataReader userData is the node name and namespace of the creator. There's no need to resend it.

The advantage of this combination is that less information has to be resent, and "messages" (userData content) is simpler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea to combining Participant/Endpoint userData doesn't avoid it to be needed to be updated. Particularly, Participant userData would still be needed to be updated:

  • Participant userData would be a list of node names/namespace + context name: Updated when a new node is created/destroyed.
  • DataWriter/DataReader userData is the node name and namespace of the creator. There's no need to resend it.

Can all of these out-of-band info be encrypted ? if not which ones can't be?
I'm trying to evaluate if we would end up leaking system information in a similar manner to ros2/sros2#172 (and maybe leaking topic info as well)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can all of these out-of-band info be encrypted ? if not which ones can't be?

I guess DDS build-in topics are all encrypted if you're using security (including userData info), I would be extremely surprised if not.


After reading ros2/sros2#172, it seems that that information is not being encrypted.
I'm not completely sure if that's a bug in fastrtps or the DDS security specification doesn't cover that. But this info could (and should) be encrypted.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a node subscribes to the same topic as another node in the same context, is a new DataReader still added to the participant? Similarly for publishing ad DataWriters as well?

I guess if the two nodes have different QoS settings for the same topic, then such might need to be the case. > But if the QoS setting are the same, would separate DataReader/DataWriters still be necessary for other reasons like ownership of messages in the message queue history for callbacks.

The DDS interface pretty much requires a DataReader for each ROS2 subscription, regardless of QoS settings, because you take the data. And in the current ROS2 model, publishers can't share DataWriters because the publisher GID is accessible to the subscribers. (If it weren't, for obvious reasons it'd still only be the keep-all writers that one could combine.)

Naturally one could multiplex everything on a single reader/writer by re-implementing a ton of things and putting additional information in the generated IDL, &c. — but that would be a bit silly.

I guess DDS build-in topics are all encrypted if you're using security (including userData info), I would be extremely surprised if not.

After reading ros2/sros2#172, it seems that that information is not being encrypted.
I'm not completely sure if that's a bug in fastrtps or the DDS security specification doesn't cover that. But this info could (and should) be encrypted.

From the way I read the specs, participant discovery data must always be sent in the clear because it is what bootstraps the protocol, and this unfortunately includes the “user data” QoS. Reader and writer discovery is encrypted (when required by the governance file) and that also protects the topic/group/user data fields. (Checked it with Cyclone DDS; @ruffsl, @vmayoral, re ros2/sros2#172, I’d be interested in knowing via which path the type information was leaked, I can’t find any type name in a capture.)

The idea to combining Participant/Endpoint userData doesn't avoid it to be needed to be updated. Particularly, Participant userData would still be needed to be updated:

  • Participant userData would be a list of node names/namespace + context name: Updated when a new node is created/destroyed.
  • DataWriter/DataReader userData is the node name and namespace of the creator. There's no need to resend it.

The advantage of this combination is that less information has to be resent, and "messages" (userData content) is simpler.

It seems to me there is an alternative to modifying the participant user data (although it is spec’d feature and pretty widely supported by DDS implementations): simply infer the nodes from the reader/writer info. There are a few downsides I can see to this, but all seem rather minor:

  • Getting a list of nodes directly from DDS discovery info requires iterating over all readers & writers. Maintaining a better index is easy — and the PR on rmw_dds_common already implements that, right?
  • A ROS2 node without readers or writers would not be visible in the network. It won’t participate either, so its being invisible doesn’t seem like a big deal.
  • If a single process/context/DDS participant may have multiple instantiations of the same ROS2 node, merely putting the node names in the reader/writer info is not enough. That can solved by adding a unique node identifier.

You could put this info in the “user data” QoS of each reader/writer, or you could require that no DDS Publisher/Subscriber be used for more than one ROS2 node and put it in the “group data” QoS of the publisher/subscriber. Each reader/writer’d effectively end up distributing a copy of it anyway in the discovery, but I’d say it is slightly more elegant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naturally one could multiplex everything on a single reader/writer by re-implementing a ton of things and putting additional information in the generated IDL, &c. — but that would be a bit silly.

100% agreed.

From the way I read the specs, participant discovery data must always be sent in the clear because it is what bootstraps the protocol, and this unfortunately includes the “user data” QoS

That's good to know.

I’d be interested in knowing via which path the type information was leaked, I can’t find any type name in a capture.

It was leaked via Participant userData.

Getting a list of nodes directly from DDS discovery info requires iterating over all readers & writers. Maintaining a better index is easy — and the PR on rmw_dds_common already implements that, right?

To clarify, the current PR uses a topic where this data is published.
This is discussing one of the alternatives.

To answer the question: No, the list of nodes would be available from the userData field of all Participants (which would be a list of node names and namespaces).

i.e.:
Participant user data is used to share a list of nodes, from where you can then print a node list.
Reader/Writer userData is used to specify which node created it.

A ROS2 node without readers or writers would not be visible in the network. It won’t participate either, so its being invisible doesn’t seem like a big deal.

No, same as above.

If a single process/context/DDS participant may have multiple instantiations of the same ROS2 node, merely putting the node names in the reader/writer info is not enough. That can solved by adding a unique node identifier.

Not exactly. If each context checks that there's not repeated node name (which can be easily done), then from both the Participant guid and node name you have an uniquely identified node (though adding a unique node identifier sounds pretty reasonable).

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/improving-ros-2-cpu-by-simplifying-the-ros-2-layers/13808/1

@dirk-thomas
Copy link
Member

@ivanpauno Can this be closed? (If there is anything pending it might be better to summarize that in a new ticket.)

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-foxy-fitzroy-released/14495/1

@ivanpauno
Copy link
Member Author

@ivanpauno Can this be closed? (If there is anything pending it might be better to summarize that in a new ticket.)

Closed or merged?
I was planning to do some modifications, as the current description isn't supper accurate.

If there's a desire to have a rendered version of this documentation together with the release, feel free to merge and I will follow-up in another PR.

@dirk-thomas
Copy link
Member

Closed or merged?

Certainly merged.

I was planning to do some modifications, as the current description isn't supper accurate.

👍

If there's a desire to have a rendered version of this documentation together with the release, feel free to merge and I will follow-up in another PR.

Just depends on the time frame. It can wait until next week. If it takes you longer please feel free to merge this and follow up with a separate PR.

Signed-off-by: Ivan Santiago Paunovic <[email protected]>
@ivanpauno
Copy link
Member Author

@hidmic I've cleaned up the document, can you take another look?

That information would be enough to satisfy ros2 required graph API.
This mechanism will also avoid to have a topic where all nodes need to have write and read access.

This alternative wasn't implemented because of lack of support in some of the currently supported DDS vendors.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember when I started the implementation of this, some of the user data qos weren't implemented in FastRtps.

It seems that now they are, @richiware can you clarify? If it's supported now, I want to make that clear in the document.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I missed the mention.
Yes, FastDDS supports Participant/Reader/Writer UserData QoS.

Copy link
Contributor

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanpauno I left a bunch of comments here and there, mostly related to phrasing. Also, I noticed we use a variety of styles for the same terms (e.g. capitalized, lowercase, uppercase, literal, non-literal). Consider standardizing that.

articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
Signed-off-by: Ivan Santiago Paunovic <[email protected]>
@ivanpauno
Copy link
Member Author

@hidmic PTAL, I think I've addressed all your comments.

@ivanpauno
Copy link
Member Author

@hidmic friendly ping

articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
Signed-off-by: Ivan Santiago Paunovic <[email protected]>
@ivanpauno
Copy link
Member Author

@hidmic all comments have been addressed, PTAL

articles/151_node_to_participant_mapping.md Outdated Show resolved Hide resolved
Signed-off-by: Ivan Santiago Paunovic <[email protected]>
@ivanpauno ivanpauno merged commit c1b9943 into gh-pages Jun 11, 2020
@delete-merged-branch delete-merged-branch bot deleted the ivanpauno/node-to-participant-mapping branch June 11, 2020 13:30
mlanting pushed a commit to mlanting/design that referenced this pull request Aug 17, 2020
mlanting pushed a commit to mlanting/design that referenced this pull request Aug 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in review Waiting for review (Kanban column)
Projects
None yet
Development

Successfully merging this pull request may close these issues.