Conversation
fujitatomoya
left a comment
There was a problem hiding this comment.
@eboasson thanks for the PR. although i do not understand every single details, i did what i could do for review. just a few minor comments, could you check them?
lgtm with green CI.
|
And a big thanks for reviewing @fujitatomoya ! |
|
@fujitatomoya I have been struggling a bit with getting good CI runs. What I would like to do is two runs:
So far I have only tried this with passed to A lot of my attempts failed because of unrelated problems (e.g., https://ci.ros2.org/job/ci_linux/26903/console where I don't know what the problem is, and https://ci.ros2.org/job/ci_linux/26876/console where it failed because I didn't realise I should pass the same What worked:
Given that Linux CI with Cyclone What didn't work: Windows with master fails to build because it sets It looks to me like this is an Iceoryx version that is not really fully compatible with Windows yet. The "old" Cyclone used the Iceoryx C binding, the "new" one has a plugin written in C++. (The Cyclone CI doesn't build the Iceoryx plugin on Windows, perhaps I can reproduce it and fix it.) I googled and there doesn't seem to be a way in So it is looking quite like one expects and the problems with Cyclone + Windows on CI can be deferred because there is no immediate need to update Cyclone, but I do suspect you'd like to a see a little more green in the CI ... |
|
@eboasson thanks for sharing the situation. about colcon build and test configuration.
those should build the all dependent packages above rmw_cyclonedds_cpp, and tests all the pacakges with colcon.
unfortunately i also see this unstable behavior in CI.
you mean, enable the @claraberendsen @cottsay @mjcarroll @christophebedard any thoughts?
both Iceoryx and CycloneDDS define i may be missing some context here, please correct me if i got anything wrong. |
|
With this PR and Cyclone
It looks like we get the same/expected interoperability errors as before, the changed return code in The With this PR and Cyclone Both CI runs are with
|
I don't think ci.ros2.org has a good way to pass arbitrary environment variables to builds. You could absolutely do this with an environment hook from a package, though. If you know which tests you'd like to have the variable, you could also modify the tests specifically to export the additional variable. |
Needs cyclonedds later than 0.10.
Cyclone doesn't do WStrings, ROS2 does and so those we can't translate. Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
There are currently two serializers: - one in TypeSupport_impl.hpp - one in Serializer.cpp The first one is the one originally used in the Cyclone RMW layer and walks the ROS message definition directly. The second one was built to speed up serialization and operates on a data structure derived from the ROS message definition. For several years, the second serializer has been used. The deserializer never got rewritten, leading to a very messy situation. This introduces a new deserializer written to match the serializer actually being used.
Cyclone is fine without key hashes and generating them is optional (except for the one security configuration that mandates key hashes), and so leaving key hash generation out is attractive as it keeps things simple. Unfortunately, it turns out that the Fast-DDS version used by ROS2 late 2025 (commit 33bb952a0e80e2b158571cb2d64ed6b2003609db on the 3.2.x branch) requires them. This adds the code to generate them, Cyclone's existing Internal/GenerateKeyHash setting can be used to force the generation.
cpplint complains about a .h file being included with a path, but . is forbidden. It doesn't mind a .hpp file, so now we have a C header file with a .hpp suffix.
This fixes
test_rmw_implementation.
TestPublisherUseLoan.
borrow_loaned_message_with_bad_arguments
There isn't necessarily an exception to throw. This throws std::bad_alloc instead.
|
@fujitatomoya I force-pushed it to fix the merge conflict because it was so trivial: My CI run for |
|
Thanks @cottsay ! And @fujitatomoya, of course 🙂 I have done the trick with the hooks to set The linux and RHEL failures are unrelated, the Linux aarch64 one is as I would expect (the wide strings incompatibility is deliberate, see the description of this PR). This simply provides evidence for my claim that the settings I mentioned help workaround bugs in FastDDS. As they are workarounds for other people's bugs, I don't think they should be made a (semi-)permanent feature of Cyclone or its RMW layer. Do you agree that the results look good for merging this PR? Once we merge this, we can decide when to update the used Cyclone version as well. (Once put a tag on it, but the paperwork for that has been done and there is no obstacle to do so.) |
|
Had a lot of CI worker turbulence in the last fortnight. Just re-triggered a CI run of 18035. It might take a bit to complete because there is a large backlog of workers. |
|
Pulls: #550 |
|
This pull request has been mentioned on Open Robotics Discourse. There might be relevant details there: https://discourse.openrobotics.org/t/ros-pmc-minutes-for-februrary-10-2026/52550/1 |
|
For the windows builds, I would l like to add here that this PR in combination with the latest master of cyclone dds, solves a lot of the build warning we had with MSVC 2022. The ROS PMC would like to transition from MSVC 2019 to MSVC 2022 but when we build that in the test ci jobs, we saw over 1800 warnings. When we build it (with other fixes) together with cyclone-dds master and this PR, it got reduced to only 37 warnings. Builds of this PR:
We would very much benefit from this PR being merged AND having a brand new release of cyclone dds (or have rolling point to master at least) |
|
We also need the key feature introduced in this PR. |
|
@trittsv Shortage of maintainers all over the place. You are more than welcome to join us and help out with bug fixes and PR reviews. |
| { | ||
| auto u = reinterpret_cast<uint16_t *>(x); | ||
| *u = static_cast<uint16_t>((*u >> 8) | (*u << 8)); | ||
| } |
There was a problem hiding this comment.
This looks like Undefined Behavior
The correct implementation is
| { | |
| auto u = reinterpret_cast<uint16_t *>(x); | |
| *u = static_cast<uint16_t>((*u >> 8) | (*u << 8)); | |
| } | |
| { | |
| uint16_t tmp; | |
| memcpy(&tmp, x, sizeof(tmp)); | |
| tmp = ((tmp >> 8) | (tmp << 8)); | |
| memcpy(x, &tmp, sizeof(tmp)); | |
| } |
Note, the compiler will completely optimize the memcopy out, and the result will be as efficient as the pointer version. The big difference here is that you don't mess up the life time tracking of the compiler.
Here is a godbolt example were you can see the resulting assembler code for both versions.
https://godbolt.org/z/ra9roYE9W
C++ Weekly episode explaining this also in depth
https://youtu.be/L06nbZXD2D0
|
@eboasson I am working on a optimized and bug fixed version of this, can you create some performance tests for the serialization part ? |
|
I did not finish the rework of the PR today, I'll try to look further into it on monday. As a general remark, there is a lot of reinterpret cast all this PR, which is undefined behavior and need to be fixed. I already fixed most of the stuff in the Serialization.cpp |
|
@eboasson |
Thank you! I'm looking at it and I'm trying to make some simple serialization performance test as well. It should at least be possible to verify that in some simple cases the "trivially serializable" optimizations do happen, but verifying this in an automated test seems somewhat tricky.
Thank you for that, too. I think often it ends up not being UB, based on:
and for type-accessibility:
The serialized representation is (or at least should be) an array of Outside the serializer there are indeed also a bunch, but I think those are cases where |
Description
This rather massive PR:
The first two are really mostly the old (draft) PR #501, but with a few more fixes.
The final two are because the situation was a mess, albeit one that worked. It basically required no attention for a long time and was left to rot for a long time. Adding support for keys forced me to work on it, and I decided to try and reduce the mess. It is still far from perfect, but I kept running into not being allowed to do partial template specialization when trying to unify the serializer and the deserializer ... and not liking C++ to begin with, gave up on that.
Is this user-facing behavior change?
Not really, except for adding support for keys and providing type information in DDS. Cyclone moved ahead quite a bit since 0.10.x but it retains backwards compatibility for configuration.
Those changes do include things that should improve the user experience in many cases. In particular on networking and discovery options it improves things quite a bit. For example, auto-detecting whether multicast on loopback actually works, being able to do multicast discovery over loopback while restricting discovery over ethernet to unicast, tooling improvements, moving Iceoryx/Iceoryx2 support into plugins so there are fewer problems in building things.
In my understanding, the paperwork required by the Eclipse Foundation for a release is sorted. So tagging it for a new release is finally possible.
Did you use Generative AI?
No.
Additional Information
So far, really only tested by running tests on macOS (thanks https://github.com/IOES-Lab/ROS2_Jazzy_MacOS_Native_AppleSilicon !). The Cyclone code seems to work fine, but the compatibility with Fast-DDS is terrible. I've done enough digging to say that the problem is with Fast-DDS.
If you want it to work with Fast-DDS, and you Fast-DDS hasn't been yet fixed ... there are some workarounds possible:
Internal/GenerateKeyHashtotrue, because Fast-DDS crashes if it doesn't receive the optional DDSI KeyHash for keyed topics. (I probably would've skipped generating key hashes in this PR if it weren't for this.)Compatibility/IgnoreTypeInformationto1.15