Skip to content

Conversation

iv461
Copy link
Contributor

@iv461 iv461 commented Sep 15, 2025

Description

Fixes #2951

Is this user-facing behavior change?

When users publish custom message types using type adapters, they must now publish by unique_ptr if the custom message type overloads operator new. This is to prevent new/delete mismatch.

Did you use Generative AI?

No

Additional Information

Tested, and correctly fails to compile if I try to publish a pcl::PointCloud by value AND I'm pulling the dependency on PCL via pcl_ros.

According to cppreference, there are 22 overloads of operator new. I only implement detecting the class-specific ones 15 and 17. I do not implement the array-operators (16 and 18) since the detection becomes ambiguous and causes compile error.
I do not think it is reasonable to guard against overloaded global operator new/delete, i.e. overloads 1-10.

Also, overloads 11-14 and 19-22 accept arbitrary arguments and are therefore likely to be undetectable.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although I think this is a right thing to do to avoid undefined behavior by new/delete mismatch when the data object is copied by global allocator, this possibly breaks the application not to be able to compile once this fix is merged. again, i think this is also good to let the user application know the correct approach to publish the data in this particular case. but i would like to have other opinions from other developers.

@fujitatomoya
Copy link
Collaborator

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Sep 17, 2025

rebase

✅ Branch has been successfully rebased

@iv461
Copy link
Contributor Author

iv461 commented Sep 17, 2025

Yes, this is undefined behavior, more speficially it might cause heap corruption and with this random crashes.
Therefore, I think it is reasonable to aim to fix all packages that have this new/delete mismatch bug and now would fail to compile.

But you are right, the problem is that there might be many places that need to be fixed, i.e. many packages that would fail to compile.

To fix the compilation, generally it suffices to replace publish(my_message) with publish(std::make_unique<MyMessage>(my_message)). Since rclcpp would copy the message anyway, this also has no performance impact.

Two ideas:

  • Bump pcl_ros to use C++17 on rolling to prevent pcl::PointCloud from using custom allocators.
  • Make an exception for the pcl::PointCloud type, (i.e. allow it to be published by value) if we can prove that calling the Eigen aligned free function does not cause heap corruption when freeing memory allocated from the regular global new operator. But unfortunately, it looks like it would cause heap corruption.

By the way, I looked at the CI and it failed again due to an apparently unrelated issue in rclcpp_action.

@alsora
Copy link
Collaborator

alsora commented Oct 3, 2025

We discussed this during the client library working group.
Overall we are in favor of this.

If possible it would be better to make it a static warning rather than an assertion, to have a deprecation cycle.
If not possible, we think it would be ok to merge as is.

Ivo Ivanov added 11 commits October 7, 2025 21:42
This reverts commit f1917a35a5037220be2f537031e3ef5b65b5be3e.

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…nstead of an static_assert that would cause a compile error.

Signed-off-by: Ivo Ivanov <[email protected]>
…late specialization. This forces the compiler to evaluate the condition later, not as early as in the lexial stage

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…ept GCC: We need two patterns to avoid struct redeclaration error

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…g: The warning simply does not trigger when called from the rclcpp publish method for some reason

Signed-off-by: Ivo Ivanov <[email protected]>
@iv461
Copy link
Contributor Author

iv461 commented Oct 7, 2025

Thanks for the update. I would also be fine with a deprecation warning instead of a compile error. However, I was unable to to implement a deprecation warning. I tried an approach in df1e782 which generally works, but for some strange reason I can't get it to work inside rclcpp. Therefore, I'll leave it as it is, with the static_assert.

@alsora
Copy link
Collaborator

alsora commented Oct 17, 2025

@iv461 can you share more details about the issue?
Your proposed approach seems good

@iv461
Copy link
Contributor Author

iv461 commented Oct 17, 2025

Hey @alsora, sure, here is what I did: I've implemented an approach that works in isolation as you can see in this godbolt example:
https://godbolt.org/z/9rsdh96ra

This example triggers a deprecation warning with GCC 13.4 — the same compiler I use to build rclcpp.

However, if I put this exact code inside rclcpp, as I did in this commit: df1e782, the warning does not trigger at all. I've ruled out it being a build/setup issue: if I compile the same minimal Godbolt code as a ROS node, the warning triggers. I've also ruled out the possibility that I'm pulling the rclcpp headers from /opt/rclcpp by mistake – this is not the case.
This is really strange. I'm usually not quick to say it's a compiler bug, but in this case it looks like one.

@fujitatomoya
Copy link
Collaborator

Pulls: #2954
Gist: https://gist.githubusercontent.com/fujitatomoya/a307f5c28e8b85ba94b63088b8305b13/raw/7f748fa60a7f1046c747cc15f2c8c022840643b1/ros2.repos
BUILD args: --packages-above-and-dependencies rclcpp
TEST args: --packages-above rclcpp
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/17312

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@ahcorde ahcorde merged commit 8499404 into ros2:rolling Oct 20, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Potential new/delete mismatch when publishing a custom type that has overloaded new/delete operators (e.g. pcl::PointCloud) using type adapters

6 participants