Skip to content

[Galactic] Backport pybind11 changes#803

Merged
sloretz merged 10 commits intogalacticfrom
backport_pybind11_galactic
Aug 11, 2021
Merged

[Galactic] Backport pybind11 changes#803
sloretz merged 10 commits intogalacticfrom
backport_pybind11_galactic

Conversation

@sloretz
Copy link
Copy Markdown
Contributor

@sloretz sloretz commented Jun 23, 2021

This backports the Pybind11 changes that were intended for Galactic but couldn't be merged due to the Galactic freeze. #777

ahcorde and others added 10 commits June 23, 2021 16:17
* Convert Publisher and Subscription to C++ Classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Fixed destructor issues

Signed-off-by: ahcorde <ahcorde@gmail.com>

* [pub/sub 756 addition] use std::shared_ptr holder type and shared_from_this (#757)

* Use std::shared_ptr holder type

Signed-off-by: Shane Loretz <sloretz@openrobotics.org>

* Use shared_from_this for pub/sub

Signed-off-by: Shane Loretz <sloretz@openrobotics.org>

* Make Timer keep Clock wrapper

Signed-off-by: Shane Loretz <sloretz@openrobotics.org>

* Fixed publisher and subscription destructors

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Added feedback

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Fixed docs

Signed-off-by: ahcorde <ahcorde@gmail.com>

Co-authored-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Class object

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Python adjustments

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Adjustments to server and enum

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Adjust destructor calls

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Clear out destructor

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Do what rclcpp did

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Remove iostream (used for debugging)

Signed-off-by: Greg Balke <greg@openrobotics.org>

* Matching upstream changes

Signed-off-by: Greg Balke <greg@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Convert ActionClient to use C++ classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Added feedback

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Added convert_from_py to utils

Signed-off-by: ahcorde <ahcorde@gmail.com>

* improved docs

Signed-off-by: ahcorde <ahcorde@gmail.com>

* nits to adapt the code to pybind11

Signed-off-by: ahcorde <ahcorde@gmail.com>

* throw a exception in convert_from_py

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Fixed docs

Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Convert ActionServer to use C++ Classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Fixed Test_action_server

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Improved docs

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Improved docs and minor fixes

Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Convert WaitSet to use C++ Classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* added feedback

Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Convert Guardcondition to use C++ classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* make destroy guard condition capsule function private

Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* Add GoalEvent back to module

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* self._goal -> self._goal_handle

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Remove unused API

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

Co-authored-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
* convert Node and Context to use C++ Classes

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Added feedback

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Added feedback

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Implement solution 1

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Prevent double early destruction of destroyable

This along with an extra call to `destroy_node()` in `Node.__del__` was
causing a segfault in the test_publisher test. The problem is
`destroy_when_not_in_use()` was being called twice, and the second time
is now a problem because rcl_ptrs_ is dereferenced in the qos_event
destroy() method.

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Remove unnecessary Node.__del__

This should not be required. If it is, it means there's a bug that shows
when the Node is garbage collected.

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Fix order of pointers in RclPtrs structs

The order of the pointers in the RclPtrs structs is important. Things at
the bottom are destroyed first, so children must be placed underneath
the parent's that must be kept alive.

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Intentionally copy rclpy:: objects instead of having RclPtrs structs (#773)

* Rough pass converting everything

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Fixes to get copy destruction order working

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Fix qos_event test flakiness

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: ahcorde <ahcorde@gmail.com>

* MacOS warning

Signed-off-by: ahcorde <ahcorde@gmail.com>

* added feedback and removed some files

Signed-off-by: ahcorde <ahcorde@gmail.com>

* Create only one context otherwise throw a exception

Signed-off-by: ahcorde <ahcorde@gmail.com>

Co-authored-by: Shane Loretz <sloretz@osrfoundation.org>
Co-authored-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
@sloretz sloretz self-assigned this Jun 23, 2021
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Jun 23, 2021

Full CI (repos URL: https://raw.githubusercontent.com/ros2/ros2/galactic/ros2.repos)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Jun 24, 2021

CI LGTM! ros2param.ros2param.test.test_verb_list.test_verb_list and ros2param.ros2param.test.test_verb_load.test_verb_load are known flaky tests - Here's a Galactic nigthly where they failed https://build.ros2.org/view/Gci/job/Gci__nightly-release_ubuntu_focal_amd64/8/

@sloretz sloretz requested a review from ivanpauno July 23, 2021 18:20
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Jul 23, 2021

@ivanpauno May I ask you to double check this PR before I merge it?

Copy link
Copy Markdown
Member

@ivanpauno ivanpauno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only reviewed modifications in .py files, including tests, to detect possible API breaking changes.
I don't have time to look at all the changes in the cpp files, it would be good if someone that originally worked in these changes can take another look to this PR.

I'm okay with the backport if we're not breaking API and CI is green.

Comment on lines -147 to +146
with sub.handle:
node.destroy_subscription(sub)
# handle valid because it's still being used
with sub.handle:
pass
node.destroy_subscription(sub)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this test changed?
is the subscription still valid after the node.destroy_subscription() call?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question applies to the similar changes in this file and others

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was changed to prevent a case where destruction could be blocked forever.
There's a longer description in the body of #739. In short, the subscription is still valid to existing users, but no new users are allowed. This prevents new users extending an object's lifetime indefinitely.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm, I don't understand exactly, I will have to look at the code a bit more.

What does happen with the old code snippet now?
Will it throw an InvalidHandle exception? Will it block?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the inner with sub.handle to raise InvalidHandle.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I understand the change now.
Do we still the Handle class? It seems to have been replaced with shared ptrs and Destroyable (I'm not sure if completely though).

There's a minimal chance this breaks pre-existing code, but I don't think anybody was relying on this behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On master the Handle class is gone, but I'm not sure which PR to backport. The only use I see of it on this branch is in test_handle.py.

Copy link
Copy Markdown
Member

@ivanpauno ivanpauno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with green full CI.

I would also prefer if someone involved in the original work can left an approval.

@hidmic hidmic self-requested a review August 9, 2021 17:06
Copy link
Copy Markdown
Contributor

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed through the PR. It looks good. I left a few comments, nothing to address here.

I do have to ask though: are we absolutely sure we want to backport over ~3000 lines of code into an stable distribution?

try:
if self.__context is not None:
raise RuntimeError
except AttributeError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sloretz nit: I know this is in master, but it's still unusual. Why not set self.__context to None in Context.__init__()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this is strange. I opened #812 to try setting it to None in __init__()

rcl_get_error_string().str);
rcl_reset_error();
}
delete action_client;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sloretz meta: in retrospect, I wonder if we should've used a Python allocator-backed std::allocator for this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah! You mean like the allocator rclpy::Client is using? Looks lt's the only one using it.

rcl_client_ = std::shared_ptr<rcl_client_t>(
PythonAllocator<rcl_client_t>().allocate(1),
[node](rcl_client_t * client)
{
// Intentionally capture node by value so shared_ptr can be transferred to copies
rcl_ret_t ret = rcl_client_fini(client, node.rcl_ptr());
if (RCL_RET_OK != ret) {
// Warning should use line number of the current stack frame
int stack_level = 1;
PyErr_WarnFormat(
PyExc_RuntimeWarning, stack_level, "Failed to fini client: %s",
rcl_get_error_string().str);
rcl_reset_error();
}
PythonAllocator<rcl_client_t>().deallocate(client, 1);
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed.

}
}
PyMem_FREE(context);
rcl_context_.reset();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sloretz meta: one thing that's probably not great is that, if it weren't for the consistent with obj: where obj is a Destroyable, you could easily crash the process by calling APIs after an explicit destroy() -- even in a single threaded setting where there's no risk for concurrent destruction making that with block somewhat unnecessary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah, maybe that with obj: logic could be moved into the APIs themselves. Maybe a DestroyableScopeGuard or something with https://pybind11.readthedocs.io/en/stable/advanced/functions.html#call-guard that did the equivalent of __enter__ and __exit__ could replace that, and get rid of one of the jumps between Python and C++ at the same time.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An implicit call guard would do, yeah.

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Aug 11, 2021

I do have to ask though: are we absolutely sure we want to backport over ~3000 lines of code into an stable distribution?

That's a great question to ask. I think it's important in this case since it enables more backports, and makes the destruction handling of entities consistent.

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Aug 11, 2021

CI (galactic repos file + this branch, build: --packages-above-and-dependencies rclpy test: --packages-above rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Aug 11, 2021

CI LGTM, merging 🎉

@sloretz sloretz merged commit 4233111 into galactic Aug 11, 2021
@delete-merged-branch delete-merged-branch bot deleted the backport_pybind11_galactic branch August 11, 2021 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants