[JTC] Fix race condition & interpolation bug #410

mechwiz · 2022-08-10T13:36:59Z

This PR addresses a few bugs:

Race condition - Ideally, when a new goal comes in from the action server, a single update cycle should see both a new new_external_msg when reading from traj_msg_external_point_ptr_.readFromRT() and an active goal when reading from *rt_active_goal_.readFromRT(). However, on a project I'm working on I noticed that after a new goal comes in, traj_msg_external_point_ptr_.readFromRT() sometimes still returns the old trajectory for one cycle (most likely because the new one is still being written to the RT thread from the non-RT thread). Because *rt_active_goal_.readFromRT() is called later within the same update cycle, it already reports an active goal. The JTC then returns success immediately, since the trajectory still cached within the JTC has indeed been completed (now for a second time).

See the log output below. I added the "new message received" log to mark when new_external_msg within the update callback had actually changed. That log should always occur between the logs for "Goal request accepted!" and "Goal reached, success!", but here it occurs after both. As another sanity check, the time difference between "Goal request accepted!" and "Goal reached, success!" is very small (under 1 ms; about 0.7 ms in this log), implying that the goal returned success immediately.

[ros2_control_node-9] [INFO] [1656865754.199093651] [RightArm.position_traj_controller]: Received new action goal
[ros2_control_node-9] [INFO] [1656865754.199622192] [RightArm.position_traj_controller]: Accepted new action goal
[move_group-1] [INFO] [1656865754.199809730] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: /RightArm/position_traj_controller started execution
[move_group-1] [INFO] [1656865754.199831511] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Goal request accepted!
[ros2_control_node-9] [INFO] [1656865754.200516294] [RightArm.position_traj_controller]: Goal reached, success!
[ros2_control_node-9] [WARN] [1656865754.203480867] [RightArm.position_traj_controller]: new message received
[move_group-1] [INFO] [1656865754.204146901] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller '/RightArm/position_traj_controller' successfully finished
[move_group-1] [INFO] [1656865754.212873413] [moveit_ros.trajectory_execution_manager]: Completed trajectory execution with status SUCCEEDED ...

This PR therefore moves the read of *rt_active_goal_.readFromRT() to the same point in the update cycle as the read of traj_msg_external_point_ptr_.readFromRT(). I think this makes sense in general, since it's probably good practice to query all relevant values from the RT thread at the same point in time. With this change, the problem did not come up again when testing my project.
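
To illustrate why the read order matters, here is a minimal single-threaded sketch. It does not use the real realtime buffers: `SharedState`, `CycleView`, `read_spread_out`, `read_together`, and `demonstrate_read_order` are hypothetical names, and the `between` callback is an assumed stand-in for the non-RT thread publishing a new goal partway through an update cycle.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the data shared between the non-RT and RT threads.
struct SharedState
{
  int trajectory_id = 1;     // which trajectory message is visible to the RT side
  bool goal_active = false;  // whether an active goal handle is visible
};

// What one update cycle ends up working with.
struct CycleView
{
  int trajectory_id;
  bool goal_active;
};

using NonRtActivity = std::function<void(SharedState &)>;

// Old ordering: the trajectory is read first and the goal status later in the
// cycle, so non-RT activity in between can produce a mixed view.
CycleView read_spread_out(SharedState & state, const NonRtActivity & between)
{
  CycleView view{};
  view.trajectory_id = state.trajectory_id;
  between(state);
  view.goal_active = state.goal_active;
  return view;
}

// New ordering: both values are read back to back at the start of the cycle.
CycleView read_together(SharedState & state, const NonRtActivity & between)
{
  const CycleView view{state.trajectory_id, state.goal_active};
  between(state);
  return view;
}

// Walks through the scenario from the log: a new goal lands mid-cycle.
void demonstrate_read_order()
{
  const NonRtActivity new_goal_lands = [](SharedState & s) {
    s.trajectory_id = 2;
    s.goal_active = true;
  };

  SharedState a;
  const CycleView mixed = read_spread_out(a, new_goal_lands);
  // Stale (already completed) trajectory paired with an active goal:
  // the combination that leads to an immediate "success".
  assert(mixed.trajectory_id == 1 && mixed.goal_active);

  SharedState b;
  const CycleView consistent = read_together(b, new_goal_lands);
  // Old trajectory and no active goal: nothing to report this cycle.
  assert(consistent.trajectory_id == 1 && !consistent.goal_active);
}
```

In the real controller the two values are still written separately by the non-RT thread, so reading them back to back narrows the window for a mixed view rather than removing it entirely.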

Because *rt_active_goal_.readFromRT() is now queried earlier in the update callback than it originally was, I noticed that some of the JTC tests were failing. It turned out that after a new goal came in, new_external_msg had been updated but active_goal had not yet been (though it would have been if it had been read later in the same cycle). To handle this, I added a check for a mismatch between the active goal status queried earlier in the code and the status queried later in the code. On a mismatch, the code checks whether new_external_msg was also updated that cycle. If new_external_msg was not updated that cycle and is still the same as current_external_msg (i.e. the old one), we skip that cycle. Otherwise we continue the cycle, since the active_goal status was already refreshed by the second check.
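
The decision described above can be sketched as a small pure function. This is only an illustration of the logic as described in this comment, not the code in the PR: `CycleAction`, `resolve_goal_mismatch`, and the three boolean parameters are hypothetical names.

```cpp
#include <cassert>

enum class CycleAction { Proceed, Skip };

// active_at_start:    goal status read together with the trajectory at the
//                     top of the cycle
// active_later:       goal status read again later in the same cycle
// new_msg_this_cycle: whether new_external_msg differs from
//                     current_external_msg in this cycle
CycleAction resolve_goal_mismatch(
  bool active_at_start, bool active_later, bool new_msg_this_cycle)
{
  if (active_at_start == active_later) {
    return CycleAction::Proceed;  // both reads agree, nothing to reconcile
  }
  if (!new_msg_this_cycle) {
    return CycleAction::Skip;  // goal changed but the trajectory is still the old one
  }
  return CycleAction::Proceed;  // trajectory and goal status both changed this cycle
}

// The three cases from the description.
void demonstrate_resolve_goal_mismatch()
{
  // No mismatch: carry on as usual.
  assert(resolve_goal_mismatch(true, true, false) == CycleAction::Proceed);
  // Goal became active mid-cycle, trajectory not yet updated: wait one cycle.
  assert(resolve_goal_mismatch(false, true, false) == CycleAction::Skip);
  // Goal became active mid-cycle and the new trajectory is already visible.
  assert(resolve_goal_mismatch(false, true, true) == CycleAction::Proceed);
}
```

Skipping a single cycle costs one control period, but it avoids evaluating goal tolerances against a trajectory that belongs to the previous goal.
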

Interpolation bug - There is an interpolation bug that occurs in the following case:

Open loop control (so the last_commanded_state_ is prepended when a new trajectory comes in)
Continuous joint on hardware that returns values between -PI and PI and doesn't wrap around (i.e. does not give feedback above PI or below -PI)

This can lead to cases where the commanded state is -PI but the feedback the joint gives is PI due to encoder resolution. A new trajectory is usually based on the feedback (so PI in this case). When that trajectory is prepended with the last commanded state, the JTC ends up trying to interpolate from -PI to PI very quickly, causing faults on the hardware. This happens because interpolation does not currently take the shortest angle into account. This PR fixes this.
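
As a rough illustration of the difference, the sketch below compares plain linear interpolation with interpolation along the shortest arc. It assumes simple linear interpolation between two joint angles in radians, which is not necessarily the interpolation scheme the JTC uses. `kPi`, `shortest_angular_distance`, `interpolate_naive`, and `interpolate_shortest` are names chosen for this example, not taken from the PR.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Smallest signed rotation that takes `from` to `to`, in [-pi, pi].
double shortest_angular_distance(double from, double to)
{
  assert(std::isfinite(from) && std::isfinite(to));
  double diff = std::fmod(to - from, 2.0 * kPi);  // now in (-2*pi, 2*pi)
  if (diff > kPi) {
    diff -= 2.0 * kPi;
  } else if (diff < -kPi) {
    diff += 2.0 * kPi;
  }
  return diff;
}

// Plain linear interpolation: treats the angles as ordinary numbers.
double interpolate_naive(double from, double to, double alpha)
{
  return from + alpha * (to - from);
}

// Linear interpolation along the shortest arc. The result is not re-wrapped,
// so it can fall outside [-pi, pi].
double interpolate_shortest(double from, double to, double alpha)
{
  return from + alpha * shortest_angular_distance(from, to);
}
```

With `from = -kPi` and `to = kPi`, `interpolate_naive` passes through 0 at the halfway point, i.e. it commands a full turn. `interpolate_shortest` stays at `-kPi` because both angles describe the same physical position.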

mechwiz · 2022-12-15T21:01:47Z

@AndyZe @JafarAbdi

bmagyar · 2023-05-01T20:43:16Z

Closing in favour of #565. If you can recreate the issue with a test after the above is merged, feel free to open a new PR.

github-actions bot requested review from TomoyaFujita2016, VX792, aprotyas, bmagyar, destogl, kasiceo and livanov93 August 10, 2022 13:37

mechwiz force-pushed the mwiz/fix_jtc_bugs branch from 775d073 to 87d43a7 Compare August 10, 2022 13:57

mechwiz changed the title ~~[JTC] Fix race condition & interpolation bug, and re-enable tests~~ [JTC] Fix race condition & interpolation bug Aug 10, 2022

mechwiz force-pushed the mwiz/fix_jtc_bugs branch from 5f8e8a4 to e309281 Compare November 23, 2022 22:46

Michael Wiznitzer added 5 commits December 15, 2022 13:40

use shortest angle when interpolating

7030e75

fix race condition and enable tests

8baff6e

add size check

b160948

keep tests disabled

5429597

clang

85e1b18

mechwiz force-pushed the mwiz/fix_jtc_bugs branch from e309281 to 85e1b18 Compare December 15, 2022 18:40

handle edge case more elegantly

1c07c20

mechwiz closed this Jan 4, 2023

mechwiz reopened this Jan 4, 2023

github-actions bot requested review from anfemosa, duringhof, jaron-l and progtologist and removed request for TomoyaFujita2016 and kasiceo January 4, 2023 17:11

mechwiz mentioned this pull request Apr 10, 2023

Fix JTC from immediately returning success #565

Merged

bmagyar closed this May 1, 2023
