Contributor

@mechwiz mechwiz commented Aug 10, 2022

This PR addresses a few bugs:

  1. Race condition: Ideally, when a new goal comes in from the action server, a single update cycle should see both a new new_external_msg from traj_msg_external_point_ptr_.readFromRT() and an active goal from *rt_active_goal_.readFromRT(). On a project I'm working on, however, I noticed that after a new goal comes in, traj_msg_external_point_ptr_.readFromRT() sometimes still returns the old trajectory for one cycle (most likely because the new one is still being written to the RT thread from the non-RT thread). Because *rt_active_goal_.readFromRT() is called later in the same update cycle, it already reports an active goal, so the JTC returns success immediately: the trajectory still cached in the JTC has indeed been completed (now for a second time). See the log output below. I added the "new message received" log to mark when new_external_msg actually changes within the update callback. That log should always appear between "Goal request accepted!" and "Goal reached, success!", but here it appears after both. As another sanity check, the time between "Goal request accepted!" and "Goal reached, success!" is tiny (under 1 ms in the log below), implying that the goal returned success immediately.
[ros2_control_node-9] [INFO] [1656865754.199093651] [RightArm.position_traj_controller]: Received new action goal
[ros2_control_node-9] [INFO] [1656865754.199622192] [RightArm.position_traj_controller]: Accepted new action goal
[move_group-1] [INFO] [1656865754.199809730] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: /RightArm/position_traj_controller started execution
[move_group-1] [INFO] [1656865754.199831511] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Goal request accepted!
[ros2_control_node-9] [INFO] [1656865754.200516294] [RightArm.position_traj_controller]: Goal reached, success!
[ros2_control_node-9] [WARN] [1656865754.203480867] [RightArm.position_traj_controller]: new message received
[move_group-1] [INFO] [1656865754.204146901] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller '/RightArm/position_traj_controller' successfully finished
[move_group-1] [INFO] [1656865754.212873413] [moveit_ros.trajectory_execution_manager]: Completed trajectory execution with status SUCCEEDED ...

This PR therefore moves the read of *rt_active_goal_.readFromRT() to the same point in the update cycle as the read of traj_msg_external_point_ptr_.readFromRT(). I think this also makes sense in general, since it's probably good practice to query all relevant values from the RT thread at the same point in time. With this change, the problem did not come up again when testing my project.
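A minimal sketch of the idea, not the actual JTC code: both values are copied out at the top of the update step so the rest of the cycle works from one snapshot. `FakeRtBuffer`, `Trajectory`, `GoalHandle`, `Snapshot`, and `read_snapshot` are hypothetical stand-ins introduced for illustration; the real controller reads from traj_msg_external_point_ptr_ and rt_active_goal_.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a real-time buffer; only the read/write shape matters here.
template <typename T>
class FakeRtBuffer {
public:
  void write(T value) { value_ = std::move(value); }
  const T* readFromRT() const { return &value_; }

private:
  T value_{};
};

struct Trajectory { int id = 0; };  // placeholder for the trajectory message
struct GoalHandle { int id = 0; };  // placeholder for the action goal handle

// Everything the update step needs from the non-RT side, captured together.
struct Snapshot {
  std::shared_ptr<Trajectory> external_msg;
  std::shared_ptr<GoalHandle> active_goal;
};

// Read both buffers back to back at the start of the cycle.
Snapshot read_snapshot(const FakeRtBuffer<std::shared_ptr<Trajectory>>& traj_buffer,
                       const FakeRtBuffer<std::shared_ptr<GoalHandle>>& goal_buffer) {
  return Snapshot{*traj_buffer.readFromRT(), *goal_buffer.readFromRT()};
}
```

Reading both values together narrows the window in which they can disagree, but it does not by itself make the two non-RT writes atomic.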

  2. Because *rt_active_goal_.readFromRT() is now queried earlier in the update callback than before, some of the JTC tests started failing. It turned out that, after a new goal came in, new_external_msg had been updated but active_goal had not yet been (though it would have been if it were read later in the same cycle). To handle this, I added a check for a mismatch between the active goal status read earlier in the cycle and the status read later. If there is a mismatch, we check whether new_external_msg was also updated that cycle. If it was not, and it is still the same as current_external_msg (i.e. the old one), we skip that cycle. Otherwise we continue, since the second read has already given us the up-to-date active goal status.
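The skip decision described above can be sketched as a small predicate. This is an illustration under assumed names (`should_skip_cycle` and its parameters are hypothetical), not the code in this PR.

```cpp
// Hypothetical helper mirroring the check described above.
// goal_active_early:    active goal status read at the start of the cycle
// goal_active_late:     active goal status read again later in the same cycle
// external_msg_changed: true if new_external_msg differs from current_external_msg
bool should_skip_cycle(bool goal_active_early, bool goal_active_late,
                       bool external_msg_changed) {
  const bool status_mismatch = goal_active_early != goal_active_late;
  // Skip only when the goal status changed mid-cycle but the trajectory
  // message is still the old one.
  return status_mismatch && !external_msg_changed;
}
```

For example, `should_skip_cycle(false, true, false)` is true (goal became active, trajectory still stale), while `should_skip_cycle(false, true, true)` is false and the cycle continues.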

  3. There is an interpolation bug that occurs in the following case:

  • Open loop control (so the last_commanded_state_ is prepended when a new trajectory comes in)
  • Continuous joint on hardware that returns values between -PI and PI and doesn't wrap around (i.e. does not give feedback above PI or below -PI)

This can lead to cases where the commanded state is -PI but the feedback from the joint is PI due to encoder resolution. A new trajectory is usually based on the feedback (so PI in this case). When that trajectory is prepended with the last commanded state, the JTC can end up trying to interpolate between -PI and PI very quickly, causing faults on the hardware. This happens because interpolation does not currently take the shortest angle into account. This PR fixes this.
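To illustrate what shortest-angle interpolation means here, below is a minimal self-contained sketch. The helper names (`wrap_to_pi`, `shortest_angular_distance`, `interpolate_continuous`) are assumptions for this example and are not necessarily how this PR implements the fix; the sketch also uses plain linear interpolation for simplicity.

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Hypothetical helper: wrap an angle into [-pi, pi).
double wrap_to_pi(double angle) {
  const double two_pi = 2.0 * kPi;
  double wrapped = std::fmod(angle + kPi, two_pi);
  if (wrapped < 0.0) {
    wrapped += two_pi;
  }
  return wrapped - kPi;
}

// Signed smallest rotation that takes `from` to `to`.
double shortest_angular_distance(double from, double to) {
  return wrap_to_pi(to - from);
}

// Linear interpolation for a continuous joint, alpha in [0, 1].
// Moves along the shortest arc instead of across the whole range.
double interpolate_continuous(double start, double end, double alpha) {
  return wrap_to_pi(start + alpha * shortest_angular_distance(start, end));
}
```

With start = -PI and end = PI, a naive linear interpolation passes through 0 at the halfway point, whereas `interpolate_continuous(-kPi, kPi, 0.5)` stays at the ±PI boundary because the shortest distance between the two angles is (numerically) zero.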

@mechwiz mechwiz force-pushed the mwiz/fix_jtc_bugs branch from 775d073 to 87d43a7 Compare August 10, 2022 13:57
@mechwiz mechwiz changed the title from "[JTC] Fix race condition & interpolation bug, and re-enable tests" to "[JTC] Fix race condition & interpolation bug" Aug 10, 2022
Contributor Author

mechwiz commented Dec 15, 2022

@AndyZe @JafarAbdi

Member

bmagyar commented May 1, 2023

Closing in favour of #565. If you can recreate the issue with a test after the above is merged, feel free to open a new PR.

@bmagyar bmagyar closed this May 1, 2023