ros2 bag play - start/pause play next interaction by MichaelOrlov · Pull Request #680 · ros2/rosbag2

MichaelOrlov · 2021-03-16T07:30:57Z

Added new API for play/pause, change playback ratio and play next in pause.
Added key press handling for newly added APIs.
Key press handling implemented for POSIX compatible OS like Linux.
play/pause = "space bar", increase playback speed = "cursor up", decrease playback speed = "cursor down", play next message = "cursor forward".
Made Player class publicly available to be able use it in other applications and directly in tests.
Add small "bugfix"/workaround (add delay after playback) found on CI at Apex.AI related to the issue when publishers created in player class got destroyed before they sent last messages on the wire. The rational for adding this workaround in this PR is that it's small (1-2 lines) and all other logic depends from it in this PR.
Made some cleanup and major overhaul in test_play_timing.cpp

Special consideration and pay attention was made to keep original performance and deterministic playback in runtime i.e. avoiding locking mutexes and busy loops in runtime phase of the playback. Also keeping original concept for avoiding accumulation of errors in playback time.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

…f playback. Without delay publishers could be destroyed before they will sent last message on wire. Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

…lay next Bloc Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

…ming to play. Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Karsten1987 · 2021-03-16T18:28:34Z

thanks @MichaelOrlov for the PR. While I appreciate to have such a feature-rich PR, I believe we should break these down in individual PRs. It's great that you've found a bug for the publisher clean up, I believe this should be contributed anyhow independent from the pause/resume feature.

We've discussed on #443 (comment) that at least the keyboard handling shall be dealt with external to the player. Ideally, in the ros2bag python package to ease the cross-platform keyboard handling.
Also, we were playing with the idea of featuring a rclcpp_lifecycle Node, which implicitly provides services and further command line tools integration for handling a start/stop functionality.

Can you elaborate why the player API has to be public? You mentioned you wanted to use this in other applications. What are these applications?

MichaelOrlov · 2021-03-16T20:43:56Z

Hi @Karsten1987.
When you mention about breaking this down on several PSs. What divisions you had in mind?

As regards to the small bugfix/workaround with delay at the end of playback. It's literally one line of code and one line of comment to it with explanation why. I can do a separate PR for it if you want. Please let me know.

As regards to the proposal to moving keyboard handling outside of the player in python script. In theory it looks reasonable and easy way. On practice we have extra layer Rosbag2Transport::play() which we are calling from python script. Keyboard handling imply interaction with 2 threads running in parallel doing this from python will require to use Player class directly without Rosbag2Transport layer and deal with multi-threading in python - it will cause major redesign in python script and Rosbag2Transport.
With the giving design of the Rosbag2 hierarchy and class layers, handling keyboard pressing inside Rosbag2Transport::play() is the only option. BTW. I've spent some effort to implement it in a good/safe way.
I would propose in future transform Rosbag2Transport class and all his member functions to the namespace and make API self contained without any constructors/destructors and init/fini methods. If you will look on the API, all methods considered as independent entities and doesn't imply to be dependent from each other, like Play and Record. Also propose to rename file rosbag2_transport.cpp to the rosbag2_infrastructure.cpp because it's better describes what it's holding and doing inside. But this is for separate PRs and activities.
So well, while we are looking for the best cross-platform design with keyboard handling, people struggling without this essential functionality. I would suggest to accept current implementation with keyboard handling and improve/extend it in future.

As regards to the idea of featuring a rclcpp_lifecycle Node, which implicitly provides services and further command line tools integration for handling a start/stop functionality.
It might be ok if you are planing to have only play/pause API. And I think this is not going to work when you will add possibility to change playback rate in pause and "play next" in pause.
Consider the case:

Rosbag recorded messages publishing on some topic with inter-message interval in 100 mSec.
Replaying this recorded messages with original speed and going in pause mode.
Increasing playback speed 2x.
Playing a few messages manually while in pause via play-next message API.
Resume playing messages.

After resuming, unplayed messages shall start playing with 2x speed i.e. 50 mSec interval and time spend in pause should be taken in to account to avoid playing messages without required interval. The complication arise that you can't just simply sleep for 50 mSec in playback routine. Because it will cause accumulation of errors in time delay. Need to respect start time and time at what time particular message was recorded relative to the start of the recording and time spend in pause and also different playback speed.
I don't think that it's possible to implement this scenario with rclcpp_lifecycle Node. At least in a good way without spaghetti style in design and code.
My implementation successfully handle all these complicated cases and playing messages deterministically in time domain.

As regards to your question:

Can you elaborate why the player API has to be public?

We are at Apex.AI considering to use it in Rosbag Coordinator application to be able deterministically replay complicated scenarios with high bandwidth from multiple nodes and perhaps in future port of rosbag rqt.

emersonknapp · 2021-03-17T17:59:59Z

When you mention about breaking this down on several PSs. What divisions you had in mind?

You've listed 5 independent changes in your PR description - this sounds like 5 PRs.

EDIT: actually, based on point 1 - which says "play/pause, change playback ratio, and play next" - this itself could even be 3 PRs introducing the features one at a time. I'd also like to find out how this approach works with the proposal in #675 . It is an important feature set, I think we need to have a consensus on the design before just merging new code

As regards to the small bugfix/workaround with delay at the end of playback. It's literally one line of code and one line of comment to it with explanation why. I can do a separate PR for it if you want. Please let me know.

One single line of code may look small, but it needs independent verification, and one wrong line is all it takes to make a program work incorrectly. If it is in a separate PR/commit - there is a chain of review and git blame, and is individually able to be reverted.

MichaelOrlov · 2021-03-17T21:03:38Z

Hi @emersonknapp and @Karsten1987 as per initial issue (feature request) initiated by you ros2 bag play - start/pause interaction #55

Like in ROS 1, there should be a chance to control the replaying of a bag file.
spacebar should toggle the start/pause replaying.
s should solely spin the bag once.

Generally, as for now, there is no status feedback given when replaying a bag file. As a first step, there should be a message >printed, whether the bag file is currently publishing data or not.

You listed "play/pause, play next, and keyboard handling" in this feature and we already have had interface for change playback ratio on start. I only implemented runtime API for it to be able to changed it from keyboard.

I've implemented everything what you have been asking for in this feature.
And now you are trying to consider these changes as a separate features and forcing me to split them?

Perhaps I will be able to split keyboard handling and bugfix with delay to the separate PRs. As a consequences they will have dependencies in the order of merging them.
But please don't ask me to split play/pause, play-next and changing playback ratio. They all have dependencies from each other and tightly coupled together.

When I was developing/upstreaming this feature I came across to the situation that I can't proceed without those bugfix and I have to add it on the early stage. Then I found out that I can't proceed without refactoring testing code. Otherwise it would be even more spaghetti-sh. I made it and it became backed in changes with dependencies in this PR.

As regards to the

I'd also like to find out how this approach works with the proposal in #675 .

Long story short:
It's perfectly align to each other. #675 could be implemented above my changes and I don't see any possible issues with it. I tried to keep original concept of playing messages. And implementation of "Rosbag2 time, driven by external clock source" feature above my changes will not be even harder if you would implement it without my changes. It's just different features for different use cases.

Ok. A little bit longer story and details about #675.
We have

These are the time situations to handle:

Steady Time - Rosbag2 keeps time internally with reference to a monotonic system clock, and neither publishes or subscribes to /clock

This is the default behavior

Rosbag2 acts as ROS Time Source by publishing to the /clock topic (a simple extension of case 1)

Chosen by --clock option to ros2 bag play

Rosbag2 is driven by an external ROS Time Source - most commonly the Gazebo simulator

In this case, Rosbag2 cannot use "time-control" features presented above, since it is a passive consumer of time

Chosen by --use-sim-time argument to ros2 bag play

We mostly concern about item 3. "Rosbag2 is driven by an external ROS Time Source - most commonly the Gazebo simulator" - right?

Though we are going to change how time flowing on low level. My implementation will live above it and will not know that something got changed. If someone will change slow down time flow twice. My playback rate will be relative to it. All logic will be valid and will be working as nothing got changed. Even pause-resume will be working the same way.

And here are the reasons for this:
Following statement will be still true after my changes:

Notes on current implementation (as of writing this design)

rosbag2_transport::Player currently uses std::chrono::system_clock to query time, and std::this_thread::sleep_until to wait between publishing messages, with explicit handling of playback rate.

And proposal:

Pass a PlayerClock instance to rosbag2_transport::Player

Use PlayerClock::now to query starting time

Use PlayerClock::sleep_until between messages

rclcpp::Clock does not yet implement sleep_until - as noted in Add sleep_for and sleep_until for rcl_clock_t as per "Clock and Time" design rcl#898

Is till a valid good to go plan with addition that it will need to use PlayerClock::now instead of std::chrono::steady_clock::now(); in a few more places. It's literal two more replacements in one Player::pause_resume() method.

To summarize:
I think these two approaches for play/pause and changing playback speed is for two different use cases and purposes and should be both present in Rosbag2.
My implementation giving direct access to these features via exposing API in Player class and intend to be used in applications which want to use Player class and corresponding library directly and as opposed feature "Rosbag2 is driven by an external ROS Time Source" described in #675 is mostly for simulator and remote synchronization.

One more factor to consider is that my approach is ready to be used now. Approach from #675 is only planing to be implemented and have some dependencies and have high uncertainty when it will be ready (may be year(s) after).

@emersonknapp and @Karsten1987 Giving that details above I would like to ask you to take a look on my changes and give concrete proposals what need to change and how to split them on different PRs.

Hope on understanding.
I appreciate your time and hope you will do the same for me.

emersonknapp · 2021-03-17T21:42:59Z

We mostly concern about item 3. "Rosbag2 is driven by an external ROS Time Source - most commonly the Gazebo simulator" - right?

No. #675 considers that as just one of the cases being handled, but it's not the primary concern of the design. That design is concerned with gracefully handling all three cases. But it is largely focused on the "rosbag2 controls time" scenario of cases 1 and 2 - determining how "pause and resume time", "change rate of playback", and "seek to a specific time" work.

In case 3, I am proposing that the user cannot do time control on the rosbag2 playback. The ROS Time Source is external, so the playback is completely driven by the ROS Time Source. There may be other features that can happen in this case, such as "mute playback" - where messages are not published until rosbag2 is "un-muted", but this is not the same feature as "pause/resume".

I think these two approaches for play/pause and changing playback speed is for two different use cases and purposes and should be both present in Rosbag2.

Based on what I have said above, I don't see this to be the case. I think you're right that implementing this functionality in the Player is complex with a lot of edge cases - and I think that the PlayerClock design reduces and manages that complexity by approaching the problem in a different way.

Perhaps I am missing some use cases as to why these features would need to exist side by side, please let me know if that's the case (and that is useful input to a design document). The only thing I see from these features living side by side is this:

"Gazebo controls time, but rosbag2 wants to play back in sync, but at a different rate. For example, Gazebo is simulating at 2x speed, and we want rosbag2 to play back at double the simulation speed (effectively 4x, but looks like 2x) - pausing and resuming when the simulation does so."

And even on considering the above, the PlayerClock could handle this case (if it is wanted) by having a rate multiplier on top of using the external ROS Time Source.

MichaelOrlov · 2021-03-18T22:01:00Z

Hi @emersonknapp Thank you for expressing your opinion and decision in timely manner.
We need to discuss it internally in Apex.AI first.
I will come back to it a little bit latter on.

MichaelOrlov · 2021-04-26T20:33:05Z

Closing this PR as wouldn't merge, since the same functionality will be implemented in the scope of the time control feature #696

MichaelOrlov added 13 commits March 16, 2021 00:40

Add runtime API for playback rate

9dfd008

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add pause-resume and play next API to the Rosbag2 player

5d56a15

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add delay after playback as workaround for lost messages at the end o…

9d4e88d

…f playback. Without delay publishers could be destroyed before they will sent last message on wire. Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Respect playback time and speed after pause and play next

dc0fb3b

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add key press handling for the Rosbag2 player on Linux/POSIX OS

b93e122

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Reduce execution time and general overhaul in test_play_timing.cpp

43c4574

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add play_pause_resume_respect_play_time test

43730b1

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Make Player play next() synchronous and return true on success

4fd6b13

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add play_pause_resume_respect_play_time test

b0a1081

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Use steady clock instead of system_clock

6cace90

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Rename playback_time to the playback_duartion

29850b0

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add time correction for pause begin time

e06a52d

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Add deterministic replay after pause-replay with changing speed and p…

4465b39

…lay next Bloc Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

MichaelOrlov force-pushed the add_pause-resume_play-next_to_player branch from 773eb49 to 4465b39 Compare March 16, 2021 07:41

MichaelOrlov marked this pull request as ready for review March 16, 2021 08:29

MichaelOrlov requested a review from a team as a code owner March 16, 2021 08:29

MichaelOrlov requested review from Karsten1987 and jaisontj and removed request for a team March 16, 2021 08:29

MichaelOrlov mentioned this pull request Mar 16, 2021

Pause/resume ros2 bag play with keyboard controls #55

Closed

4 tasks

Return true from pause_resume() when put on pause and false when resu…

dd24609

…ming to play. Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

MichaelOrlov force-pushed the add_pause-resume_play-next_to_player branch from f467933 to dd24609 Compare March 16, 2021 08:33

MichaelOrlov closed this Apr 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ros2 bag play - start/pause play next interaction#680

ros2 bag play - start/pause play next interaction#680
MichaelOrlov wants to merge 14 commits intoros2:masterfrom
MichaelOrlov:add_pause-resume_play-next_to_player

MichaelOrlov commented Mar 16, 2021 •

edited

Loading

Uh oh!

Karsten1987 commented Mar 16, 2021

Uh oh!

MichaelOrlov commented Mar 16, 2021 •

edited

Loading

Uh oh!

emersonknapp commented Mar 17, 2021 •

edited

Loading

Uh oh!

MichaelOrlov commented Mar 17, 2021

Notes on current implementation (as of writing this design)

Uh oh!

emersonknapp commented Mar 17, 2021 •

edited

Loading

Uh oh!

MichaelOrlov commented Mar 18, 2021

Uh oh!

MichaelOrlov commented Apr 26, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

MichaelOrlov commented Mar 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Karsten1987 commented Mar 16, 2021

Uh oh!

MichaelOrlov commented Mar 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

emersonknapp commented Mar 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichaelOrlov commented Mar 17, 2021

Notes on current implementation (as of writing this design)

Uh oh!

emersonknapp commented Mar 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichaelOrlov commented Mar 18, 2021

Uh oh!

MichaelOrlov commented Apr 26, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

MichaelOrlov commented Mar 16, 2021 •

edited

Loading

MichaelOrlov commented Mar 16, 2021 •

edited

Loading

emersonknapp commented Mar 17, 2021 •

edited

Loading

emersonknapp commented Mar 17, 2021 •

edited

Loading