Use reliable QoS for ros2topic tests #555

Merged
cottsay merged 1 commit into master from reliable_tests
Jul 31, 2020


@cottsay
Member

@cottsay cottsay commented Jul 30, 2020

The TestROS2TopicCLI tests perform feature testing of the ros2topic command line interface. If the system is under stress during these tests, messages may be lost (by design). If that happens, there is a fairly high likelihood that the test_topic_pub_once test will fail because there is only one opportunity for the message to be successfully transported. We're likely dropping other messages in this suite, but the other tests continuously publish until one of the messages is received (or a timeout occurs), making them significantly less likely to fail.
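To make the difference between the one-shot test and the continuously publishing tests concrete, here is a rough back-of-the-envelope sketch. It assumes each publish attempt is lost independently with the same probability, which is a simplification and not something measured on the CI machines; the function name `delivery_probability` and the 20% loss rate are made up for illustration.

```python
def delivery_probability(loss_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent sends arrives.

    Assumes every attempt is dropped independently with probability
    `loss_rate` (a simplifying assumption, not a model of real DDS traffic).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # All attempts must be lost for the subscriber to see nothing.
    return 1.0 - loss_rate ** attempts


# Hypothetical numbers: 20% loss under stress.
once = delivery_probability(0.2, attempts=1)       # like a single 'pub --once'
repeated = delivery_probability(0.2, attempts=10)  # like publishing in a loop
print(f"one attempt:  {once:.6f}")
print(f"ten attempts: {repeated:.6f}")
```

With a single attempt the failure rate equals the loss rate, while repeated publishing drives it toward zero very quickly, which matches the observation that only `test_topic_pub_once` is noticeably flaky.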

Using the 'reliable' setting for QoS reliability seems to make the tests consistently pass, even when the system is placed under additional stress.
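As a toy illustration of why 'reliable' helps the single-message case, the sketch below contrasts a fire-and-forget send with a send that retries until delivery is confirmed. This is only a stand-in for the idea: the real middleware is considerably more involved, and the names `best_effort_send`, `reliable_send`, and `lossy_channel` are hypothetical helpers, not part of ros2cli or rclpy.

```python
import random
from typing import Callable

# A channel is modelled as a callable that returns True when the
# message got through and False when it was dropped.
Channel = Callable[[], bool]


def lossy_channel(loss_rate: float, seed: int) -> Channel:
    """Build a channel that drops messages with probability `loss_rate`."""
    rng = random.Random(seed)
    return lambda: rng.random() >= loss_rate


def best_effort_send(channel: Channel) -> bool:
    """Send exactly once; a dropped message is never retried."""
    return channel()


def reliable_send(channel: Channel, max_attempts: int = 100) -> bool:
    """Resend until delivery is confirmed or `max_attempts` is exhausted."""
    for _ in range(max_attempts):
        if channel():
            return True
    return False


# Hypothetical stress scenario: 30% loss, 1000 single-message trials each.
trials = 1000
best_effort_ok = sum(best_effort_send(lossy_channel(0.3, seed)) for seed in range(trials))
reliable_ok = sum(reliable_send(lossy_channel(0.3, seed)) for seed in range(trials))
print(f"best effort delivered {best_effort_ok}/{trials}")
print(f"reliable delivered    {reliable_ok}/{trials}")
```

The retry loop costs extra sends whenever a message is dropped, so the reliable variant trades some additional traffic for a much higher chance that a one-off message arrives.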

Closes #552

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status
  • Linux-repeated Build Status

Signed-off-by: Scott K Logan <logans@cottsay.net>
@cottsay cottsay added the bug Something isn't working label Jul 30, 2020
@cottsay cottsay self-assigned this Jul 30, 2020
@cottsay cottsay requested a review from hidmic July 30, 2020 21:29
Collaborator

@fujitatomoya fujitatomoya left a comment

LGTM

@cottsay cottsay merged commit 674791c into master Jul 31, 2020
@delete-merged-branch delete-merged-branch Bot deleted the reliable_tests branch July 31, 2020 17:15
@dirk-thomas
Member

@cottsay Not sure if this is a coincidence, but in the past the dev / PR jobs have been mostly green. As of this change they are reliably failing on multiple pending PRs.

@cottsay
Member Author

cottsay commented Jul 31, 2020

Maybe using reliable QoS pushes the system load a little bit further, causing other tests to suffer in the same way these did?

I can't see how these changes could have directly affected any other tests.

sloretz pushed a commit that referenced this pull request Sep 10, 2020
Signed-off-by: Scott K Logan <logans@cottsay.net>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
sloretz added a commit that referenced this pull request Sep 10, 2020
…nsient_local and longer keep-alive for pub tests (#546) Use reliable QoS for ros2topic tests (#555) (#565)

* add --keep-alive option to 'topic pub' (#544)

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Give kwarg default value for backporting

keep_alive is made a keyword argument with a default value of 0.1 so the
pull request can be backported.

Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* use transient_local and longer keep-alive for pub tests (#546)

* use transient_local and longer keep-alive for pub tests

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

* add comment to document unit of --keep-alive

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

* Use reliable QoS for ros2topic tests (#555)

The TestROS2TopicCLI tests perform feature testing of the ros2topic
command line interface. If the system is under stress during these
tests, messages may be lost (by design). If that happens, there is a
fairly high likelihood that the test_topic_pub_once test will fail
because there is only one opportunity for the message to be successfully
transported. We're likely dropping other messages in this suite, but the
other tests continuously publish until one of the messages is received
(or a timeout occurs), making them significantly less likely to fail.

Using the 'reliable' setting for QoS reliability seems to make the tests
consistently pass, even when the system is placed under additional
stress.

Signed-off-by: Scott K Logan <logans@cottsay.net>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

Co-authored-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
Co-authored-by: Scott K Logan <logans@cottsay.net>

Development

Successfully merging this pull request may close these issues.

'test_topic_pub_once' test flaky
