Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preparation for E2E latency tuning #786

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

baranovmv
Copy link
Member

In order to be able to tune receiver's latency
relying on timestamp mapping that we get from
RTCP feedback, and UDP::Receive_timestamp,
adding these features:

  • RTCP improvements #674: Use receive timestamp (RTS) as report time when processing RTCP report;

  • RTT dumping for debugging (csvplotter ts_offset branch);

  • SCHED_RR for network io thread (run with root privs).

In order to be able to tune receiver's latency
relying on timestamp mapping that we get from
RTCP feedback, and UDP::Receive_timestamp,
adding these features:

* roc-streaminggh-674: Use receive timestamp (RTS) as report time
  when processing RTCP report;

* RTT dumping for debugging (csvplotter ts_offset branch);

* SCHED_RR for network io thread (run with root privs).
@baranovmv baranovmv requested a review from gavv December 4, 2024 23:10
@github-actions github-actions bot added the ready for review Pull request can be reviewed label Dec 4, 2024
@baranovmv baranovmv force-pushed the feature/rtcp_rts branch 5 times, most recently from c92d155 to bc3295d Compare December 12, 2024 12:58
So that control endpoints on receiver and sender
could mark unmarked packets with a timestamp. By
doing so we avoid necessity of mocking core::timestamp.

This note relates to the pipeline test mostly.
@baranovmv baranovmv force-pushed the feature/rtcp_rts branch 2 times, most recently from a475567 to 5e85289 Compare December 19, 2024 21:41
Copy link

🤖 The latest upstream change made this pull request unmergeable. Please resolve the merge conflicts.

@github-actions github-actions bot added the needs rebase Pull request has conflicts and should be rebased label Jan 16, 2025
Copy link
Member

@gavv gavv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is big.

I like the approach you use for RTS (set early when possible, set later as a fallback).

@@ -57,6 +57,7 @@ bool Thread::enable_realtime() {
errno_to_str(err).c_str());
return false;
}
roc_log(LogDebug, "thread: set realtime priority");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this log before actually calling pthread_setschedparam, so that error will go after, like this:

    roc_log(LogDebug, "thread: setting realtime priority");

    if (int err = pthread_setschedparam(pthread_self(), SCHED_RR, &param)) {
        roc_log(LogError,
                "thread: can't set realtime priority: pthread_setschedparam(): %s",
                errno_to_str(err).c_str());
        return false;
    }

Also let's use LogError instead of LogDebug in the error branch.

@@ -438,13 +438,16 @@ void expect_recv_report(const RecvReport& report,
}
}

packet::PacketPtr read_packet(packet::FifoQueue& source) {
packet::PacketPtr read_packet(packet::FifoQueue& source, core::nanoseconds_t ts = -1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's rename ts to recv_ts (too many timestamps).

roc_log(LogInfo,
"network loop: can't set realtime priority of network thread. May need "
"to be root");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called from NetworkLoop::run() because enable_realtime() enables SCHED_RR for current thread. Ctor is running on main thread.

Also, I think a few more changes are needed.

  1. Add a CLI option (e.g., --rt) that enables real-time priority and set SCHED_RR only if it's provided. SCHED_RR is not desired or expected by all users (especially if you have many things running on your computer), so automatically enabling it when running as root (which is common e.g. when using ALSA) would be surprising.

  2. Ideally CLI option should also accept integer priority which we should use instead of sched_get_priority_max(SCHED_RR). E.g. --rt=99. Because you may want to give different priorities to different tools on system.

  3. If the option was given, but enable_realtime() failed, we should report error and exit. If user asked us to run with SCHED_RR, it would be surprising to silently ignore errors.

    Since enable_realtime() should be called from run(), in ctor we can wait until run() notifies us that it's up and running or it's failed and exited. On error, we'll return error from init_status().

  4. Another thread sensitive to lags is pipeline thread that processes samples and writes them to e.g. ALSA. I think if the option is given, we should enable SCHED_RR there too. We can do it in IoPump ctor.

@@ -51,7 +53,7 @@ struct RttMetrics {
class RttEstimator {
public:
//! Initialize.
RttEstimator(const RttConfig& config);
RttEstimator(core::IArena& arena, const RttConfig& config, dbgio::CsvDumper* dumper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Usual argument order pattern is "...., arena, dumper".

return code;
}
}

if (repair_endpoint_) {
if ((code = repair_endpoint_->pull_packets(current_time)) != status::StatusOK) {
if ((code = repair_endpoint_->pull_packets(0)) != status::StatusOK) {
return code;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as for receiver slot.

return code;
}
}

if (repair_endpoint_) {
if ((code = repair_endpoint_->pull_packets(current_time)) != status::StatusOK) {
if ((code = repair_endpoint_->pull_packets(0)) != status::StatusOK) {
return code;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since current_time is used only as a fallback when RTS isn't set, we can safely pass it to all calls I guess? Because why not to guarantee RTS for source and repair packets as well.

, last_report_ts_(0)
, dumper_(dumper)
, rtt_stats_(arena, 15, 0.5)
, clock_offset_stats_(arena, 100, 0.5) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this constants to be RttConfig fields with default values.

@@ -3818,97 +3832,6 @@
}
}

// Same, but there is a persistent clock drift between sender and receiver
TEST(communicator, rtt_clock_drift) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this test doesn't work due to smoothing of RTT, then we can disable smoothing specifically in this test by setting RttConfig fields (in some comment above I suggested to move smoothing settings to config).

gavv

This comment was marked as duplicate.

@gavv
Copy link
Member

gavv commented Jan 16, 2025

In test_communicator.cpp, there are a several calls to read_packet() which provide timestamp argument, but that makes little sense.

Since it's receive timestamp, it has meaning only when we're "delivering" packet to a communicator (and it should correspond to it's "local" time). But in cases when we're reading packet just to inspect it and throw away, timestamp is not actually used and specifying it looks confusing IMHO.

Diff below removes timestamps that are not needed.

diff --git a/src/tests/roc_rtcp/test_communicator.cpp b/src/tests/roc_rtcp/test_communicator.cpp
index fbe1e0d5..227dd902 100644
--- a/src/tests/roc_rtcp/test_communicator.cpp
+++ b/src/tests/roc_rtcp/test_communicator.cpp
@@ -3107,7 +3107,7 @@ TEST(communicator, report_to_address_sender) {
     CHECK_EQUAL(1, send_comm.total_destinations());
     CHECK_EQUAL(1, send_queue.size());
 
-    pp = read_packet(send_queue, send_time);
+    pp = read_packet(send_queue);
     expect_has_dest_address(pp, send_dest_addr);
     expect_has_orig_ssrc(pp, SendSsrc, true);
     expect_has_dest_ssrc(pp, Recv1Ssrc, true);
@@ -3279,7 +3279,7 @@ TEST(communicator, report_back_sender) {
     CHECK_EQUAL(1, send_comm.total_destinations());
     CHECK_EQUAL(1, send_queue.size());
 
-    pp = read_packet(send_queue, send_time);
+    pp = read_packet(send_queue);
     expect_has_dest_address(pp, recv1_addr);
     expect_has_orig_ssrc(pp, SendSsrc, true);
     expect_has_dest_ssrc(pp, Recv1Ssrc, true);
@@ -3320,13 +3320,13 @@ TEST(communicator, report_back_sender) {
     CHECK_EQUAL(2, send_comm.total_destinations());
     CHECK_EQUAL(2, send_queue.size());
 
-    pp = read_packet(send_queue, send_time);
+    pp = read_packet(send_queue);
     expect_has_dest_address(pp, recv1_addr);
     expect_has_orig_ssrc(pp, SendSsrc, true);
     expect_has_dest_ssrc(pp, Recv1Ssrc, true);
     expect_has_dest_ssrc(pp, Recv2Ssrc, false);
 
-    pp = read_packet(send_queue, send_time);
+    pp = read_packet(send_queue);
     expect_has_dest_address(pp, recv2_addr);
     expect_has_orig_ssrc(pp, SendSsrc, true);
     expect_has_dest_ssrc(pp, Recv1Ssrc, false);
@@ -3423,7 +3423,7 @@ TEST(communicator, report_back_receiver) {
     CHECK_EQUAL(1, recv_comm.total_destinations());
     CHECK_EQUAL(1, recv_queue.size());
 
-    pp = read_packet(recv_queue, send1_time);
+    pp = read_packet(recv_queue);
     expect_has_dest_address(pp, send1_addr);
     expect_has_orig_ssrc(pp, RecvSsrc, true);
     expect_has_dest_ssrc(pp, Send1Ssrc, true);
@@ -3469,13 +3469,13 @@ TEST(communicator, report_back_receiver) {
     CHECK_EQUAL(2, recv_comm.total_destinations());
     CHECK_EQUAL(2, recv_queue.size());
 
-    pp = read_packet(recv_queue, recv_time);
+    pp = read_packet(recv_queue);
     expect_has_dest_address(pp, send1_addr);
     expect_has_orig_ssrc(pp, RecvSsrc, true);
     expect_has_dest_ssrc(pp, Send1Ssrc, true);
     expect_has_dest_ssrc(pp, Send2Ssrc, false);
 
-    pp = read_packet(recv_queue, recv_time);
+    pp = read_packet(recv_queue);
     expect_has_dest_address(pp, send2_addr);
     expect_has_orig_ssrc(pp, RecvSsrc, true);
     expect_has_dest_ssrc(pp, Send1Ssrc, false);
@@ -3606,14 +3606,14 @@ TEST(communicator, report_back_combine_reports) {
     CHECK_EQUAL(2, recv_comm.total_destinations());
     CHECK_EQUAL(2, recv_queue.size());
 
-    pp = read_packet(recv_queue, send3_time);
+    pp = read_packet(recv_queue);
     expect_has_dest_address(pp, send1_addr);
     expect_has_orig_ssrc(pp, RecvSsrc, true);
     expect_has_dest_ssrc(pp, Send1Ssrc, true);
     expect_has_dest_ssrc(pp, Send2Ssrc, true);
     expect_has_dest_ssrc(pp, Send3Ssrc, false);
 
-    pp = read_packet(recv_queue, send3_time);
+    pp = read_packet(recv_queue);
     expect_has_dest_address(pp, send3_addr);
     expect_has_orig_ssrc(pp, RecvSsrc, true);
     expect_has_dest_ssrc(pp, Send1Ssrc, false);
@@ -4269,7 +4269,7 @@ TEST(communicator, rtt_network_reordering) {
                 recv_reorder_countdown = ReorderBurst;
             }
             recv_reorder_countdown--;
-            packet::PacketPtr pp = read_packet(recv_queue, recv_time);
+            packet::PacketPtr pp = read_packet(recv_queue);
             recv_packet_stash.push_back(*pp);
         }

@gavv gavv added needs revision Pull request should be revised by its author and removed ready for review Pull request can be reviewed labels Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs rebase Pull request has conflicts and should be rebased needs revision Pull request should be revised by its author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants