Fix any possible fillBuffer_ race conditions by debouncing all fillBuffers_ #959
Conversation
…ffers_
- Convert all calls to fillBuffer_ to calls to monitorBuffer_
- Rename monitorBuffer_ to monitorBufferTick_, which becomes the 500ms buffer check timer loop
- Make monitorBuffer_ schedule an immediate timer for monitorBufferTick_
 *
 * @private
 */
monitorBuffer_() {
Why do we need this in addition to monitorBufferTick_? It seems that by changing calls from fillBuffer_ to monitorBuffer_, we already get the guard that checks whether the state is READY.
fillBuffer_, if it starts a segment request, will synchronously get to the WAITING state. This means that any subsequent calls to monitorBuffer_ would be safe with the READY guard.
The real difference is in how far in the future things are scheduled. Calling monitorBuffer_ forces the tick to happen next (it schedules or reschedules the timer tick to fire in just one millisecond), whereas monitorBufferTick_ always schedules the next tick to be 500ms in the future.

Originally, I had a piece of code with an optional parameter to monitorBuffer_ to set/reset different timer durations in one function, but I decided that was "too clever".
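Here is a minimal, self-contained sketch of that scheduling split, for illustration only: the method names follow the PR, but the bodies and the checkBufferTimeout_ field are assumptions, not the actual implementation.

// Illustrative sketch: monitorBuffer_ debounces to a ~1ms timer,
// monitorBufferTick_ is the steady 500ms check loop.
class LoaderSketch {
  constructor() {
    this.state = 'READY';
    this.checkBufferTimeout_ = null;
  }

  // Debounced entry point: collapse any pending check into a single near-immediate timer.
  monitorBuffer_() {
    clearTimeout(this.checkBufferTimeout_);
    this.checkBufferTimeout_ = setTimeout(() => this.monitorBufferTick_(), 1);
  }

  // Steady-state loop: run a buffer check, then schedule the next one 500ms out.
  monitorBufferTick_() {
    if (this.state === 'READY') {
      this.fillBuffer_();
    }
    clearTimeout(this.checkBufferTimeout_);
    this.checkBufferTimeout_ = setTimeout(() => this.monitorBufferTick_(), 500);
  }

  fillBuffer_() {
    console.log('checking whether a new segment request is needed');
  }
}

new LoaderSketch().monitorBuffer_();

Because monitorBuffer_ only moves the pending timer, any number of concurrent callers collapse into one near-immediate check instead of several overlapping ones.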
As a follow-up: I spoke with @imbcmdth, and he pointed out that monitorBuffer_ is contextually async, and is safer as async when there are listeners for events that may fire around the end of a segment's processing (for example, if monitorBuffer_ were synchronous, by the time listeners received the events, a new segment might already have been requested and a lot of state changed).
  // so the loader should try the next segment
  assert.equal(this.requests[0].url, '1.ts', 'moved ahead a segment');

  // verify stats
  assert.equal(loader.mediaBytesTransferred, 20, '20 bytes');
- assert.equal(loader.mediaTransferDuration, 2, '2 ms (clocks above)');
+ assert.equal(loader.mediaTransferDuration, 1, '1 ms (clocks above)');
Why did this decrease?
Prior to this change, the request happened immediately, so two clock ticks elapsed between the request and the response. Now the request happens later, and only one tick elapses (the first tick becomes the tick that triggers the request).
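A self-contained sketch of that timing, using sinon fake timers the way the tests do; the exact tick sizes are assumptions based on the comment above, not the actual test.

const sinon = require('sinon');

const clock = sinon.useFakeTimers();
let requestStart = null;

// New behavior: the request is deferred behind a 1ms timer instead of firing immediately.
setTimeout(() => { requestStart = Date.now(); }, 1);

clock.tick(1);                            // first tick: the deferred request starts
clock.tick(1);                            // second tick: the response arrives
console.log(Date.now() - requestStart);   // 1 -> matches the updated assertion
clock.restore();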
LGTM
LGTM
Description
On a rendition switch, SegmentLoader seems to be able to enter a state where it fetches the same segment twice at the same time, causing bandwidth starvation. The requests then time out and force an emergency rendition switch, which can trigger the same behavior again.
What's gone wrong?
The root problem is that handleUpdateEnd_ can trigger a rendition switch (by emitting the 'progress' event). Eventually, a rendition switch results in load being called on the SegmentLoader, which has the side effect of calling fillBuffer_ and starting a request for a segment. At the same time, the original handleUpdateEnd_ function calls fillBuffer_ itself, triggering a second request.
Why doesn't this happen every time we change renditions?

The issue is "masked" by two different mechanisms, each of which can stop the second request from happening:

1. fillBuffer_ is never called by SegmentLoader#load.
2. fillBuffer_ will, via loadSegment_, result in a SourceUpdater.remove that causes SourceBuffer#updating to be true. When the next fillBuffer_ runs, it will exit early because the SourceBuffer is updating.
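A tiny, self-contained sketch of mechanism 2, purely illustrative; the plain updating flag stands in for the real SourceBuffer#updating state.

const sourceBuffer = { updating: true };  // e.g. right after loadSegment_'s remove()
let requested = 0;

function fillBuffer_() {
  if (sourceBuffer.updating) {
    return;                               // exit early; no second request is made
  }
  requested += 1;
}

fillBuffer_();
console.log(requested); // 0 -> the duplicate fill attempt was swallowed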
Specific Changes proposed

- Convert all calls to fillBuffer_ into calls to monitorBuffer_ (see the sketch after this list)
- Rename monitorBuffer_ to monitorBufferTick_, which becomes the 500ms buffer check timer-loop
- Make monitorBuffer_ schedule an immediate (1 ms) timer for monitorBufferTick_
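A minimal sketch of how the converted call sites collapse concurrent triggers into a single buffer check. This is illustrative only: the real monitorBuffer_ schedules monitorBufferTick_ rather than calling fillBuffer_ directly, and the class below is not the actual SegmentLoader.

class LoaderAfterFix {
  constructor() {
    this.checkBufferTimeout_ = null;
    this.checks = 0;
  }

  load() { this.monitorBuffer_(); }             // was: this.fillBuffer_()
  handleUpdateEnd_() { this.monitorBuffer_(); } // was: this.fillBuffer_()

  monitorBuffer_() {
    // Debounce: any pending check is replaced by a single near-immediate one.
    clearTimeout(this.checkBufferTimeout_);
    this.checkBufferTimeout_ = setTimeout(() => this.fillBuffer_(), 1);
  }

  fillBuffer_() { this.checks += 1; }
}

const fixedLoader = new LoaderAfterFix();
fixedLoader.handleUpdateEnd_();  // e.g. a 'progress' listener also triggers a switch...
fixedLoader.load();              // ...so both paths fire back to back
setTimeout(() => console.log(fixedLoader.checks), 5); // 1 -> only one buffer check runs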
Requirements Checklist