Broadcast Reconnect #1031

Merged
rryan merged 38 commits into mixxxdj:master from daschuer:bc_reconnect
Jan 28, 2017

@daschuer
Member

This PR adds some preference options to control the reconnect behavior in case of broadcast issues.
https://bugs.launchpad.net/mixxx/+bug/1080981
You can now set a retry counter and define a delay between the connection retries.
[Screenshot: bcprefs]

This PR also fixes a deadlock issue tracked here:
https://bugs.launchpad.net/mixxx/+bug/1532461

@Be-ing
Contributor

Be-ing commented Oct 26, 2016

I'm confused about what the "Use maximum retries" option does. I'd assume that Mixxx would always try to reconnect up to the maximum number of times defined above that option. What does it do when that is not checked?

@daschuer
Member Author

If the checkbox is not set, Mixxx retries indefinitely.
Do you have a good idea for better strings?

@Be-ing
Contributor

Be-ing commented Oct 27, 2016

I think it would be clearer to flip it around. Label the checkbox "Limit number of reconnection attempts". Gray out the maximum connections field if that box isn't checked.
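
The semantics discussed in this exchange can be sketched in a few lines of C++. This is only an illustration under assumptions, not the code from this PR: `ReconnectPolicy`, `limitAttempts`, `maximumRetries` and `shouldRetry` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical sketch, not Mixxx code: decides whether another
// reconnection attempt is allowed.
struct ReconnectPolicy {
    bool limitAttempts;  // state of the "Limit number of reconnection attempts" checkbox
    int maximumRetries;  // only consulted when limitAttempts is true
};

// attemptsSoFar counts the reconnection attempts already made.
bool shouldRetry(const ReconnectPolicy& policy, int attemptsSoFar) {
    assert(attemptsSoFar >= 0);
    if (!policy.limitAttempts) {
        return true;  // box unchecked: keep retrying indefinitely
    }
    return attemptsSoFar < policy.maximumRetries;
}
```

With this shape, the maximum field has no effect while the box is unchecked, which matches the idea of graying it out in the dialog.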

ConfigKey(BROADCAST_PREF_KEY, "reconnect_delay")).toInt());

// Maximum Retries
spinBoxMaximumReties->setValue(m_pConfig->getValueString(
Contributor

Typos: spinBoxMaximumReties-> spinBoxMaximumRetries

<item row="2" column="0">
<widget class="QLabel" name="label_17">
<property name="text">
<string>Maximum reties</string>
Contributor

Typo: reties -> retries

<item row="3" column="0" colspan="2">
<widget class="QCheckBox" name="checkBoxUseMaximumRetries">
<property name="text">
<string>Use maximum retires</string>
Contributor

Typo: retires -> retries

<widget class="QLabel" name="label_16">
<property name="text">
<string>Host</string>
<string>Format </string>
Contributor

blanks

Contributor

The new retries block should align with the existing coding block in the grid layout. Currently the coding block is misaligned.

@esbrandt
Contributor

If you hammer the streaming server with login attempts, you risk being rate limited or banned. Increasing the delay for each reconnection attempt sounds right, but imo there should be a timeout, since we had an established connection before, and temporary network errors clear quickly.

@daschuer
Member Author

Now it looks like this:
[Screenshot: bcprefs2]

@daschuer
Member Author

@esbrandt:

If you hammer the streaming server with login attempts, you risk being rate limited or banned. Increasing the delay for each reconnection attempt sounds right, but imo there should be a timeout, since we had an established connection before, and temporary network errors clear quickly.

I have not fully understood your comment. Do you propose a change?

Let's have a look at the use cases:

  1. Unattended broadcasting: Mixxx should do everything a user would do to bring the stream back online without a long interruption.
  2. Mixxx should not try to reconnect if the stream was intentionally closed by the server.

The current implementation was a result of a discussion: http://www.mixxx.org/forums/viewtopic.php?f=1&t=8613
IMHO the additional settings in this PR somewhat expose internals to the user, but they were requested like that and I am not able to pick good fixed defaults.

Do you have an idea to improve it?

I will try to get this verified by streaming services recommending Mixxx.

@Be-ing
Contributor

Be-ing commented Oct 27, 2016

Okay, that makes it clearer what the new options do :)

@sblaisot
Member

If I understand correctly, "Maximum retries" is only used when you check the box "limit number of attempts". So why not just remove the checkbox and use "0" in maximum retries to say "unlimited", adding a tooltip or some hint on that as a label in the interface?

Also, at first, it is not clear whether "reconnect delay" is only an int or can be a float. Can we use 0.1 as the default to show that it can be a float (assuming it can be; a 1 s step seems too large)?

@Be-ing
Contributor

Be-ing commented Oct 28, 2016

If I understand correctly, "Maximum retries" is only used when you check the box "limit number of attempts". So why not just remove the checkbox and use "0" in maximum retries to say "unlimited", adding a tooltip or some hint on that as a label in the interface?

I thought about this, but it would be confusing. 0 maximum retries would imply it does not try to reconnect, rather than it continuously tries to reconnect.

@daschuer
Member Author

daschuer commented Oct 28, 2016 via email

@daschuer
Member Author

daschuer commented Oct 29, 2016 via email

@daschuer
Member Author

daschuer commented Nov 2, 2016

I have evaluated other software and it seems that the current implementation is fairly common.
The recommended reconnection delay is 5 to 30 s.
However, there is a demand for an immediate reconnect.

How about adding an additional checkbox:
[x] No delay for the first reconnect attempt

and pick a default reconnection delay of 15 s

@esbrandt: Does this fit your needs?
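
The proposal above (an optional immediate first attempt, then a fixed delay with a 15 s default) could look roughly like the following. This is a sketch under assumptions rather than the PR's implementation: `delayBeforeAttemptSeconds` and its parameter names are hypothetical.

```cpp
#include <cassert>

// Hypothetical sketch, not Mixxx code. attempt is 1 for the first
// reconnection attempt after the connection dropped.
int delayBeforeAttemptSeconds(int attempt, bool noDelayForFirstAttempt,
        int reconnectDelaySeconds = 15) {
    assert(attempt >= 1);
    if (attempt == 1 && noDelayForFirstAttempt) {
        return 0;  // reconnect immediately, but only once
    }
    return reconnectDelaySeconds;  // fixed delay for every other attempt
}
```

Only the first attempt skips the wait, so a server that stays down still sees at most one login per delay period afterwards.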

@rryan
Member

rryan commented Jan 2, 2017

Hm, what was wrong with the mac build that removing getValueString fixes?

@daschuer
Member Author

daschuer commented Jan 2, 2017

This is the original clang error message I am trying to fix:

[CXX] src/preferences/configobject.cpp
src/preferences/configobject.cpp:342:36: error: explicit specialization of 'getValue' after instantiation
QString ConfigObject<ConfigValue>::getValue(
                                   ^
src/preferences/configobject.h:132:16: note: implicit instantiation first required here
        return getValue(key, default_value);

gcc compiles that without complaining. IMHO gcc is right, but since getValueString with a type conversion on the user's side looks ugly anyway, I have decided to remove it in favor of the type-aware getValue<>() calls.

The pending issue now is that I have not caught all occurrences.
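
For readers unfamiliar with this clang diagnostic: C++ requires an explicit specialization to be declared before the first use that would otherwise trigger an implicit instantiation, and compilers are not required to diagnose a violation, which is consistent with one compiler rejecting code that another accepts. The sketch below shows the general pattern with a free function template. It is not the ConfigObject code, and `parseValue` and `parsePort` are hypothetical names.

```cpp
#include <stdexcept>
#include <string>

// Primary template: falls back to the default for unknown types.
template <typename T>
T parseValue(const std::string& /*text*/, const T& defaultValue) {
    return defaultValue;
}

// Declare the explicit specialization BEFORE anything uses it ...
template <>
int parseValue<int>(const std::string& text, const int& defaultValue);

// ... so this caller binds to the specialization instead of implicitly
// instantiating the primary template for int.
inline int parsePort(const std::string& text) {
    return parseValue<int>(text, 0);
}

// The definition of the specialization may now follow its first use.
template <>
int parseValue<int>(const std::string& text, const int& defaultValue) {
    try {
        return std::stoi(text);
    } catch (const std::exception&) {
        return defaultValue;  // not a number, or out of range
    }
}
```

Moving the specialization's declaration below `parsePort` reproduces the "explicit specialization after instantiation" class of error with clang.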

@daschuer
Member Author

daschuer commented Jan 2, 2017

I have also spent some time investigating the crasher.

Now we have some safety checks around the possible recursion with flush().

It turns out that your crasher and the provided logs cannot be caused by a flush() recursion, since it crashes in a straightforward EngineBroadcast::process() call:

Thread 30 Crashed:: EngineBroadcast
0   libmp3lame.dylib              	0x000000011aabc090 quantize_lines_xrpow + 383
1   libmp3lame.dylib              	0x000000011aabb2c4 count_bits + 765
2   libmp3lame.dylib              	0x000000011aab59db outer_loop + 127
3   libmp3lame.dylib              	0x000000011aab6c39 CBR_iteration_loop + 493
4   libmp3lame.dylib              	0x000000011aaa6635 lame_encode_mp3_frame + 2096
5   libmp3lame.dylib              	0x000000011aaac7dd lame_encode_buffer_template + 859
6   libmp3lame.dylib              	0x000000011aaac8a4 lame_encode_buffer_float + 36
7   mixxx                         	0x000000010ee9cb84 EncoderMp3::encodeBuffer(float const*, int) + 308 (encodermp3.cpp:306)
8   mixxx                         	0x000000010efb3d02 EngineBroadcast::process(float const*, int) + 98 (enginebroadcast.cpp:665)
9   mixxx                         	0x000000010efb5081 EngineBroadcast::run() + 705 (enginebroadcast.cpp:897)
10  QtCore                        	0x000000010fcb6a62 QThreadPrivate::start(void*) + 386
11  libsystem_pthread.dylib       	0x00007fff8633e99d _pthread_body + 131
12  libsystem_pthread.dylib       	0x00007fff8633e91a _pthread_start + 168
13  libsystem_pthread.dylib       	0x00007fff8633c351 thread_start + 13

Possible crash causes on the Mixxx side are an invalid encoder pointer or an invalid buffer being provided.
I have not found a reason for an invalid encoder. If the provided buffer were invalid, it would crash earlier.

What else might have happened during the crash?

Member

rryan left a comment

Looking great! Thanks for the changes. I'll give it another test.

Comment thread src/preferences/broadcastsettings.cpp Outdated
QString BroadcastSettings::getMetadataFormat() const {
return m_pConfig->getValue(
ConfigKey(kConfigKey, kMetadataFormat),
// No tr() here, see https://bugs.launchpad.net/mixxx/+bug/1419500
Member

Could probably remove the comment since it's also up at kDefaultMetadataFormat.

Comment thread src/preferences/broadcastsettings.cpp Outdated

bool BroadcastSettings::getOggDynamicUpdate() const {
return m_pConfig->getValue(
ConfigKey(kConfigKey, kOggDynamicUpdate), false);
Member

replace false with getDefaultOggDynamicUpdate()

Comment thread src/preferences/broadcastsettings.cpp Outdated
}

int BroadcastSettings::getDefaultPort() const {
return -1;
Member

Why not BROADCAST_DEFAULT_PORT?

Comment thread src/preferences/broadcastsettings.cpp Outdated

bool BroadcastSettings::getStreamPublic() const {
return m_pConfig->getValue(
ConfigKey(kConfigKey, kStreamPublic), false);
Member

false -> getDefaultStreamPublic()

spinBoxMaximumRetries->setValue(m_settings.getDefaultMaximumRetries());
spinBoxMaximumRetries->setEnabled(true);
stream_name->setText(m_settings.getDefaultStreamName());
stream_website->setText(m_settings.getDefaultStreamName());
Member

getDefaultStreamWebsite()

Member Author

Oops ...

mountpoint->setText(m_settings.getDefaultMountpoint());
host->setText(m_settings.getDefaultHost());
int iPort = m_settings.getDefaultPort();
if (iPort != 0 && iPort <= 0xffff) {
Member

Should this be a DEBUG_ASSERT instead? Seems like this would only happen when a programmer makes an error changing the default.

Member Author

Yes, now we have a valid port instead of -1.
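
As a side note on the condition quoted in this thread, `iPort != 0 && iPort <= 0xffff` also accepts negative values such as the earlier default of -1. A stricter range check might look like this; `isValidPort` is a hypothetical helper for illustration, not a function from the PR.

```cpp
// Hypothetical helper, not Mixxx code: a usable TCP port lies in 1..65535.
bool isValidPort(int port) {
    return port > 0 && port <= 0xffff;
}
```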

int ConfigObject<ConfigValue>::getValue(const ConfigKey& key,
const int& default_value) const {
int ConfigObject<ConfigValue>::getValue(
const ConfigKey& key, const int& default_value) const {
Member

I'm guessing this was some auto-reformat of the file?

Hanging indent of function arguments is how our style guide specifies:
https://google.github.io/styleguide/cppguide.html#Function_Calls

Personally, I find it much more readable since my eyes don't have to leave the right-side of the screen.

Member Author

The QString version requires the second format option to not exceed column 80.
I have changed the others as well because I want all getValue functions to share the same line breaks.

@daschuer
Member Author

daschuer commented Jan 6, 2017

Notes Addressed.

@rryan
Member

rryan commented Jan 20, 2017

Sorry for the conflicts -- could you please resolve them?

@Astrochicken1

I would also need this feature to start using Mixxx for broadcasting with Shoutcast. Otherwise it's impossible to switch between DJ sets. This is indispensable.

@rryan
Member

rryan commented Jan 22, 2017

I merged this with master locally to test it out. I killed my SC2 server and Mixxx segfaulted. The SC2 log showed Mixxx had reconnected right before it crashed.
https://paste.debian.net/909915/
OSX 10.12.2 connected to SHOUTcast DNAS/posix(linux x64) v2.5.1.723 (Sep 30 2016)

@rryan
Member

rryan commented Jan 22, 2017

Another, this time with LLDB backtraces and log:
https://paste.debian.net/909917/

* thread #45: tid = 0x14c986, 0x0000000156916fe5 libmp3lame.dylib`quantize_lines_xrpow + 212, name = 'EngineBroadcast', stop reason = EXC_BAD_ACCESS (code=1, address=0x22a937d8c)
  * frame #0: 0x0000000156916fe5 libmp3lame.dylib`quantize_lines_xrpow + 212
    frame #1: 0x00000001569162c4 libmp3lame.dylib`count_bits + 765
    frame #2: 0x00000001569109db libmp3lame.dylib`outer_loop + 127
    frame #3: 0x0000000156911c39 libmp3lame.dylib`CBR_iteration_loop + 493
    frame #4: 0x0000000156901635 libmp3lame.dylib`lame_encode_mp3_frame + 2096
    frame #5: 0x00000001569077dd libmp3lame.dylib`lame_encode_buffer_template + 859
    frame #6: 0x00000001569078a4 libmp3lame.dylib`lame_encode_buffer_float + 36
    frame #7: 0x0000000100166f2b mixxx`EncoderMp3::encodeBuffer(this=<unavailable>, samples=<unavailable>, size=<unavailable>) + 315 at encodermp3.cpp:304 [opt]
    frame #8: 0x0000000100287cd2 mixxx`EngineBroadcast::process(this=0x0000000112060b90, pBuffer=0x0000000117b9d000, iBufferSize=94048) + 98 at enginebroadcast.cpp:621 [opt]
    frame #9: 0x0000000100289229 mixxx`EngineBroadcast::run(this=<unavailable>) + 889 at enginebroadcast.cpp:856 [opt]
    frame #10: 0x000000010107438a QtCore`QThreadPrivate::start(arg=0x0000000112060b90) + 362 at qthread_unix.cpp:341 [opt]
    frame #11: 0x00007fffc07d9aab libsystem_pthread.dylib`_pthread_body + 180
    frame #12: 0x00007fffc07d99f7 libsystem_pthread.dylib`_pthread_start + 286
    frame #13: 0x00007fffc07d91fd libsystem_pthread.dylib`thread_start + 13

@rryan
Member

rryan commented Jan 22, 2017

I'm pretty confused at where this is coming from. Building with asan to see if I can pick up any memory safety issues.

@rryan
Member

rryan commented Jan 22, 2017

On broadcast connect (don't have to kill my server) -- asan notices a heap buffer overflow in libshout. Looks like it could be shoutcast-specific (shout_parse_xaudiocast_response).

=================================================================
==38754==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000deda02 at pc 0x0001046e3de7 bp 0x70000e2d53a0 sp 0x70000e2d4b60
READ of size 23 at 0x603000deda02 thread T46
    #0 0x1046e3de6 in StrstrCheck(void*, char*, char const*, char const*) (libclang_rt.asan_osx_dynamic.dylib+0xfde6)
    #1 0x1046e3a6c in wrap_strstr (libclang_rt.asan_osx_dynamic.dylib+0xfa6c)
    #2 0x10468bac7 in shout_parse_xaudiocast_response (libshout.3.dylib+0x6ac7)
    #3 0x1046892df in parse_response (libshout.3.dylib+0x42df)
    #4 0x104686719 in try_connect (libshout.3.dylib+0x1719)
    #5 0x104687fb9 in shout_get_connected (libshout.3.dylib+0x2fb9)
Debug [Main]: keyboard press:  "D"
    #6 0x1006b78f5 in EngineBroadcast::processConnect() enginebroadcast.cpp:475
    #7 0x1006bf8dd in EngineBroadcast::run() enginebroadcast.cpp:818
    #8 0x1023e7389 in QThreadPrivate::start(void*) qthread_unix.cpp:341
    #9 0x7fffc07d9aaa in _pthread_body (libsystem_pthread.dylib+0x3aaa)
    #10 0x7fffc07d99f6 in _pthread_start (libsystem_pthread.dylib+0x39f6)
    #11 0x7fffc07d91fc in thread_start (libsystem_pthread.dylib+0x31fc)

Process 38754 stopped
* thread #46: tid = 0x15700b, 0x0000000104724985 libclang_rt.asan_osx_dynamic.dylib`__asan::DescribeAddress(unsigned long, unsigned long, char const*) + 453, name = 'EngineBroadcast', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000104724985 libclang_rt.asan_osx_dynamic.dylib`__asan::DescribeAddress(unsigned long, unsigned long, char const*) + 453
libclang_rt.asan_osx_dynamic.dylib`__asan::DescribeAddress:
->  0x104724985 <+453>: movzbl (%rbx), %eax
    0x104724988 <+456>: cmpl   $0x5f, %eax
    0x10472498b <+459>: jne    0x1047249a9               ; <+489>
    0x10472498d <+461>: movzbl 0x1(%rbx), %eax

@daschuer
Member Author

https://github.com/xiph/Icecast-libshout/blob/2dd6cfb7190bdfd4cb5a1fb663f00e630462e9e1/src/proto_xaudiocast.c#L88

The heap overflow happens here when response is not null terminated.
But since it is only a read access, this should not crash the encoder.
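
To illustrate the class of problem: `strstr` keeps reading until it finds a NUL byte, so it can run past the end of a buffer that is not NUL-terminated. When the buffer length is known, a bounded search avoids that. The following is a generic sketch, not a patch for libshout; `containsToken` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of libshout: reports whether the first
// `length` bytes of `buffer` contain `token`, without reading beyond
// buffer + length even if the buffer is not NUL-terminated.
bool containsToken(const char* buffer, std::size_t length, const char* token) {
    const char* bufferEnd = buffer + length;
    // The token itself is assumed to be a NUL-terminated string.
    const char* tokenEnd = token + std::strlen(token);
    return std::search(buffer, bufferEnd, token, tokenEnd) != bufferEnd;
}
```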

And this happens when self->rqueue.head is invalid:
https://github.com/xiph/Icecast-libshout/blob/2dd6cfb7190bdfd4cb5a1fb663f00e630462e9e1/src/queue.c#L129

Conflicts:
	src/preferences/configobject.cpp
	src/preferences/dialog/dlgprefwaveform.cpp
	src/skin/legacyskinparser.cpp
@rryan
Member

rryan commented Jan 28, 2017

I finally reproduced the same crash on reconnect in master.

Debug [EngineBroadcast 1]: NetworkStreamWorker state: 2 7 37
Debug [EngineBroadcast 1]: Connection pending. Waiting...
Debug [EngineBroadcast 1]: NetworkStreamWorker state: 2 7 37
Debug [EngineBroadcast 1]: Connection pending. Waiting...
Debug [EngineBroadcast 1]: NetworkStreamWorker state: 2 7 37
Debug [EngineBroadcast 1]: Connection pending. Waiting...
Debug [EngineBroadcast 1]: NetworkStreamWorker state: 2 7 37
Debug [EngineBroadcast 1]: ***********Connected to streaming server...
Debug [EngineBroadcast 1]: NetworkStreamWorker state: 4 7 37
Debug [EngineBroadcast 1]: EngineBroadcast::processConnect() returning true
Debug [Engine]: EngineNetworkStream::write() buffer full, loosing samples
Debug [Engine]: NetworkStreamWorker state: 4 1 38
Debug [EngineBroadcast 1]: EncoderMp3::encodeBuffer 32768 48161
Debug [EngineBroadcast 1]: EncoderMp3::encodeBuffer 81920 109601
zsh: segmentation fault  ./mixxx --debugLevel 4

https://paste.debian.net/911047/

Thread 37 Crashed:: EngineBroadcast
0   libmp3lame.dylib              	0x0000000155937fe5 quantize_lines_xrpow + 212
1   libmp3lame.dylib              	0x00000001559372c4 count_bits + 765
2   libmp3lame.dylib              	0x00000001559319db outer_loop + 127
3   libmp3lame.dylib              	0x0000000155932c39 CBR_iteration_loop + 493
4   libmp3lame.dylib              	0x0000000155922635 lame_encode_mp3_frame + 2096
5   libmp3lame.dylib              	0x00000001559287dd lame_encode_buffer_template + 859
6   libmp3lame.dylib              	0x00000001559288a4 lame_encode_buffer_float + 36
7   mixxx                         	0x0000000107ecd8db EncoderMp3::encodeBuffer(float const*, int) + 491 (encodermp3.cpp:308)
8   mixxx                         	0x0000000107ff1212 EngineBroadcast::process(float const*, int) + 98 (enginebroadcast.cpp:638)
9   mixxx                         	0x0000000107ff2399 EngineBroadcast::run() + 713 (enginebroadcast.cpp:864)
10  org.qt-project.QtCore         	0x0000000108d549ba 0x108d2a000 + 174522
11  libsystem_pthread.dylib       	0x00007fffa74b6aab _pthread_body + 180
12  libsystem_pthread.dylib       	0x00007fffa74b69f7 _pthread_start + 286
13  libsystem_pthread.dylib       	0x00007fffa74b61fd thread_start + 13

@rryan
Member

rryan commented Jan 28, 2017

rryan merged commit 2055c37 into mixxxdj:master on Jan 28, 2017
@rryan
Member

rryan commented Jan 28, 2017

Thank you!
