Skip to content

Fix MQTT retained message not being re-dispatched#12004

Merged
balloob merged 21 commits intohome-assistant:devfrom
OttoWinter:fix-mqtt-retained
Feb 11, 2018
Merged

Fix MQTT retained message not being re-dispatched#12004
balloob merged 21 commits intohome-assistant:devfrom
OttoWinter:fix-mqtt-retained

Conversation

@OttoWinter
Copy link
Copy Markdown
Member

@OttoWinter OttoWinter commented Jan 28, 2018

Description:

Fix MQTT retained message not being re-dispatched. See #12003 for a detailed explanation.

Fixes #12003

Unfortunately, I have very limited knowledge of the test framework Home Assistant uses and I haven't found a way to create tests for this PR - if somebody with more knowledge could help out that would be great.

Checklist:

  • The code change is tested and works locally.

If the code communicates with devices, web services, or third-party tools:

  • Local tests with tox run successfully. Your PR cannot be merged unless tests pass.
  • Tests have been added to verify that the new code works.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using wanted_topics here, because it wouldn't make any sense to call the paho-mqtt subscribe method as it will soon be called anyway.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know too much about Home Assistant's style preferences concerning usage of tuples. Creating a namedtuple/struct/... here would probably be unnecessary since it's only really used in a single other place.

@balloob
Copy link
Copy Markdown
Member

balloob commented Jan 29, 2018

Easy to reproduce with a test:

  1. subscribe to a topic with listener A
  2. mimic a retained message being send to that topic
  3. assert listener A received message
  4. subscribe to a topic with listener B
  5. assert listener B received message

Copy link
Copy Markdown
Member

@balloob balloob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You wrote the one and only correct solution to this problem in #12003: don't re-use subscriptions but let paho-mqtt handle this.

If we handle it ourselves instead of paho-mqtt, we need to duplicate ALL retain logic that paho-mqtt already contains. That's not our responsibility.

@OttoWinter
Copy link
Copy Markdown
Member Author

@balloob You're right - at first when I tried the "clean" solution with the paho client, I think I saw a subscription not being re-sent and I didn't look into it any further. Though now I've tested this in more detail and the clean solution seems to work fine.

Copy link
Copy Markdown
Member

@balloob balloob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is unsubscription handled on the server? If we subscribe twice to the same topic, do we need to unsubscribe twice to not receive messages anymore? Or do we still need to handle that inside Home Assistant?

Right now if we subscribe a second time to the same topic, the retained message will also be sent to the existing subscriber. That is not correct.

@OttoWinter
Copy link
Copy Markdown
Member Author

@balloob Good catch! Looking at the code, this probably also means that, in the current Home Assistant, if we make two consecutive subscriptions to a topic and one of them is then unsubscribed, the remaining subscription is not re-established on a reconnect. I will look into this tomorrow.

@balloob
Copy link
Copy Markdown
Member

balloob commented Jan 29, 2018

Correct, that is also a bug. Good thing that we don't unsubscribe a lot yet (will come when we start adding "unload component" functionality.

Can you add some test cases to test these situations?

@OttoWinter
Copy link
Copy Markdown
Member Author

OttoWinter commented Jan 30, 2018

I've tried to characterize paho-mqtt's (un)subscribe behavior and have come up with the following rules. But first, let's set up a notation to more easily describe the behavior:

  • A message looks like this: <topic> → <payload> [qos = 0];
    an ▶ arrow denotes a retained message.
    For example 'test/status' ▶ 'online' is a retained message on the topic 'test/status' with payload 'online' and the default QoS 0.
  • A subscribe call looks like this: subscribe(<topic>, [qos=0])
  • Similarily, an unsubscribe call looks like this: unsubscribe(<topic>)
  • A (forced) reconnect is denoted by reconnect()
  • Publishing a message is done by publish(<message>)
  • The retained message 'test/status' ▶ 'online' always exists.

1. Unsubscribe behavior

unsubscribe(<topic>) deletes all subscriptions with the exact topic string.

For example, if we do subscribe('test/beer'), subscribe('test/beer'), subscribe('test/wine'), unsubscribe('test/beer'), then we will receive 'test/wine' → '🍷' but not 'test/beer' → '🍺'
However, when calling subscribe('test/beer'), subscribe('test/+'), unsubscribe('test/beer'), 'test/beer' → '🍺' will still be received through subscribe('test/+').

2. Message callback behavior

Every new message will always create at most one on_message callback call. An exception to this are re-sent retained messages on a subscribe().

So even though subscribe('test/beer') and subscribe('test/+'), we will only get one on_message for 'test/beer' → '🍺'.

3. Subscription behavior

All subscriptions with the same topic string behave as if they were one after setup.

This is caused by 1&2.

Comments

Right now if we subscribe a second time to the same topic, the retained message will also be sent to the existing subscriber. That is not correct.

Is this really a problem? I mean, currently all retained messages are re-sent to all subscribers on reconnect(). Additionally, this would be very difficult to fix since we don't know whether a on_message call was triggered by a real new message or a subscribe() retained message. (If this is required, a fix might be to handle retained messages ourselves, though this is probably not the way to go)

My idea right now would have been to only allow unsubscribing through the return value of async_subscribe and modify wanted_topics to keep track of a list of active subscribers. Then we would only call _mqttc.unsubscribe when the list for a topic becomes empty.

Test cases: Working on that right now. Unfortunately I know very little of python's unittest test framework so I first need to get a hang of it.

Comment thread tests/components/mqtt/test_init.py Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line too long (80 > 79 characters)

@balloob
Copy link
Copy Markdown
Member

balloob commented Jan 31, 2018

I guess it's not a problem for retained messages to be sent twice. It should not happen too often.

@OttoWinter
Copy link
Copy Markdown
Member Author

That's good. I now also added some test cases that should cover the previous bug(s) - please tell me if I should change something there for example how I used the mocks.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include these "style" fixes in PRs. It makes it difficult to see what this PR actually adds.

If you want to do those, do it in a separate PR. Just know that generally, style fixes best case will make the code more readable, worst case introduce bugs that didn't exist before.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR it's fine, just don't do it in the future please.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'll try to remember this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a guard clause to save on indentation:

if subscriptions:
    return

# Only unsubscribe …

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's messy that now we have 2 places where subscriptions live, that feels weird.

Probably out of scope, but it would almost make more sense to get rid of the dispatcher and move topic matching and calling listeners into our MQTT client.

Something like this:

sub_info = namedtuple('SubscriptionInfo', 'callback,qos,topic,encoding')

for sub in self._subscriptions.items():
    if not _match_topic(sub.topic, topic):
        returnself.hass.async_run_job(sub.callback, topic, payload, qos)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's probably a good time to clean up the subscription model a bit. 🎨

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent failure (not raising when something is wrong) is usually a source of future bugs. The application clearly got into a wrong state or has a bug if we try to unsubscribe a subscription twice.

I know the old code did it too, but I think we should remove it. What do you think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're completely right - this has annoyed me with other components too so I don't know why I repeated this mistake. It's definitely better to fail fast than to have some unexpected behavior around - so I suppose raising an exception would be good to indicate a failed state.

Anyway, this could also be fixed by introducing a Subscription object as you mentioned. Working on that 🛠

@balloob
Copy link
Copy Markdown
Member

balloob commented Feb 7, 2018

Your tests look great 👍

@OttoWinter OttoWinter changed the title Fix MQTT retained message not being re-dispatched [WIP] Fix MQTT retained message not being re-dispatched Feb 7, 2018
@OttoWinter
Copy link
Copy Markdown
Member Author

Thanks for the detailed review!

"Try to avoid brackets and additional quotes around the output to make it easier for users to parse the log."
 - https://home-assistant.io/developers/development_guidelines/
Tests still need to be updated
... And fix issues

Accessing the config manually at runtime isn't ideal
* Updated usage of Mocks
* Moved tests that were testing subscriptions out of the MQTTComponent test, because of how mock.patch was used
* Adjusted the remaining tests for the MQTT clients new behavior - e.g. self.progress was removed
* Updated the async_fire_mqtt_message helper
* Re-introduce the MQTT subscriptions through the dispatcher for tests - quite ugly though...  🚧
* Update fixtures to use our new MQTT mock 🎨
Apparently test_mqtt_json.py and test_mqtt_template.py were written on Windows. In order to not mess up the diff, I'll just redo the carriage return.
What's very interesting is that 3.4 didn't fail on travis...
@OttoWinter
Copy link
Copy Markdown
Member Author

Done 🎉

@balloob balloob merged commit b1c0cab into home-assistant:dev Feb 11, 2018
@balloob balloob mentioned this pull request Feb 22, 2018
@OttoWinter OttoWinter deleted the fix-mqtt-retained branch March 13, 2018 20:19
@home-assistant home-assistant locked and limited conversation to collaborators Jul 26, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants