Skip to content

Improve MQTT topic validation#14099

Merged
pvizeli merged 3 commits into
home-assistant:devfrom
OttoWinter:mqtt-topic-validation
Apr 27, 2018
Merged

Improve MQTT topic validation#14099
pvizeli merged 3 commits into
home-assistant:devfrom
OttoWinter:mqtt-topic-validation

Conversation

@OttoWinter
Copy link
Copy Markdown
Member

@OttoWinter OttoWinter commented Apr 26, 2018

Description:

In order to learn more about the MQTT protocol, I recently created my own MQTT server and client in Rust. This PR takes some of the lessons learned from doing that into the topic name and topic filter validation in Home Assistant.

Specifically (see MQTT 3.1.1 spec):

  • The topic must a valid UTF-8 encodable string. ("The character data in a UTF-8 encoded string MUST be well-formed UTF-8 [...] In particular this data MUST NOT include encodings of code points between U+D800 and U+DFFF.")
  • The topic must not be longer than 65535 encoded bytes (not characters, "they MUST NOT encode to more than 65535 bytes")
  • Topic filters: Multi-level wildcard # must be the last character and must be followed by a topic level separator if it isn't by its own. ("The multi-level wildcard character MUST be specified either on its own or following a topic level separator.")
  • Topic filters: Single-level wildcard + must be encapsulated by topic level separators. ("Where it is used it MUST occupy an entire level of the filter")

These are all only the RFC2119 "MUST NOT" parts. There are some SHOULD NOT specifications which I left out as they are not mandatory

  • $SYS as publish topic ("The Server SHOULD prevent Clients from using such Topic Names to exchange messages with other Clients.")
  • special unicode characters ("The data SHOULD NOT include encodings of the Unicode [Unicode] code points listed below.")

Checklist:

  • The code change is tested and works locally.
  • Local tests pass with tox. Your PR cannot be merged unless tests pass

If the code does not interact with devices:

  • Tests have been added to verify that the new code works.

Copy link
Copy Markdown
Member

@pvizeli pvizeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pvizeli
Copy link
Copy Markdown
Member

pvizeli commented Apr 26, 2018

We support also the old 3.1 / if that also okay for this?

raw_value = value.encode('utf-8')
except UnicodeError:
raise vol.Invalid("MQTT topic name/filter must be valid UTF-8 string.")
vol.Length(min=1, max=65535)(raw_value)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels weird. Rather have us just check the length and raise if it's outside of our boundaries.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Fixed it now :)

@OttoWinter
Copy link
Copy Markdown
Member Author

@pvizeli Theoretically, MQTT 3.1 doesn't specify that topic names must be UTF-8 encoded unicode codepoints. However, there's no mention what other encoding should be used and paho-mqtt encodes all topic names/filters with utf-8 anyway:

https://github.com/eclipse/paho.mqtt.python/blob/e9914a759f9f5b8081d59fd65edfd18d229a399e/src/paho/mqtt/client.py#L1062

@pvizeli pvizeli merged commit 9d1f9fe into home-assistant:dev Apr 27, 2018
@balloob balloob mentioned this pull request May 11, 2018
@home-assistant home-assistant locked and limited conversation to collaborators Jul 26, 2018
@OttoWinter OttoWinter deleted the mqtt-topic-validation branch September 29, 2018 10:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants