Skip to content

Add as_mutable for Jinja templating#88626

Closed
depoll wants to merge 10 commits into
home-assistant:devfrom
depoll:jinja-add-mutable
Closed

Add as_mutable for Jinja templating#88626
depoll wants to merge 10 commits into
home-assistant:devfrom
depoll:jinja-add-mutable

Conversation

@depoll
Copy link
Copy Markdown
Contributor

@depoll depoll commented Feb 22, 2023

Breaking change

Proposed change

Breaking out #88575

Here, I've added the ability to create a mutable copy of a list or dictionary in jinja that you can directly modify (rather than rebuilding the list/dictionary element-by-element, which can turn an O(1) operation into an O(n) or O(n^2) operation). This requires two additions:

  • as_mutable filter so that one can efficiently mutate a list or dictionary without rebuilding it from scratch in a loop
  • Enabling the 'Expression Statement' do extension to make it possible to mutate lists/dicts.

All of this enables the following code:

{% set mutableDict = {} | as_mutable %}
{% do mutableDict.update({'foo': 'bar'}) %}
{% set mutableList = [] | as_mutable %}
{% do mutableList.append("baz") %}

{{ mutableDict }}
{{ mutableList }}

To produce the following output:

{'foo': 'bar'}
['baz']

I'm happy to update the documentation as well but wanted to first clear that this would be a welcome change.

Type of change

  • [] Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

@depoll depoll requested a review from a team as a code owner February 22, 2023 18:45
@depoll depoll mentioned this pull request Feb 22, 2023
19 tasks
@MartinHjelmare MartinHjelmare changed the title Adds as_mutable for Jinja templating Add as_mutable for Jinja templating Feb 22, 2023
@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Mar 3, 2023

Added docs.

Copy link
Copy Markdown
Contributor

@emontnemery emontnemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the motivation for the PR is clear, the ergonomics is maybe not so great.

Instead of adding the as_mutable filter, could we explore adding our own Jinja SandboxedEnvironment which would modify a copy of the source data instead of modifying it directly?

Another option could be to let Jinja [] and {} operators create _MutableDict and _MutableList respectively so user defined lists and dicts can be modified?

@frenck frenck marked this pull request as draft March 13, 2023 10:45
@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Mar 13, 2023

Happy to keep iterating. A few thoughts:

  • I considered copy-on-write, but I think it gets really confusing really quickly. Suppose I make a change to a nested dictionary that resulted from some other API -- either the containing dictionary needs to also be copied (even though we're not writing to it directly) or the next time I access that element, the changes will be lost. Then, if we do transparently copy, under what circumstances do those changes get reused. E.g. if I modify states.light.foo.attributes.some_dictionary_attribute, does the change stick if I access it again? What if I use state_attr() to get the same value? And change the result of that? IMO, there is a reasonable case to be made for every combination of answers to those questions, but it makes the APIs very difficult to predict.
  • Unfortunately, without as_mutable, script variables will also have this problem, since you might have a variable that's a dictionary or list, and that won't be mutable by default. If you mutate them in-place (e.g. using copy-on-write), that won't get written back to the script variable, which may confuse folks as well.
  • Totally open to letting the {} and [] literals be automatically mutable if I can figure out how. That at least gets rid of the {} | as_mutable thing. On the other hand, as_mutable is still useful for dealing with dictionaries that come back from other APIs. In principle you could probably do {}.update(immutable_dict), but I'm not sure that's a better experience.

So -- that's how I landed on being explicit about making things mutable -- everywhere else it quickly gets confusing and difficult to predict which things will be automatically mutable and what changes may stick. I do like the idea of making user-defined lists and dicts automatically mutable.

@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Mar 13, 2023

@emontnemery Do you have any idea how to override the {} and [] behaviors in jinja? I'm having trouble finding a good hook, but will keep looking. Since they're not operators AFAICT (they're a literal syntax built into the language), I'm not sure how to adjust it.

@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Mar 13, 2023

After a little digging, I couldn't find any way to adjust the behavior of literal creation (I think it's just compiling down to raw Python literals, which also don't look overridable). Happy to keep going back and forth -- marking as Ready for review so it ends up back on your radar, but if that's the wrong thing to do since I haven't actually changed anything, apologies in advance :)

@depoll depoll marked this pull request as ready for review March 13, 2023 17:57
@depoll depoll requested a review from emontnemery March 14, 2023 04:05
@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Mar 20, 2023

@emontnemery no rush, just thinking it would be nice to get this in with all the other Jinja changes I made in this cycle. Any thoughts on what I mentioned above? Seems easier to follow a world where everything is immutable unless explicitly marked otherwise then one where it's unclear where changes will persist, but definitely interested in any ideas you have.

@emontnemery
Copy link
Copy Markdown
Contributor

emontnemery commented Apr 4, 2023

@depoll thanks for looking into the copy on modify as well as overriding operators 👍

The concern with this PR is still that as_mutable will be difficult to find and understand, so it will end up not being used a lot. Can you give a few realistic examples of use cases where as_mutable is used, as well as the same functionality but without having access to as_mutable to better illustrate how it's helpful?

@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Apr 5, 2023

Here's an example straight out of my setup. I have a template sensor that I use to aggregate BLE info I'm gathering on a bunch of ESPHome devices (the built in presence tools don't work very well for my use cases). It receives events and updates its state:

ble:
  template:
    - trigger:
        - platform: event
          event_type: esphome.beacon_detected
          id: detected
        - platform: time_pattern
          minutes: "*"
          id: cleanup
      sensor:
        name: BLE Beacons
        unique_id: ble_beacons
        state: "{{ this.attributes.beacons | length }}"
        attributes:
          beacons: >-
            {%- macro cleanup(beacons) -%}
              {%- set ns = namespace(beacons=[]) -%}
              {%- for b in beacons -%}
                {%- if b.last_seen | as_datetime >= now() - timedelta(seconds=10) -%}
                  {%- set ns.beacons = ns.beacons + [b] -%}
                {%- endif -%}
              {%- endfor -%}
              {{ ns.beacons }}
            {%- endmacro -%}
            {%- set old_beacons = this.attributes.beacons or [] -%}
            {%- set old_beacons = old_beacons | from_json if old_beacons is string else old_beacons -%}
            {%- if trigger.id == "detected" -%}
              {%- set event_data_id = [trigger.event.data.uuid | lower, trigger.event.data.major | int, trigger.event.data.minor | int] | join('-') -%}
              {%- set ns = namespace(sensor=None, canonical_id=event_data_id, sensors=[]) -%}
              {%- set ns.sensors = states.sensor | selectattr('state', 'in', ['Transmitting', 'Stopped']) | list -%}
              {%- for it in states.input_text | selectattr('state', 'match', 'ibeacon:') -%}
                {%- set ns.sensors = ns.sensors + [{
                  "entity_id": it.entity_id,
                  "attributes": {
                    "id": it.state['ibeacon:' | length:]
                  }
                }] -%}
              {%- endfor -%}
              {%- for s in ns.sensors -%}
                {%- set id_parts = (s.attributes.id | lower).split('_') -%}
                {%- set id = [id_parts[0], id_parts[1] | int, id_parts[2] | int] | join('-') -%}
                {%- if event_data_id == id -%}
                  {%- set ns.sensor = s -%}
                  {%- set ns.canonical_id = s.attributes.id -%}
                {%- endif -%}
              {%- endfor -%}
              
              {%- if ns.sensor is not none -%}
                {%- set cur_beacon = {
                  "id": ns.canonical_id,
                  "rssi": trigger.event.data.rssi,
                  "distance": trigger.event.data.distance,
                  "source": trigger.event.data.source,
                  "source_id": trigger.event.data.device_id,
                  "area": area_id(trigger.event.data.device_id),
                  "sensor": ns.sensor.entity_id,
                  "last_seen": now() | string
                } -%}
                {%- set ns = namespace(beacons=[], found=false) -%}
                {%- for b in old_beacons -%}
                  {%- if b.source == cur_beacon.source and b.id == cur_beacon.id -%}
                    {%- set ns.beacons = ns.beacons + [cur_beacon] -%}
                    {%- set ns.found = true -%}
                  {%- else -%}
                    {%- set ns.beacons = ns.beacons + [b] -%}
                  {%- endif -%}
                {%- endfor -%}
                {%- if not ns.found -%}
                  {%- set ns.beacons = ns.beacons + [cur_beacon] -%}
                {%- endif -%}
                {{ cleanup(ns.beacons) }}
              {%- else -%}
                {{ cleanup(old_beacons) }}
              {%- endif -%}
            {%- else -%}
              {{ cleanup(old_beacons) }}
            {%- endif -%}

as_mutable produces lots of opportunities for optimization here as on every event (which happen a lot and make up like 50% of all of my instance's CPU usage per the profiler) I have to rebuild lists.

But let's just take the last for loop as an example. Here, I'm doing crazy amounts of copying immutable lists to rebuild them item-by-item to potentially replace a single entry. This could instead be rewritten:

{%- set ns = namespace(beacons=old_beacons | as_mutable, found=false) -%}
{%- for b in old_beacons -%}
  {%- if b.source == cur_beacon.source and b.id == cur_beacon.id -%}
    {%- do ns.beacons.__setitem__(loop.index0, cur_beacon) -%}
    {%- set ns.found = true -%}
    {%- break -%}
  {%- endif -%}
{%- endfor -%}
{%- if not ns.found -%}
  {%- do ns.beacons.append(cur_beacon) -%}
{%- endif -%}

This turns an O(n^2) operation (because of repeated copying of lists) into an O(n) operation (because there is a single copy which is then modified in place).

I have a number of similar examples with dicts where e.g. removing an item requires rebuilding the dict item-by-item, making copies at every iteration, when a single pop() call would do it in O(1). An example includes keeping track of things coming and going (e.g. if I have enter/exit events for devices in a room and want to keep track of some data about who is there). as_mutable allows this:

{% set ns = namespace(new_dict={}) %}
{% for k, v in dict.items() %}
  {% if k != to_delete %}
    {% set ns.new_dict = dict(ns.new_dict, **{k: v}) %}
  {% endif %}
{% endfor %}

To become this:

{% set new_dict = dict | as_mutable %}
{% do new_dict.pop(to_delete) %}

@balloob
Copy link
Copy Markdown
Member

balloob commented Apr 13, 2023

You should really write your beacon template as a custom integration in Python and don't use jinja2 for this.

@depoll
Copy link
Copy Markdown
Contributor Author

depoll commented Apr 13, 2023

@balloob fair enough, though (at least for me) a custom integration is fairly heavyweight to get started with by comparison to a template sensor and vastly more painful to debug with live data (because every change requires restarting HA instead of just reloading templates).

Anyway, the use case exists in far smaller settings -- I have a number of bookkeeping cases where I'm getting an event and doing simple accumulation in a dict based on those events. Both adding and removing items are more painful and resource-intensive than they need to be.

@frenck
Copy link
Copy Markdown
Member

frenck commented May 22, 2023

We've discussed this functionality extensively a couple of times with a couple of core developers, and we have decided not to accept this change/feature.

It is too confusing to use and explain, and the given use case is very specific (and shouldn't really be a template).

Nevertheless, thanks for being willing to contribute @depoll 👍

../Frenck

@frenck frenck closed this May 22, 2023
@github-actions github-actions Bot locked and limited conversation to collaborators May 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants