Improve performance of fetching and storing history and events with the database #84870

Merged: bdraco merged 52 commits into home-assistant:dev from bdraco:time_as_float on Jan 2, 2023

@bdraco
Member

@bdraco bdraco commented Dec 30, 2022

Breaking change

This breaking change only affects custom integrations that do not use the recorder history API and access the database directly.

The underlying database schema has changed. The documentation for the events and states tables has been updated and is available at the Data Science Portal.
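For a custom integration that reads the database directly, a time-range query against float timestamps might look like the sketch below. It runs against an in-memory SQLite database. The `demo_states` table and its `last_updated_ts` column are stand-ins invented for this example, not the recorder schema, so check the updated schema documentation for the real table and column names.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def query_range(conn, start, end):
    """Return rows whose float timestamp falls in [start, end)."""
    rows = conn.execute(
        "SELECT entity_id, state, last_updated_ts FROM demo_states "
        "WHERE last_updated_ts >= ? AND last_updated_ts < ? "
        "ORDER BY last_updated_ts",
        (start.timestamp(), end.timestamp()),
    ).fetchall()
    # Convert the stored floats back to timezone-aware datetimes for callers.
    return [
        (entity_id, state, datetime.fromtimestamp(ts, tz=timezone.utc))
        for entity_id, state, ts in rows
    ]


# Hypothetical demo table: one REAL column holds seconds since the Unix epoch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_states (entity_id TEXT, state TEXT, last_updated_ts REAL)"
)
conn.execute(
    "CREATE INDEX ix_demo_states_last_updated_ts ON demo_states (last_updated_ts)"
)

base = datetime(2023, 1, 2, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO demo_states VALUES (?, ?, ?)",
    [
        ("sensor.demo", str(hour), (base + timedelta(hours=hour)).timestamp())
        for hour in range(24)
    ],
)

# Fetch the rows recorded between 06:00 (inclusive) and 09:00 (exclusive) UTC.
result = query_range(conn, base + timedelta(hours=6), base + timedelta(hours=9))
```

The range bounds are plain numeric comparisons, so no string parsing is needed on either side of the query.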

Proposed change

Note: The bulk of the lines here are copies of the old tests, kept to make sure that history queries still work during the live migration from schema 30->32 (which is short unless the database is very large). We still support migrating from db schema 0 (circa 2016) and are already maintaining a few tests for that.

Currently we store all timestamps as strings in UTC isotime. This makes searches slow, since finding a time range means comparing strings, and it makes the indices much larger (which is also why InnoDB is currently required when using MySQL: the index keys are very large). Every single row had to store at least one such string, always in UTC. This change stores timestamps as floats instead.
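As a rough illustration of the difference, the sketch below encodes the same instant as a UTC ISO string and as a float. The helper names are made up for this example, and the ISO layout produced by `datetime.isoformat()` is only an approximation of what the database actually stored, which depends on the backend.

```python
import struct
from datetime import datetime, timezone


def as_iso_string(dt):
    """Encode a datetime as a UTC ISO 8601 string (illustrative format)."""
    return dt.astimezone(timezone.utc).isoformat()


def as_float(dt):
    """Encode a datetime as seconds since the Unix epoch."""
    return dt.timestamp()


moment = datetime(2023, 1, 2, 20, 24, 0, 123456, tzinfo=timezone.utc)

iso_value = as_iso_string(moment)
float_value = as_float(moment)

# Size of each representation: the string is 32 bytes for this value,
# while a double-precision float is always 8 bytes.
iso_size = len(iso_value.encode("utf-8"))
float_size = len(struct.pack("<d", float_value))

# Range checks on floats are plain numeric comparisons.
range_start = as_float(datetime(2023, 1, 1, tzinfo=timezone.utc))
in_range = range_start <= float_value

# Converting back is a single call, with no string parsing involved.
restored = datetime.fromtimestamp(float_value, tz=timezone.utc)
```

A smaller fixed-width key is what lets the indices shrink, and numeric comparison is what speeds up range searches.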

This change also makes returning data via the newer APIs much faster, since those APIs mostly return timestamps as floats and many conversions are avoided along the way.

While not the primary goal, this also reduces the database size by ~10-30% (toward the higher end if you have lots of energy or frequently updating sensors, and toward the lower end if there are a lot of statistics), as well as the size of backups. The size reduction will become apparent after the next second Sunday of the month, as we only repack the database monthly. For a real-world estimate, my primary production database went from 1155682304 bytes to 836296704 bytes (about 27.6% smaller).
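To check the numbers above and to see when a repack would next be due, here is a small sketch. The function names are invented for this example, and it only computes the calendar date of the second Sunday; the time of day at which the repack runs is not covered here.

```python
from datetime import date, timedelta


def second_sunday(year, month):
    """Return the date of the second Sunday of the given month."""
    first = date(year, month, 1)
    # date.weekday() is Monday=0 .. Sunday=6.
    days_to_first_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_sunday + 7)


def percent_reduction(before_bytes, after_bytes):
    """Return the size reduction as a percentage of the original size."""
    return (before_bytes - after_bytes) / before_bytes * 100


# Sizes quoted in the PR description.
reduction = percent_reduction(1155682304, 836296704)

# Second Sundays for the two months following the merge.
january_repack = second_sunday(2023, 1)
february_repack = second_sunday(2023, 2)
```

For the sizes quoted above, `reduction` comes out at roughly 27.6, which sits inside the ~10-30% range given in the description.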

As with previous migrations, history queries will keep working during the migration. Logbook will be unavailable until the migration completes.

There is effectively no more conversion overhead when fetching history with the newer APIs.

Commit overhead is also noticeably lower, since the ORM no longer performs complex conversions. This reduces idle CPU usage, and the smaller database reduces disk I/O.

Dev testing TODO:

  • sqlite
  • mysql
  • postgresql

Production testing TODO:

  • sqlite
  • mysql
  • postgresql

Migration testing

  • sqlite
  • mysql
  • postgresql

Performance testing

  • blue
  • rpi3
  • x86
  • mac laptop

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

@home-assistant
Contributor

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (recorder) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of recorder can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Change the title of the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign recorder Removes the current integration label and assignees on the issue, add the integration domain after the command.

Comment thread homeassistant/components/recorder/history.py Outdated
Comment thread homeassistant/components/recorder/db_schema.py Outdated
Comment thread homeassistant/components/recorder/migration.py Outdated
@bdraco
Member Author

bdraco commented Dec 31, 2022

This is working well in testing. Need to build the migration code and add tests next

Comment thread homeassistant/components/recorder/db_schema.py Outdated
@bdraco bdraco changed the title from "Improve performance of fetching history and events from the database" to "Improve performance of fetching and storing history and events from the database" Jan 1, 2023
@bdraco bdraco changed the title from "Improve performance of fetching and storing history and events from the database" to "Improve performance of fetching and storing history and events with the database" Jan 1, 2023
Comment thread homeassistant/components/recorder/history.py
bdraco added a commit to bdraco/data.home-assistant that referenced this pull request Jan 2, 2023
@bdraco bdraco marked this pull request as ready for review January 2, 2023 20:24
@bdraco bdraco requested a review from a team as a code owner January 2, 2023 20:24
Comment thread homeassistant/components/recorder/db_schema.py
@bdraco
Member Author

bdraco commented Jan 2, 2023

I think this is good to go but I want to migrate two or three more test blues that I have lying around to be 💯

Will do that later tonight

  • Houston ble text box
  • Houston rpi test
  • Houston blue test

@bdraco
Member Author

bdraco commented Jan 2, 2023

Thanks

@bdraco
Member Author

bdraco commented Jan 2, 2023

All 3 of the additional test migrations went well.

@bdraco bdraco merged commit b8a1537 into home-assistant:dev Jan 2, 2023
@bdraco bdraco deleted the time_as_float branch January 2, 2023 23:26
@frenck
Member

frenck commented Jan 2, 2023

For the breaking change aimed at developers, would be nice to have a small dev blog on that.

@bdraco
Member Author

bdraco commented Jan 2, 2023

> For the breaking change aimed at developers, would be nice to have a small dev blog on that.

home-assistant/developers.home-assistant#1606

mkmer pushed a commit to mkmer/HomeAssistant_Core that referenced this pull request Jan 3, 2023
@github-actions github-actions Bot locked and limited conversation to collaborators Jan 4, 2023
@bdraco bdraco added the noteworthy Marks a PR as noteworthy and should be in the release notes (in case it normally would not appear) label Jan 22, 2023

Labels

breaking-change cla-signed core integration: recorder noteworthy Marks a PR as noteworthy and should be in the release notes (in case it normally would not appear) Quality Scale: internal
