Remove automatic sqlite vacuuming#11903
Conversation
|
I agree that it's nonsense for every purge. However, if someone (like I did a while ago) discovers that his database has grown to gigabytes in size (because I forgot to turn on purging…) and wants to solve that problem by purging, he won't be able to. The way I see it, vacuuming the database is the only way for SQLite databases to ever shrink? So we should still offer some way of vacuuming. Perhaps we add a flag to the purge service? The user would then have to call that service manually, but it's still better than having to write the SQL by hand. |
|
I kind of expected that reply ;-). I am not fundamentally against a vacuum service. If we want to solve that problem, though, why only for SQLite? Other databases also do not shrink. I tend to think that If the SQLite database does grow too big, I would be happy to recommend simply deleting the database file. Expert users can locate a blog that teaches how to do a vacuum from the command line. |
|
I'm in favor of removing VACUUM. I am a little more hesitant to set a default value for purging. |
|
I agree that having to vacuum manually probably won't be a huge problem for many users, but if we can keep the possibility in, I think it would be nice. How about adding two buttons to the Settings panel in the UI, one saying "Clean up the Database" (and having some explanatory text: "This will remove all events and states from your database that are older than X"), the other one saying "Clean up & shrink the Database"? The second one then calls the purge service with |
|
One thing to consider is that when rows are deleted from Sqlite the space occupied by those rows is only marked deleted. The space that was used by those deleted rows will not be recovered. The database will continue to grow until a vacuum is run. Database performance will continue to degrade until a vacuum is run. In a busy HA instance this will be a continual maintenance issue on small systems like RaspberryPi.
Putting the responsibility of doing database maintenance on the user isn’t usually a good thing IMHO.
Refer to: http://www.sqlitetutorial.net/sqlite-vacuum/ <http://www.sqlitetutorial.net/sqlite-vacuum/>
… On Jan 24, 2018, at 12:55 PM, Anders Melchiorsen ***@***.***> wrote:
Why are we doing VACUUM for SQLite (only)?
The vacuum will rewrite the entire database, requiring double disk space to be available.
I beleive the database is locked during the entire rewrite (which can last for minutes).
Big writes potentially wear down SD cards even quicker than usual.
The heavy operation gives "timer out of sync" errors on my (Raspberry Pi) Hass.io install.
For a normal periodic purge, any recovered space will be filled up with new events very soon anyway.
SQLite databases in particular are likely to run on low-powered devices where this huge operation will hurt the most.
The vacuum was not introduced due to performance issues but has been present since purging was implemented (#1681 <#1681>), likely "because we can". However, vacuum was apparently broken from 02-Jul-2016 (#2377 <#2377>) to 17-Sep-2017 (#9469 <#9469>) so it appears that missing out is not a big deal.
You can view, comment on, or merge this pull request online at:
#11903 <#11903>
Commit Summary
Remove sqlite vacuuming
File Changes
M homeassistant/components/recorder/purge.py <https://github.com/home-assistant/home-assistant/pull/11903/files#diff-0> (11)
Patch Links:
https://github.com/home-assistant/home-assistant/pull/11903.patch <https://github.com/home-assistant/home-assistant/pull/11903.patch>
https://github.com/home-assistant/home-assistant/pull/11903.diff <https://github.com/home-assistant/home-assistant/pull/11903.diff>
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#11903>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AA1W1NwlIw-fG8jYXwT2KnauHAvRWB85ks5tN4rPgaJpZM4Rr1s1>.
|
|
@balloob Can you explain your hesitation? I have not looked much at @terrycarlin I did not find data to back up most of those assertions. I do agree on the point of maintenance responsibility though. |
|
The data to back up my assertions is stated in the first paragraph of the URL I included.
… On Jan 25, 2018, at 11:07 AM, Anders Melchiorsen ***@***.***> wrote:
@balloob <https://github.com/balloob> Can you explain your hesitation? I have not looked much at recorder before but now I find its default settings quite a gotcha that I would like to improve.
@terrycarlin <https://github.com/terrycarlin> I did not find data to back up most of those assertions. I do agree on the point of maintenance responsibility though.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#11903 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AA1W1PanGlHH9GpPaVzWgYnO03CPDX0_ks5tOMLhgaJpZM4Rr1s1>.
|
|
@terrycarlin as @amelchio stated correctly in his initial post, the article you linked says that the space is "marked for future use" - i.e., will be "recovered" as soon as new stuff gets written to the database. The only situation where that is not sufficient is when your database has for some reason grown really huge, beyond what you ever want to store in the database. |
|
So first off: we should make a choice. We should not offer 2 buttons to the user in the UI. If space is marked for future use and thus will be reused, I don't see a big need for a VACUUM. It will sort itself out. Pro-users can indeed VACUUM themselves from the command-line. @amelchio I thought about it more, I'm fine with a 10 day default purge. We should make sure to announce it in the release notes. I wonder if instating a default purge is a reason for a VACUUM though ;-) |
|
Why not start purging after a certain size? |
|
Because we can't know what a "big" size is. We don't know how big the host machine is or what DB system is in use. I think that it would be a good idea to see if we can show the purge settings on the history page in the frontend. |
|
@balloob this is some problems yes! They are solveable however! I do not like the history component at all, prometheus or influx can handle this data 100 times better. |
|
what I mean to say is that maybee its time to rebuild history to use a good statistic database? |
|
no toes intended to walk on ;) |
|
SQLite is part of Python and thus 0 dependencies. Only way we can offer history out of the box. I know it's not perfect. Anyway, this is discussion for another issue. Let's stick on topic. |
|
So how about this schedule (I can make the required PRs):
|
|
👍 sounds good to me. |
|
Okay, I will close this PR while waiting for its turn. |
Why are we doing VACUUM for SQLite (only)?
The vacuum was not introduced due to performance issues but has been present since purging was implemented (#1681), likely "because we can". However, vacuum was apparently broken from 02-Jul-2016 (#2377) to 17-Sep-2017 (#9469) so it appears that missing out is not a big deal.