Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make EXTREMELY CLEAR that the Auto-remove options take priority over all the "Keep one snapshot for/per" options / Rename "Smart remove" #1679

Open
TiagoTiago opened this issue Apr 6, 2024 · 26 comments
Labels
Cosmetics appearance, icons, themes Discussion decision or consensus needed Documentation GOOD FIRST ISSUE Used by 24pullrequests.com to suggest issues Medium

Comments

@TiagoTiago
Copy link

I assumed I could always count on the older snapshots sticking around. Then yesterday I fucked up and mount --bind a huge folder from another drive inside one of the folders that gets regularly backuped, and today I found out I just got the single backup made yesterday, no versioning for anything left :(

I hope nothing important got wiped; but I do worry since there have been times I only discovered some days or even weeks later that I something got messed up and I managed to get it restored from the version from a few snapshots ago....

ps: I am already beating myself for not keeping up with 3-2-1, no need to pour any more salt on that wound

@TiagoTiago
Copy link
Author

Or even better, let the user choose whether any of the "one per" is sacred and survives even running out of space.

@TiagoTiago
Copy link
Author

TiagoTiago commented Apr 6, 2024

Ugh, and while it was creating today's backup with that folder excluded, it eliminated yesterday's, I guess because it checked whether should erase whole snapshots before noticing the option of just erasing the excluded folder from all snapshots... *sighs*

edit: Hm, wait, where is that option? I could swear it was there before.... Maybe it was a manually added option in the Expert tab got lost on on upgrade or let it slip when migrating to a new hardware years ago and never noticed? O r am I just not looking in the right place? Or is it expected to do that by default and it wouldn't show as an option?

edit2: Ok, I'm not insane, --delete-excluded is a real thing and it does look like it's present in the parameters BiT is passing to rsync.... Now the question is; with the versioning trickery BiT does, does that option do anything?

@buhtz
Copy link
Member

buhtz commented Apr 7, 2024

Dear Tiago,
Thanks for this report. You have to help me a bit with understanding.

Please try to explain the problem in more details and steps. I did not understand your problem description.

Also please describe the situation and steps that triggered that problem.

And please provide some more basic information. Run backintime --diagnostics and post the complete output here. If this command do not work on your system with your version of BIT then please give us the name and version of your operating system, the version of BIT and where you installed BIT from.

Best,
Christian

@TiagoTiago
Copy link
Author

Dear Tiago, Thanks for this report. You have to help me a bit with understanding.

Please try to explain the problem in more details and steps. I did not understand your problem description.

Also please describe the situation and steps that triggered that problem.

And please provide some more basic information. Run backintime --diagnostics and post the complete output here. If this command do not work on your system with your version of BIT then please give us the name and version of your operating system, the version of BIT and where you installed BIT from.

Best, Christian

It's two problems:

  • The original problem that led to this report:

    • There exists settings that say it will "keep one snapshot per" some time unit (per day, per week etc), for the most recent N of those units, under Smart remove. That gives the user the impression that when those settings are used, at least one for each of those is essentially set in stone and won't be removed ever. But, I accidentally added a very big folder as a subfolder to one of the folder being backed up and it blew up the free-space limit, which apparently ignores the "Smart remove" options and does delete EVERYTHING.
  • And the other one is the problem I discovered while I was making a new backup after the disaster mentioned above:

    • Here, I had expected that, the command-line parameter --delete-excluded that is used when running rsync (and I can see it is indeed being used when BiT runs it), would mean that when I exclude a folder (in this case, the big-ass one I had added as a subfolder of one of the included folders) that was present in previous snapshots, it would retroactively be removed from all snapshots since it's now excluded. But it seems, from the way things turned out, once again the free-space limit caused BiT to freak out and delete ALL previous snapshots, instead of first deleting that newly excluded folder. Had it behaved the way I expected in regards to --delete-excluded, and that newly excluded folder would get deleted from all snapshots, there would've once again been enough free space and the snapshot that previously had the big-ass folder would not have been deleted whole.
$ backintime --diagnostics
usage: backintime [-h] [--config PATH] [--share-path PATH] [--debug]
                  [--profile NAME | --profile-id ID] [--quiet] [--version] [--license]
                  {backup,backup-job,benchmark-cipher,check-config,decode,last-snapshot,last-snapshot-path,pw-cache,remove,remove-and-do-not-ask-again,restore,shutdown,smart-remove,snapshots-list,snapshots-list-path,snapshots-path,unmount}
                  ...
backintime: error: Unknown Argument(s): --diagnostics

$ backintime --version
backintime 1.2.1

I don't remember where I got it from, had it setup starting many years ago. sudo apt install backintime-common backintime-qt I would guess (or maybe thru Synaptic since I like using GUIs; it pretty much runs the same command in the background from what I understand).

And at the moment I'm running Linux Mint 20.3, Kernel 5.15.0-101-generic.

@buhtz
Copy link
Member

buhtz commented Apr 8, 2024

Thanks a lot for the details. One further question. You wrote: "I accidentally added a very big folder as a subfolder to one of the folder being backed up and it blew up the free-space limit".

This is about the bind-mount you mentioned in your initial post, right? Do you mean that you added this folder to your backup source or to your backup destination?

@buhtz buhtz self-assigned this Apr 8, 2024
@TiagoTiago
Copy link
Author

Thanks a lot for the details. One further question. You wrote: "I accidentally added a very big folder as a subfolder to one of the folder being backed up and it blew up the free-space limit".

This is about the bind-mount you mentioned in your initial post, right? Do you mean that you added this folder to your backup source or to your backup destination?

On the source side. From what I understand, most software will just see that as the same thing as the folder and it's contents actually being there.

On a somewhat unrelated note, I found the bind trick when looking for a work-around for git not handling symlinks well, and I need one single file inside that folder to be watched by git, but the rest of the folder has way too much data so it's added to .gitignore with that single file as exception, and the actual data lives on another drive which is where I put all the big things I generally don't need backups of.

@buhtz
Copy link
Member

buhtz commented Apr 9, 2024

Hello TiagoTiago,
thanks for your patience. Please allow me to rephrase your problem to see if I clearly understood all details. So please correct me if I am wrong.

Problem A: Free space rule over Smart remove rules

  1. A new folder appeared in your backup source. That folder was big and need much storage volume.
  2. Because of that folder the next backup/snapshot become also big and used to much storage volume on the backup destination.
  3. This triggered the Auto-remove rule "If free space is less than XXX".
  4. Old snapshots where removed because of that free-space rule.
  5. "Smart remove" rules where ignored in this case. So some snapshots are removed that you would expect not to be removed.

Am I right so far?

Problem B: Excluded folders not deleted in previous snapshots

The second problem is about --delete-excluded rsync switch.

  1. You added the accidentally added big folder to the Exclude TAB in the BIT GUI.
  2. The next snapshot did not included this folder anymore? Am right?
  3. But you expected that his folder would be removed from previous snapshots, too?

@TiagoTiago
Copy link
Author

Your summary of Problem A seems to cover it well.

As for Problem B:

  1. Correct
  2. Correct
  3. Yes, but also since it would have been removed, there would not be need to delete the whole first post-disaster snapshot to take care of the free space issue , and I would at that point have the first, and the second post-disaster snapshots, and not just the second one.

So Problem B involved a manifestation of Problem A triggered by not removing the now excluded big folder from the previous snapshot and just nuking the whole thing.

And now that I'm thinking about it; maybe I might not even have reached the point of Problem A getting triggered the first time if the --delete-excluded setting had been working as expected, as over time I had been adding more and more unnecessary folders to the exclusion list and just assumed they would be gone from all snapshots, but maybe they were still there taking up space. But yeah, I still consider a problem that I was mislead about those Smart Remove snapshots being kept safe from automatic removal; so indeed these are two separate problems that happened to interact in this situation, but would still be problems by themselves without needing each other.

@buhtz
Copy link
Member

buhtz commented Apr 9, 2024

About B

Still not sure if I understand everything needed about Problem B. My argument would be that old snapshots never modified (except they are completely removed). It does not matter for previous and still existing snapshots what is in the include and exclude list. IMHO it would be a desaster for other users if new Exclude folders will remove this folder in previous snapshots.

About A

If you have no other suggestion I would say this is a documentation issue. The behavior is expected. But the GUI doesn't make it clear enough that the space-rule has higher priority than the smart-remove rules.

From your users perspective do you have any suggestions how the snapshots dialog could be improved to make it clear enough? A simple text box is a bit to lazy.

@buhtz buhtz added Documentation Discussion decision or consensus needed Feedback needs user response, may be closed after timeout without a response Cosmetics appearance, icons, themes GOOD FIRST ISSUE Used by 24pullrequests.com to suggest issues labels Apr 9, 2024
@buhtz buhtz added this to the Upcoming release (1.5.0) milestone Apr 9, 2024
@TiagoTiago
Copy link
Author

TiagoTiago commented Apr 9, 2024

I can see how some people might prefer fresh snapshots over preserving history; so a solution for problem A would likely involve letting the user choose what has more priority, and maybe producing some warning when there is nothing left that is allowed to be deleted while there is still not enough free space left.

As for problem B, I don't feel it's likely people would wanna keep excluded folders in old snapshots when --delete-excluded is active; but maybe there could be some reasoning I haven't thought of and might be wise to get more people's opinions on whether this should just be fixed in the form of deleting even retroactively, or provide the option to just exclude things on snapshots made after the addition to the exclude list.

@buhtz
Copy link
Member

buhtz commented Apr 9, 2024

A would likely involve letting the user choose what has more priority

This would increase complexity of BIT. But why not just disable the space rule? Having not much space left is a rare case for most users.

On the other hand there could be an option/behavior that BIT shouldn't start a new snapshot but warn if there isn't enough space left instead of removing old snapshots. We have an open issue (#569) about calculating the remaining space and showing it in the GUI. Also relevant #374.

Also I found an old and still open Issue related to Problem A: #711

Can you suggest how the dialog should look like to indicate the real behavior?

image

I don't like to much text in a dialog but can't imagine something else:
image

@TiagoTiago
Copy link
Author

TiagoTiago commented Apr 10, 2024

I had assumed the behavior was

  1. All snapshots usually stay in place
  2. But when one of the Auto-Remove thresholds gets hit, it would go thru all the fully non-protected backups first, from old to new
    2.1. And then, if necessary, for the snapshots that at some point were spared by the Smart Remove exceptions, it would check for any that fell out of all the active exceptions before deciding if it should be removed.
    2.2. And then if it can't find anything that is allowed to be deleted, trigger the behavior the software already does when the drive doesn't have enough space because of data not under the control of BiT.

That should at least be something the user should be allowed to decide.

Maybe add a second checkbox saying something along the lines of "Prioritize Smart Preservation over the other rules."

Or if it turns out it's so dumb raw that Smart Remove actually just never gets used at all when the other options are enabled, not even leaving those exceptioned snapshots to be deleted last; then just gray it out altogether when any of the options that kills it is enabled. But again, for me it really makes sense that the options should all work together in harmony, with at the very least, an option to make "Keep" not just a "try to" thing, but actually a trustworthy promise.


Slapping a warning that Smart Remove is superfluous, or doesn't have the last word; seems like a hacky solution, and it's basically just an admission that the bug exist and "it's not a bug, it's a feature". The original suggestion of just clarifying the bad behavior was a heat of the moment thing when I was still immediately pissed off the app deleted stuff it said it would keep. I really think this needs to be fixed and not just acknowledged (but until fixed, people should be made aware it won't really keep the stuff it says it will keep, maybe change the wording to "Try to keep*" and with the asterisk (or the whole trio of words plus the asterisk) in a highlighted color to let the user know it's not just text, explaining on hover what's the order of priorities when performing deletions, and maybe opening a help page when clicked that goes into more wordy details).

Also, the presence of "Don't remove named snapshots" at the bottom seems to reinforce the idea that stuff bellow overrides stuff above. I'm not sure how to address that; maybe could be in a second column, on the same level as the first options? But again, that's just if you're gonna just slap a bandaid on it and not make things better. Or even that option also gets killed by the stuff above and also needs clarification? I don't think I had any named backups so dunno how it goes; I hope it's not the same dumb behavior, cause it sounds like something that would also really upset people if it turns out it's false promise as well.

@buhtz
Copy link
Member

buhtz commented Apr 10, 2024

Thanks for your thoughts and insights. My own opinion is not a final decision. We need to discuss it in the team (here in this discussion thread).

I don't see how to change the behavior. The point is that the smart remove rules are processed after each backup. The consequence is that all existing snapshots are "protected" by the smart remove rules.
When the less-space-rule is triggered there is no room anymore to check which of the existing snapshots could be deleted and which not. All existing snapshots are there because of a smart-remove rule. This means if BIT asks: Can I delete this snapshot or would this violate one of the smart-remove rules? The answer would always be "No".
Because of that a less-space-rule wouldn't make sense because there is no snapshot that could be deleted.

Maybe I don't see all details currently.

About the named snapshots. I have to test it but to my knowledge of the code they are protected against all other rules. One of the bigger problems in BIT is that we do not have clear unit test for the rules and their combinations.

Yes. This code indicates that named snapshots are deleted from the list of snapshots to delete. 😆 In short: Named snapshots are always protected.

backintime/common/snapshots.py

Lines 1902 to 1905 in d33f35e

if self.config.dontRemoveNamedSnapshots():
if snapshots[0].name:
del snapshots[0]
continue

@TiagoTiago
Copy link
Author

So Smart Remove already removes things preemptively, and there is never anything left that fell out of the exclusions? Would it be possible to switch things around, and make it a filter for the Auto-Remove, making the Auto-Remove smart, as the current layout seems to imply; basically any snapshot protected by the Keep rules gets removed from the list of candidates the way it happens with Named Snapshots?

Maybe keep the "Smart" word, but next to it, a drop-down that lets you select between "Remove" and "Preserve"; with Smart-Remove doing this preemptive deletion that is the current behavior, and "Smart Preserve" acting as this filter behavior I expected where it protects snapshots from deletion when the deletion functions get triggered and doesn't by itself perform any deletion?

@TiagoTiago
Copy link
Author

Though, with the dropdown approach; the "Keep" words there would still be somewhat misleading while in the Remove mode for anyone that don't look into it. Hm...

Maybe forget about "Smart", and just have a drop-down that selects either "Instant Remove All Except:" and "Preserve:". And don't start the phrases with "Keep", so just "all snapshots for the last N days" and "one snapshot per *", so the the phrases make sense in both modes ?

@buhtz
Copy link
Member

buhtz commented Apr 15, 2024

Btw: In #1107 there was also a discussion about re-design the dialog.

@buhtz buhtz added the Medium label Apr 15, 2024
@daveTheOldCoder
Copy link

But why not just disable the space rule?

That's my preference.

If BiT or another application runs out of disk space, I want it to issue an error message and stop. Then I'll figure out what to do.

@ceperman
Copy link

Btw: In #1107 there was also a discussion about re-design the dialog.

In that issue & my description of how removal/retention works, I did point out that removal on the basis of space takes place after smart remove happens and not in the order that the options appear in the dialog. But even then I don't think I'd fully appreciated the undesirable consequences of running out of space.

I don't think you should disable the "space" rule, there may be someone who wants it or uses it. I'd suggest either separating the smart from the non-smart options and make it very clear they operate independently, or even make them mutually exclusive. Then do what @daveTheOldCoder says: if BiT runs out of space and the space rule is not set, then stop with an error.

@buhtz
Copy link
Member

buhtz commented Apr 16, 2024

Thanks. I will take that into account.
A question about the wording. Myself I also do not like the term "Smart". But English is not my native language. Can you suggest a better and more realistic term?

@daveTheOldCoder
Copy link

daveTheOldCoder commented Apr 16, 2024

Can you suggest a better and more realistic term?

Some possible replacements for "Smart remove":
Retention
Retention policies
Retention rules

@ceperman
Copy link

Can you suggest a better and more realistic term?

Some possible replacements for "Smart remove": Retention Retention policies Retention rules

Or perhaps "Retention conditions". They are definitely conditions for retention, not removal.

@buhtz buhtz changed the title Make EXTREMELY CLEAR that the Auto-remove options take priority over all the "Keep one snapshot for/per" options Make EXTREMELY CLEAR that the Auto-remove options take priority over all the "Keep one snapshot for/per" options / Rename "Smart remove" Jun 17, 2024
@buhtz buhtz removed the Feedback needs user response, may be closed after timeout without a response label Jun 17, 2024
@buhtz buhtz removed their assignment Jun 17, 2024
buhtz added a commit that referenced this issue Jul 6, 2024
…sions (#1743, #1751)

The global flock context manager (introduced in #1679) now fall back to single-user mode if permissions are insufficient to create an flock file accessible by other users and root. The multi-user mode will be activate later and automatically when BIT is started as root for the first time. It will create an flock file accessible by all users.

That behavior is relevant for Arch-based systems but not relevant on Debian-based systems because regular users have enough permissions to /run/lock.

Additionally fix a bug (introduced with #1679) ignoring the config value of `global.use_flock`.

Additionally Sphinx config is modified and improved..

Fix #1743 
Fix #1751
@emtiu
Copy link
Member

emtiu commented Jul 15, 2024

As for "Problem B": Our dialogs are already quite full of hints and warnings, but if I'm not mistaken, your problem would have been averted with a hint in the Exclude tab of the Settings that says something like: Excluding files and folders only affects future snapshots. For past snapshots, files and folders can be removed from the file viewer.

@DerekVeit
Copy link
Contributor

I think it would help to replace the text "Smart remove" with something like "Remove all previous snapshots except". This would make it clear that the checkbox is enabling another reason for removal, with its own exceptions, rather than just adding the exceptions.

@buhtz
Copy link
Member

buhtz commented Dec 18, 2024

Might not be directly relevant to this issue. But I investigated the current auto-/smart-remove behavior and documented it. I found inconsistent behavior and minor bugs. There is a lot potential to improve and refactoring.

Before starting with this I would like to request your opinions and suggestions about the next steps. Please see the issue #1945 for details and a mockup for the new auto-remove dialog-tab.

Feedback and suggestions welcome. We appreciate your feedback.

@daveTheOldCoder
Copy link

I suggest adding an option to include this in debug/log output:

  1. List of backups considered for removal/keeping.
  2. For each backup in the list, if it's removed, indicate the reason (the rule that was applied). Similarly, if it's kept, indicate the reason.

I added simple debugging code to do that when I was investigating #1094, and could provide that.

@buhtz
Copy link
Member

buhtz commented Dec 18, 2024

Yes, I also had in mind to somehow illustrate (in the GUI) the process and the reasons of actions (remove/keep) taken. This way we could offer a dry-run for this, too.

This is a lot of extra work. And I am not sure about the benefit. If the GUI and its documentation is simple and clear enough (in a prefect world) we wouldn't need such an illustration.

But I will keep it mind while refactoring the code, so we have better opportunities to add a feature like this in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cosmetics appearance, icons, themes Discussion decision or consensus needed Documentation GOOD FIRST ISSUE Used by 24pullrequests.com to suggest issues Medium
Projects
None yet
Development

No branches or pull requests

6 participants