"synth prepare-system" sometimes empties the ccache directory #199

Closed
styrsven opened this issue Aug 9, 2021 · 13 comments

Comments

@styrsven

styrsven commented Aug 9, 2021

No description provided.

@styrsven
Author

styrsven commented Aug 9, 2021

When I use "synth prepare-system" to update my packages on FreeBSD, synth sometimes empties the ccache directory, including the config file.
I verified with htop (unfortunately I have no screenshot) that it is synth that performs an "rm -rf" on the ccache directory.
My main zfs pool is a pair of SSD disks, and my home and ccache directories are located in a separate pool on spinning disks.
The ccache directory has a legacy mountpoint and is mounted in fstab.
Any help/workaround is appreciated.

@jrmarino
Owner

jrmarino commented Aug 9, 2021

hmm, that doesn't sound right. deleting ccache files obviously defeats the purpose of ccache. I will do a quick check on the code to see if that is even possible.
Is it possible there's a misconfiguration and the ccache directory is assigned to the wrong configuration item?

@styrsven
Author

styrsven commented Aug 9, 2021

My synth.ini contains
Directory_ccache= /var/tmp/ccache
and my fstab contains
zhome/ccache /var/tmp/ccache zfs rw,late 0 0

@jrmarino
Owner

jrmarino commented Aug 9, 2021

What could be happening (this is just speculation without checking the code) is that the ccache mount is failing to unmount, so the later rm -rf on the builder's root descends into it, and because ccache is mounted read/write, its contents get deleted. The same thing could happen to distfiles if that mount fails to unmount.
I'm not sure how to test that theory. Synth should be able to detect an umount failure and maybe skip the rm -rf command in that case (again, speculation that this is what's happening, it's been years since I looked at the code).
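
A minimal shell sketch of the guard suggested above: unmount first, and run the recursive delete only if every unmount succeeded. This is not Synth's code (Synth is written in Ada); the function name `teardown_root` and the overridable `UMOUNT` variable are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only, not Synth's implementation.
# UMOUNT can be overridden so the logic can be exercised without root.
UMOUNT="${UMOUNT:-/sbin/umount}"

# teardown_root ROOT SUBDIR...
# Unmount each ROOT/SUBDIR, then delete ROOT only if every unmount succeeded.
teardown_root() {
    root=$1
    [ -n "$root" ] || return 2      # refuse to act on an empty path
    shift
    for sub in "$@"; do
        if ! "$UMOUNT" "$root/$sub"; then
            echo "umount failed for $root/$sub; not removing $root" >&2
            return 1
        fi
    done
    rm -rf -- "$root"
}
```

With this shape, a busy ccache or distfiles mount leaves the builder directory in place for a later retry instead of letting the delete descend into a live read/write mount.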

@styrsven styrsven changed the title from "synth" to "synth prepare-system" sometimes empties the ccache directory Aug 9, 2021
@styrsven
Author

That sounds like a reasonable cause for this issue. I might add that I run my system as a desktop and turn it off at night. In the morning when I start it, anacron starts whatever should have run during the night. So anything that scans the file systems, like updatedb, might be accessing ccache through synth's mount points when the umount happens, which would probably make the umount fail.
Now, I don't know how synth works internally, but would setting the number of builders to one and changing a terminal shell's directory to the ccache directory via the builder's path be enough to provoke this?
I was also thinking that 'umount -f' on ccache (and distfiles) might be a possible fix.
I think I will try the "provoking" method above and report back how it goes.

@styrsven
Author

It happened again, so I have some observations.
Extract from pstree output:

 |-+- 57779 sven /usr/local/bin/xfce4-terminal
 | |-+= 58012 sven -zsh (zsh)
 | | \-+= 82392 root synth prepare-system
 | |   \--- 82278 root /bin/rm -rf /usr/obj/synth-live/SL09

I have this in my synth.ini:

Number_of_builders= 4
Max_jobs_per_builder= 24
Tmpfs_workdir= true
Tmpfs_localbase= true

I have 4 builders, and from observation the building seems to take place in SL01 - SL04, so I don't know what SL09 does. I previously had more builders but reduced the number some time ago, if that matters.

Also the rm takes place before any building starts.

@jrmarino
Owner

hmm, I was assuming you weren't using tmpfs based on this behavior. Looking at the code ...

@jrmarino
Owner

FYI, SL09 is the mount area for single builds (as opposed to a bulk build). The number 9 is not relevant.
So looking at the code:
If a null mount was mounted RW (as is done for distfiles and packages) and it fails to umount, later "rm -rf" could be run on it. It is probably a zfs issue, but synth definitely could be improved to avoid doing this.
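
One way to avoid that, sketched here in shell under stated assumptions, is to check the mount table for anything still mounted beneath the builder directory before deleting it. The helper name `mounts_beneath` is hypothetical; it reads fstab-style lines (device, mount point, type, ...) such as those printed by FreeBSD's `mount -p`, and it assumes mount points contain no whitespace.

```shell
#!/bin/sh
# Sketch only, not Synth's implementation.
# mounts_beneath ROOT
# Reads fstab-style lines on standard input and prints every mount point
# that is ROOT itself or lies beneath ROOT.
mounts_beneath() {
    awk -v root="${1%/}" '
        $2 == root || index($2, root "/") == 1 { print $2 }
    '
}
```

A caller could run `mount -p | mounts_beneath /usr/obj/synth-live/SL09` and skip the "rm -rf" whenever the output is non-empty.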

@jrmarino
Owner

jrmarino commented Aug 18, 2021

There's a log in the logs directory named 05_abnormal_command_output.log that might confirm that umounts are failing.
It's rewritten at the beginning of each new build though.

@jrmarino
Owner

If you want to rebuild synth with a patch, maybe this will help. Create files/patch-umount with the following contents:

--- src/replicant.adb
+++ src/replicant.adb
@@ -254,7 +254,7 @@ package body Replicant is
    ---------------
    procedure unmount (device_or_node : String)
    is
-      bsd_command : constant String := "/sbin/umount " & device_or_node;
+      bsd_command : constant String := "/sbin/umount -f " & device_or_node;
       sol_command : constant String := "/usr/sbin/umount " & device_or_node;
       lin_command : constant String := "/usr/bin/umount " & device_or_node;
    begin

@styrsven
Author

> There's a log in the logs directory named 05_abnormal_command_output.log that might confirm that umounts are failing.
> It's rewritten at the beginning of each new build though.

This one was tricky to catch, since SL09 is used before the actual building starts.
I have been able to reproduce the issue by starting "synth prepare-system", waiting until SL09 mounts, and then running cd in a shell to "/usr/obj/synth-live/SL09/ccache".
This makes the umount fail and clears my ccache.
And by using
gnu-watch -d "cat /var/synth/www/log/05_abnormal_command_output.log" I can see the error text.
This log file is cleared when synth starts to build in SL01-04 (my config), so the error text is only present temporarily and is not preserved.

Thanks for the patch, I will test it ASAP and report back.
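
Because the log is cleared once the builders start, the error text can be kept by copying the file whenever it is non-empty and has changed. A minimal sketch, assuming a hypothetical helper named `snapshot_log` and a destination directory of your choosing; the log path is the one quoted above.

```shell
#!/bin/sh
# Sketch only: preserve a log that gets truncated later.
# snapshot_log LOG DESTDIR
# Copies LOG into DESTDIR under a timestamped name when LOG is non-empty
# and differs from the previous snapshot (kept as DESTDIR/latest).
snapshot_log() {
    log=$1
    dest=$2
    [ -s "$log" ] || return 0            # empty or missing: nothing to keep
    mkdir -p "$dest" || return 1
    latest="$dest/latest"
    # Skip the copy when the content has not changed since the last snapshot.
    if [ -f "$latest" ] && cmp -s "$log" "$latest"; then
        return 0
    fi
    cp "$log" "$latest" &&
        cp "$latest" "$dest/$(basename "$log").$(date +%Y%m%d-%H%M%S)"
}
```

It could be polled while synth runs, for example `while :; do snapshot_log /var/synth/www/log/05_abnormal_command_output.log /var/tmp/synth-log-snapshots; sleep 1; done`. Timestamps have one-second resolution, so two changes within the same second overwrite each other.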

@styrsven
Author

styrsven commented Sep 7, 2021

I have been running synth daily (as an ex-Gentooer I just can't help myself, I have a severe case of 'upgraditis') with the patch applied since 22 August, i.e. 17 days without this issue happening, and it used to be more frequent than that. I have also not noticed any side effects.

@jrmarino
Owner

I pushed the patch:
28cb64d
