Make it fast and easy to do multiple replacements #13

twpayne · 2019-02-01T05:16:02Z

chezmoi uses renameio to ensure that files in the user's home directory are replaced atomically.

renameio.TempDir takes the name of the directory where the file replacement will take place so that it can determine a suitable location for the interim file with which the final destination file will be replaced.

chezmoi calls renameio.TempDir once based on the destination directory, but this implicitly assumes that the user's home directory is on a single filesystem. If the user's home directory spans multiple filesystems then different temporary directories will be needed. If the user's home directory is on a single filesystem then the same temporary directory can be re-used.

The safe path is to call renameio.TempDir for every file replacement, but this is very slow (chezmoi might change thousands of files, there's no need to create thousands of temporary directories when only one is needed).

I think renameio needs a function like:

// MaybeNewTempDir returns either one of existingTempDirs if a member (any of
// the keys) can be used as a temporary directory for replacing dest, or a new
// temporary directory otherwise. It is the caller's responsibility to add the
// return value to future calls to MaybeNewTempDir if needed, i.e.
// MaybeNewTempDir does not modify existingTempDirs.
func MaybeNewTempDir(dest string, existingTempDirs map[string]struct{}) string {
  // ...
}

The text was updated successfully, but these errors were encountered:

stapelberg · 2019-02-01T14:59:25Z

My first reaction is that this probably doesn’t belong into the renameio package, but of course we can revisit that later.

As for how to solve the problem, I’d suggest to get the list of mounts (/proc/self/mountinfo on Linux, not sure if there is a package which provides this info in a portable way) and call renameio.TempDir once for all mounts underneath your target directories.

What do you think?

twpayne · 2019-02-02T23:34:23Z

Hmmm, I think this does fit in the renameio package because it's quite strongly coupled to how renameio works under-the-hood. A different package to renameio could provide the same functionality of atomic file updates but using a different mechanism which would then require a different implementation of MaybeNewTempDir. Therefore, for me, MaybeNewTempDir should be part of renameio.

I don't think it's necessary to parse the mount table. Another way to do it would be to stat(2) the destination and look at the st_dev field and only use a new temporary directory if the value has not been seen before. I'll see if I can rustle something up.

stapelberg · 2019-02-03T11:17:24Z

A different package to renameio could provide the same functionality of atomic file updates

I actually don’t see how? AFAIK, the only portable way across unices is what renameio implements. Other packages (which I have surveyed in golang/go#22397 (comment)) use the same approach.

The implementation you’re suggesting in #15 does not seem to require access to any internals of the renameio package. Hence, I’d prefer it if we could keep it in a separate import path (at least) until we know that it’s useful for a number of people.

I don't think it's necessary to parse the mount table. Another way to do it would be to stat(2) the destination and look at the st_dev field and only use a new temporary directory if the value has not been seen before.

That’s true, but parsing the mount table can be done once at startup, whereas using stat requires one additional system call per file. This might not matter much for the specific use-case you’re interested in, but it does matter for the use-case for which renameio was written (updating https://manpages.debian.org/).

twpayne · 2019-02-03T18:04:37Z

That’s true, but parsing the mount table can be done once at startup, whereas using stat requires one additional system call per file. This might not matter much for the specific use-case you’re interested in, but it does matter for the use-case for which renameio was written (updating manpages.debian.org).

It's only necessary to call stat once per destination directory, not once per file, which is a significant difference. Of course it's up to the code using renameio as to whether it uses the cache or not, so updating manpages.debian.org would be unaffected (unless you wanted it to be).

A problem with parsing the mount table is that you then have to work out the relationship between the destination directory and the mount points. This is a hard problem and simple string prefix matching is definitely insufficient (see the discussion in golang/go#18358).

stapelberg · 2019-02-03T18:25:07Z

A problem with parsing the mount table is that you then have to work out the relationship between the destination directory and the mount points. This is a hard problem and simple string prefix matching is definitely insufficient

That’s a fair point. I suppose the stat-based cache is a good solution. Can you make use of it in chezmoi and we’ll move it to renameio once other people need it, too?

twpayne · 2019-02-03T18:26:17Z

Yup, sounds good.

twpayne · 2019-02-04T08:10:43Z

FWIW, twpayne/chezmoi#208 should fix this in chezmoi.

twpayne mentioned this issue Feb 1, 2019

Handle home directories that span multiple filesystems twpayne/chezmoi#197

Closed

twpayne mentioned this issue Feb 3, 2019

Add TempDirCache to cache TempDir results, fixes #13 #15

Closed

twpayne closed this as completed Feb 3, 2019

twpayne mentioned this issue Feb 4, 2019

Use renameio correctly across multiple filesystems, fixes #197 twpayne/chezmoi#208

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make it fast and easy to do multiple replacements #13

Make it fast and easy to do multiple replacements #13

twpayne commented Feb 1, 2019

stapelberg commented Feb 1, 2019

twpayne commented Feb 2, 2019

stapelberg commented Feb 3, 2019

twpayne commented Feb 3, 2019 •

edited

Loading

stapelberg commented Feb 3, 2019

twpayne commented Feb 3, 2019

twpayne commented Feb 4, 2019

Make it fast and easy to do multiple replacements #13

Make it fast and easy to do multiple replacements #13

Comments

twpayne commented Feb 1, 2019

stapelberg commented Feb 1, 2019

twpayne commented Feb 2, 2019

stapelberg commented Feb 3, 2019

twpayne commented Feb 3, 2019 • edited Loading

stapelberg commented Feb 3, 2019

twpayne commented Feb 3, 2019

twpayne commented Feb 4, 2019

twpayne commented Feb 3, 2019 •

edited

Loading