Feature Request: --include / --include-from #235

Closed
brain-freeze opened this issue May 13, 2018 · 18 comments

Comments

@brain-freeze

Hey,
I switched to gocryptfs from encfs and it works great so far. However, the performance over sshfs is much worse than it was with encfs (and #35 is still a problem). So I thought reverse mode may be the better idea for backup use cases anyway. The problem here is a missing --include/--include-from option that works just like the one in rsync. This would allow creating an encrypted view of selected folders instead of an entire directory.

@rfjakob
Owner

rfjakob commented May 13, 2018

Hi! Thanks for the report - what version do you run and how much worse is "much worse"?

@brain-freeze
Author

I don't have any exact numbers, but I have the impression that it's related to directories with many small files. I will do some testing with this and report back.

@brain-freeze
Author

The version was 1.4.3, but I couldn't reproduce any performance shortcomings compared to encfs. I probably just got that impression after replacing encfs; there isn't any real issue here.
Anyway, it would be great to get rid of the sshfs layer, because it definitely slows down the data transfer rate.

@brain-freeze
Author

I recently switched to reverse mode for better performance with remote syncing. An --exclude-from option would come in very handy here, too. It could save some time by not syncing cache folders or VMs that I don't need backed up.

@charles-dyfis-net
Contributor

Might I suggest bind mounts to pass the directories you want to exclude straight through from the underlying storage?

@brain-freeze
Author

How would you do an exclude with bind mount? I'm using https://github.com/gburca/rofs-filtered for now.

@charles-dyfis-net
Contributor

mount --bind /unencrypted/storage/thing-that-does-not-need-encryption \
             /encrypted/mountpoint/thing-that-does-not-need-encryption

@rfjakob
Owner

rfjakob commented Aug 5, 2018

@charles-dyfis-net: interesting idea!

I think brain-freeze wanted to not copy files at all, not only not encrypt them. I guess you could bind mount an empty folder over the folder you want to exclude?

@brain-freeze
Author

The --include-from/--exclude-from feature of rsync is crucial for my backup purposes. With reverse mode of gocryptfs it's not possible to use this anymore, because the file/folder names are encrypted as well. The only ways around this that work with both folders and files are (afaik):

  • Don't use reverse mode, but regular mode with underlying sshfs. This approach has really bad performance with my internet connection.
  • Filtering on the file system layer before data gets encrypted. I recently discovered rofs-filtered, which provides exactly this feature for a read-only view of a folder. For now it looks like it's doing the job as expected, so that is a viable workaround. However, it would be great to have this directly in gocryptfs at some point.

@rfjakob
Owner

rfjakob commented Aug 5, 2018

Option number 3 is to get the encrypted file names by searching for the inode number (find -inum) or with https://github.com/rfjakob/gocryptfs/blob/master/contrib/ctlsock-encrypt.bash

But I agree that --exclude would be much more user friendly. I'll see how this could be implemented. Problem is that the stdlib does not support passing multiple --exclude arguments. I would have to use another argument parsing library.
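A minimal sketch of the find -inum approach, assuming (as the suggestion relies on) that the reverse mount reports the same inode numbers as the plaintext tree. find_by_inode is a hypothetical helper name, not part of gocryptfs, and the paths in the usage note are only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: find_by_inode PLAIN_PATH SEARCH_ROOT
# Prints every path under SEARCH_ROOT whose inode number equals that of
# PLAIN_PATH. Assumption: SEARCH_ROOT mirrors the inode numbers of the
# plaintext tree.
find_by_inode() {
    plain_path=$1
    search_root=$2
    # "ls -di" prints "<inode> <name>"; keep only the inode number.
    inode=$(ls -di -- "$plain_path" | awk '{print $1}')
    [ -n "$inode" ] || return 1
    find "$search_root" -inum "$inode"
}
```

Usage would look like find_by_inode /home/user/Movies /mnt/user.encrypted, with the printed encrypted name then fed to rsync's own exclude list.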

@rfjakob
Owner

rfjakob commented Aug 5, 2018

PS: This is how the backintime backup tool handles exclusions with encfs reverse mode.

@rfjakob
Owner

rfjakob commented Aug 5, 2018

PPS: looks like cobra is the cli library of choice nowadays.

@brain-freeze
Author

Personally I think that --exclude-from, where you can specify a file containing all paths to exclude, is a bit more useful anyways. If you want to exclude stuff, it's likely that you want to exclude more than one folder/file. If you do this with multiple --exclude arguments you end up with a messy command line pretty fast.

rfjakob added a commit that referenced this issue Aug 11, 2018
@rfjakob
Owner

rfjakob commented Aug 11, 2018

I have just added an --exclude feature to reverse mode. Please test!

I have chosen --exclude over --exclude-from because it is more generic. You can still put your excludes into a file like this:

gocryptfs -reverse $(cat exclude.txt) /home/user /mnt/user.encrypted

With exclude.txt containing lines like this:

--exclude Movies
--exclude Music
--exclude "Ebook Collection"

@brain-freeze
Author

Thanks, works nicely. The performance is better since there is no rofs-filtered anymore.

I will use this:

EXCLUDE=$(while read i; do echo "--exclude "${i}"";done < "excludes.txt")
gocryptfs -reverse $EXCLUDE folder1 folder2

@rfjakob
Owner

rfjakob commented Aug 15, 2018

Thanks for testing! I'll call this ticket done.

@rfjakob rfjakob closed this as completed Aug 15, 2018
@charles-dyfis-net
Contributor

charles-dyfis-net commented Aug 15, 2018

The above code cannot possibly work. See BashFAQ #50: Quotes in the result of command substitutions or other expansions are treated as literal, not parsed by the shell as syntax.

Consequently, the quotes added to the EXCLUDE string by the above code would be passed to gocryptfs as literal parts of its argument vector; and any directory name containing spaces would have the parts on each side passed as separate arguments (My Directory generating three arguments: --exclude, "My and Directory").

Quotes inside $(cat exclude.txt) are not honored for the same reason.
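To make this concrete, here is a small sketch that prints the argument vector produced by an unquoted $(cat ...). show_args and demo_unquoted_cat are hypothetical helper names, and the file contents are sample data written to a temporary file.

```shell
#!/bin/sh
# Print each argument on its own line, wrapped in <> so the boundaries show.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

demo_unquoted_cat() {
    file=$(mktemp) || return 1
    printf '%s\n' '--exclude Movies' '--exclude "Ebook Collection"' > "$file"
    # Unquoted on purpose: this is the pattern under discussion.
    show_args $(cat "$file")
    rm -f "$file"
}

demo_unquoted_cat
# prints:
# <--exclude>
# <Movies>
# <--exclude>
# <"Ebook>
# <Collection">
```

The double quotes survive as literal characters, and the space inside them still splits the name into two arguments.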

A variant that would work (using an array to collect the arguments, rather than collecting them in a string and expanding it unquoted) follows:

excludes=( )
while read -r i; do
  excludes+=( --exclude "$i" )
done <excludes.txt
gocryptfs -reverse "${excludes[@]}" folder1 folder2

...or, to support baseline POSIX sh (which doesn't support arrays), one can use a function to be able to override its argument list without making changes to the outer scope:

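mount_with_excludes() {
  # Prepend one --exclude pair per input line to the function's own arguments.
  while read -r i; do
    set -- --exclude "$i" "$@"
  done
  # "$@" now holds the excludes followed by the caller's original arguments.
  gocryptfs "$@"
}
mount_with_excludes -reverse folder1 folder2 <excludes.txt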

...will prepend the --exclude arguments, generating a command line that behaves akin to gocryptfs --exclude Movies --exclude Music --exclude "Ebook Collection" -reverse folder1 folder2, if given a file that contains Music, Movies and Ebook Collection (no literal quotes!) as separate lines.
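A slightly more defensive variant of the same POSIX idea, offered only as a sketch: with_excludes is a hypothetical wrapper name, and taking the exclude file and the command as parameters is an assumption made here so the logic can be tried with a stand-in command instead of gocryptfs. It keeps leading and trailing whitespace in names (IFS=), skips empty lines, and still picks up a final line that lacks a trailing newline.

```shell
#!/bin/sh
# Hypothetical wrapper: with_excludes FILE COMMAND [ARGS...]
# Runs COMMAND with "--exclude <line>" prepended for every non-empty
# line of FILE, followed by the original ARGS.
with_excludes() {
    exclude_file=$1
    cmd=$2
    shift 2
    # IFS= and -r keep each line as written; the "|| [ -n ... ]" part
    # handles a last line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        set -- --exclude "$line" "$@"
    done < "$exclude_file"
    "$cmd" "$@"
}
```

It would be called as with_excludes excludes.txt gocryptfs -reverse folder1 folder2. Because each pair is prepended, the excludes end up in the reverse order of the file.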

@charles-dyfis-net
Contributor

charles-dyfis-net commented Aug 15, 2018

Heh. Actually, I'm going to have to withdraw that "cannot possibly work", a little: The quotes are all syntactic, not literal, in

EXCLUDE=$(while read i; do echo "--exclude "${i}"";done < "excludes.txt")

Basically, you have a quoted string "--exclude ", then an unquoted expansion ${i}, then an empty quoted string "". This isn't putting quotes around $i, as was presumably the authorial intent; rather, it's just ending the quotes before the expansion and running the expansion unquoted.

A line with My Directory will thus become --exclude, My, Directory, rather than --exclude, "My, Directory" -- still a wrong result, but not the specific wrong result I claimed above.

What's even more fun is that if you were trying to exclude a directory named My Work * KEEP OUT *, you'd get the whitespace-surrounded *s replaced with a list of names in the location where the script is being run.
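The splitting and globbing can be observed by counting arguments in a scratch directory. This is a sketch; count_args and demo_glob are hypothetical names, and a.txt and b.txt are sample files created only for the demonstration.

```shell
#!/bin/sh
# Print how many arguments were received.
count_args() {
    echo "$#"
}

demo_glob() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        : > a.txt
        : > b.txt
        name='My Work * KEEP OUT *'
        # Unquoted: split on spaces, then each lone * matches a.txt and b.txt.
        count_args --exclude $name
        # Quoted: the name stays one argument.
        count_args --exclude "$name"
    )
    rm -rf "$dir"
}

demo_glob
# prints:
# 9
# 2
```

The unquoted call sees --exclude plus My, Work, a.txt, b.txt, KEEP, OUT, a.txt, b.txt; the quoted call sees --exclude plus the single intact name.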
