Poor read performance with network backend #92
Comments
After some extra tests it seems that setting the max-read-ahead for
How does |
Looks like gocryptfs is doing something that's pretty slow on Amazon Cloud Drive. I don't really know how ACD works, but can you give me access to an ACD folder filled with garbage to examine this? |
PS: fadvise: no, fadvise is completely ignored. |
@rfjakob Sadly, I cannot delegate part of an ACD folder, as the mount is all-or-nothing. IIRC, ACD has a free plan though and it's super easy to set it up. |
@rfjakob I think I found what causes the problem: reads issued by EDIT: |
Can you attach the log?
|
|
It looks like there are two processes reading sequentially at the same time
|
I simply run a cat or cp command for the test. I don't think those two are
threaded in any way.
|
The other process may be the kernel performing readahead on its own. Can you try mounting gocryptfs with
|
No luck there :'(
|
Ok. I probably have time to look at this today in the evening.
|
I know what this problem is... It is almost certainly this issue: hanwen/go-fuse#140 (reported by me!) I was unable to fix this with go-fuse, but https://github.com/bazil/fuse doesn't seem to have this problem so I never switched over to go-fuse in rclone. I tried to fix go-fuse but I didn't succeed and I couldn't convince the author that it was anything other than expected behaviour. |
I tested reading a 10MB file from tmpfs and graphed the read offsets. singleReader, as mentioned by @ncw, does not seem to be enough to get the accesses to serialize. @j-vizcaino No, switching to bazil-fuse is not an option. gocryptfs uses lots of the infrastructure that go-fuse provides. |
However, looking at this
we see that "cat" never reads more than 131072 bytes in one call, which is the maximum that FUSE can handle in one call. I initially thought the kernel splits bigger requests into 128KB chunks and sends all requests in parallel. But that is not what seems to be happening. Is it kernel readahead? |
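For a sense of scale, here is a minimal sketch of how many read requests a sequential read of the 10MB test file implies at the 131072-byte limit mentioned above. It assumes every request is cut at a fixed 128 KiB boundary, which is a simplification (the kernel may also issue smaller requests), and `splitRead` and `readReq` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// maxFuseRead is the per-request limit quoted in the thread (131072 bytes).
const maxFuseRead = 128 * 1024

// readReq describes one hypothetical read request: where it starts and how long it is.
type readReq struct {
	Offset, Size int64
}

// splitRead cuts a sequential read of fileSize bytes into requests of at most
// maxFuseRead bytes each. The last request may be shorter.
func splitRead(fileSize int64) []readReq {
	var reqs []readReq
	for off := int64(0); off < fileSize; off += maxFuseRead {
		size := int64(maxFuseRead)
		if off+size > fileSize {
			size = fileSize - off
		}
		reqs = append(reqs, readReq{Offset: off, Size: size})
	}
	return reqs
}

func main() {
	reqs := splitRead(10 * 1024 * 1024)
	last := reqs[len(reqs)-1]
	fmt.Println("requests:", len(reqs), "last offset:", last.Offset, "last size:", last.Size)
}
```

If several of these requests are in flight at once, the order in which they reach the backing storage is what matters for a backend where seeking is expensive.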
That's a real shame,
@rfjakob Can you give a bit more information about that? |
It does not look like go-fuse's fault, and I think the issue is fixable through a number of ways. As for more information about go-fuse vs bazil-fuse, I'll quote an email to another developer asking for advice:
|
Thank you for the explanation: it now makes perfect sense. |
I would be interested in your thoughts here so I can apply them to rclone!
Assuming I'm understanding the terminology correctly, bazil-fuse has a fuse/fs module which does path mapping. I've implemented both fuse systems for rclone - both using the higher level path interface. I started with bazil-fuse which works well, but I thought I would try go-fuse as I thought it might be higher performance - alas because of the issue above development stalled on it. |
I think the problem with bazil-fuse's path mapping is that it does not track renames. This means just doing
fails, unless you implement rename tracking yourself. At least it was like this when I reported airnandez/cluefs#3 . Cluefs has now implemented its own rename tracking. |
Due to kernel readahead, we usually get multiple read requests at the same time. These get submitted to the backing storage in random order, which is a problem if seeking is very expensive. Details: #92
I think I got it. @j-vizcaino can you pull the "hkdf" branch, build, and mount with the "-serialize_reads" option? |
I managed to get the ACD 3-months-trial going. gocryptfs default:
10485760 bytes (10 MB, 10 MiB) copied, 78,4737 s, 134 kB/s
gocryptfs with "-serialize_reads":
10485760 bytes (10 MB, 10 MiB) copied, 3,60109 s, 2,9 MB/s
directly from the rclone mount:
10567698 bytes (11 MB, 10 MiB) copied, 3,60769 s, 2,9 MB/s
PS: I have just merged the hkdf branch into master. So please just pull and test master. |
Nice! Thank you for the quick fix!
I will give it a try this evening.
|
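As a sanity check on the dd figures quoted above, the throughput and the speedup can be recomputed from the byte counts and durations. This is only arithmetic on the numbers reported in the thread; `throughput` is a hypothetical helper name.

```go
package main

import "fmt"

// throughput returns bytes per second for a transfer of n bytes in the given time.
func throughput(n int64, seconds float64) float64 {
	return float64(n) / seconds
}

func main() {
	// Byte counts and durations as reported by dd in the thread.
	slow := throughput(10485760, 78.4737) // gocryptfs default
	fast := throughput(10485760, 3.60109) // gocryptfs with -serialize_reads

	fmt.Printf("default:          %.0f kB/s\n", slow/1e3)
	fmt.Printf("-serialize_reads: %.1f MB/s\n", fast/1e6)
	fmt.Printf("speedup:          %.1fx\n", fast/slow)
}
```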
@ncw If you want to take a look, the logic is in https://github.com/rfjakob/gocryptfs/blob/master/internal/serialize_reads/sr.go . I have tried several simpler schemes as well, like delaying out-of-order requests for a few milliseconds, but this did not get them sorted in all cases, as the kernel submits read-ahead reads pretty aggressively. Oh and I also tried disabling read-ahead. This works and keeps everything nicely ordered and single-threaded, but it cuts down the read request sizes to 4KB (1 page). |
|
I have documented the option in MANPAGE.md. Unfortunately the ordering and serialization forces gocryptfs to wait quite a bit, so I can't make this a default. On my machine, using |
@rfjakob thanks for writing that up. Very interesting about the read ahead setting. I'll warm up my go-fuse branch and have a go with your logic. |
* Doesn't pass the tests yet
* Fix seeking for readahead - see
* rfjakob/gocryptfs#92
* https://github.com/rfjakob/gocryptfs/blob/master/internal/serialize_reads/sr.go
We enable FUSE_CAP_ASYNC_READ per default, which means that the kernel can (and does) submit multiple concurrent out-of-order read requests to service userspace reads and kernel readahead.
For some backing storages, like Amazon Cloud Drive, out-of-order reads are expensive. gocryptfs has implemented a delay-based workaround with its `-serialize_reads` flag for this case (see rfjakob/gocryptfs#92 for details).
Not enabling FUSE_CAP_ASYNC_READ makes the kernel do this for us, as verified by adding debug output to gocryptfs, so expose it as a mount flag in MountOptions.
Fixes: #140
Graphs-at: #395
Related: rfjakob/gocryptfs#92
Change-Id: I10f947d71e1453989c4a9b66fbb0407f7163994f
Use case
/tmp/encrypted is an Amazon Cloud Drive folder, mounted read-only using rclone.
/tmp/clear is the gocryptfs deciphered version of the above.
Copying a 300MB file (ciphered) from /tmp/encrypted, I get around 5-6MBps, which is OK, considering my bandwidth is 10MBps.
Copying the same file from /tmp/clear yields extremely poor speeds (from 100KBps to 1MBps).
Speed is not limited by the CPU.
Tracing read requests in rclone shows read sizes from 4KB to 128KB occurring at low frequency (thus low throughput).
Any advice would be greatly appreciated.
Environment