Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix truncation of files upon read when using object store and encryption #28389

Merged

Conversation

alanmeeson
Copy link
Contributor

@alanmeeson alanmeeson commented Aug 11, 2021

When using and object store as primary storage and using the default encryption module at the same time, any encrypted file would be truncated when read, and a text error message added to the end.

This was caused by a combination of the reliance of the read functions on on knowing the unencrypted file size, and a bug in the function which calculated the unencrypted file size for a given file.

In order to calculate the unencrypted file size, the function would first skip the header block, then use fseek to skip to the last encrypted block in the file. Because there was a correspondence between the encrypted and unencrypted blocks, this would also be the last encrypted block. It would then read the final block and decrypt it to get the unencrypted length of the last block. With that, the number of blocks, and the unencrypted block size, it could calculate the unencrypted file size.

The trouble was that when using an object store, an fread call doesn't always get you the number of bytes you asked for, even if they are available. To resolve this I adapted the stream_read_block function from lib/private/Files/Streams/Encryption.php to work here. This function wraps the fread call in a loop and repeats until it has the entire set of bytes that were requested, or there are no more to get. (The same sort of fix as in the PR mentioned later on).

This fixes the immediate bug, and should (with luck) allow people to read their encrypted files now. (The problem was purely on the decryption side). In the future it would be nice to do some refactoring here.

I have done some testing of this with image files ranging from 1kb to 10mb using Nextcloud version 22.1.0 (the nextcloud:22.1-apache docker image), with sqlite and a Linode object store as the primary storage.

Does anyone have time to take a look and help nudge this PR into contribution shape?

I think @szaimen has been keeping an eye on the related issue, and @kesselb and @rullzer were involved in a previous related PR.

Copy link
Member

@PVince81 PVince81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot for the in-depth analysis!

👍 code looks fine to me, it makes sense

@PVince81
Copy link
Member

can you run composer install && composer run cs:fix lib/private/Files/Storage/Wrapper/Encryption.php ?

this should fix CI

@skjnldsv skjnldsv added 4. to release Ready to be released and/or waiting for tests to finish and removed 3. to review Waiting for reviews labels Aug 18, 2021
Copy link
Member

@blizzz blizzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small remark, but looks good as far as i can say.

* @param int $blockSize Length of requested data block in bytes
* @return string Data fetched from stream.
*/
private function fread_block($handle, $blockSize): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private function fread_block($handle, $blockSize): string {
private function fread_block($handle, int $blockSize): string {

add type hint for blockSize?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanmeeson, can you address those? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type hint and cs:fix comments addressed.

@mashedkeyboard
Copy link

Could I suggest that a test of the encryption functionality be included within https://github.com/nextcloud/server/blob/master/tests/lib/Files/ObjectStore/ObjectStoreTest.php along with this PR to prevent regression? I can try and put one together as a separate PR, but it seems to make more sense to include it in this one. I don't think there are any tests currently that cover object stores with encryption - though, if there are, they clearly need fixing!

@juliusknorr
Copy link
Member

The test cases can be in the encryption related classes and could be annotated with @group PRIMARY-s3 to run then.

@alanmeeson
Copy link
Contributor Author

Thanks all for the reviews and suggestions. Now that I am back to having internet again I can start doing something about them.

I'll get on with setting up an actual development environment (rather than using nano over ssh into a VPS) and sort out the code standards and type hints issues. If anyone has any suggestions for what a good PHP development setup looks like these days I'd be very happy to hear about them; it's been quite a while since I last did anything in PHP.

Regarding the unit tests: I think the simple approach would be to write a test which uses the encryption module with an S3-equivalent store, however that doesn't guarantee that the issue this PR fixes would always occur in the tests. My ideal approach would be to add a unit test to the Encryption tests in which we mock out the fread function so that it artificially hits each of the edge cases described in the fread documentation:

  • fread reads more bytes than requested, as though due to buffer magic.
  • fread reads fewer bytes than requested, despite not being at the end of the stream.
  • fread reads fewer bytes than requested, due to EOF.

Is there a standard approach to mocking used in this project?

On another note: The fix in this PR addresses correctly estimating the unencrypted file size; it doesn't do anything in cases where a file has already had the wrong file size estimated for it. This means that if people have uploaded files with the old version of the code, and had incorrect file size estimates captured in the meta data, then this fix won't help them. I can see of a potential hacky fix that could resolve this (if we fail to decrypt a block, try the "last block" encryption key format, in case we are at the last block but didn't know it). The alternative is for users to do a one off wipe of their file size estimates somehow, so that they are re-estimated with the new code. Does this edge case matter enough to take pre-emptive action, or is it something to just tackle if someone shouts about it?

@alanmeeson
Copy link
Contributor Author

I've added the type hint as suggested and applied composer cs:fix as requested.

I'm not going to have time to figure out the testing framework within the next month or so.

If anyone, (eg: @mashedkeyboard ;-) ) wants to have a go, please have at it.

@szaimen
Copy link
Contributor

szaimen commented Aug 31, 2021

CI is unfortuneately failing again

@szaimen szaimen added 2. developing Work in progress and removed 4. to release Ready to be released and/or waiting for tests to finish labels Aug 31, 2021
@skjnldsv skjnldsv mentioned this pull request Oct 13, 2021
@skjnldsv

This comment has been minimized.

@skjnldsv skjnldsv added 4. to release Ready to be released and/or waiting for tests to finish and removed 2. developing Work in progress labels Oct 22, 2021
@skjnldsv
Copy link
Member

@alanmeeson please rebase or allow us to edit this PR

@alanmeeson
Copy link
Contributor Author

alanmeeson commented Oct 23, 2021

Allow edits by maintainers is already set, I think you should have permissions to edit.

Just in case that doesn't work, I've invited you as a collaborator to my fork of the repo.

@skjnldsv
Copy link
Member

Can you then please rebase your branch?

@skjnldsv

This comment has been minimized.

…ion.

When using and object store as primary storage and using the default
encryption module at the same time,  any encrypted file would be truncated
when read, and a text error message added to the end.

This was caused by a combination of the reliance of the read functions on
on knowing the unencrypted file size,  and a bug in the function which
calculated the unencrypted file size for a given file.

In order to calculate the unencrypted file size,  the function would first
skip the header block, then use fseek to skip to the last encrypted block
in the file.  Because there was a corresponence between the encrypted and
unencrypted blocks, this would also be the last encrypted block.  It would
then read the final block and decrypt it to get the unencrypted length of
the last block.  With that, the number of blocks, and the unencrypted block
size, it could calculate the unencrypted file size.

The trouble was that when using an object store, an fread call doesn't
always get you the number of bytes you asked for, even if they are
available.  To resolve this I adapted the stream_read_block function from
lib/private/Files/Streams/Encryption.php to work here.  This function
wraps the fread call in a loop and repeats until it has the entire set of
bytes that were requested,  or there are no more to get.

This fixes the imediate bug, and should (with luck) allow people to get
their encrypted files out of Nextcloud now.  (The problem was purely on
the decryption side).  In the future it would be nice to do some
refactoring here.

I have tested this with image files ranging from 1kb to 10mb using
Nextcloud version 22.1.0 (the nextcloud:22.1-apache docker image), with
sqlite and a Linode object store as the primary storage.

Signed-off-by: Alan Meeson <[email protected]>
@alanmeeson alanmeeson force-pushed the bugfix/22077/default-encryption-module branch from a5a6b7a to 16f70e8 Compare October 23, 2021 14:12
@alanmeeson
Copy link
Contributor Author

Rebased on top of nextcloud:master & pushed.

I haven't squashed the commits - I couldn't find any indication as to the what the project preference is regarding that in the docs, and in absence of other guidance I tend to err towards avoiding re-writing repo history.

Hopefully that covers the required bits, and you should have ability to change things yourselves if needed so that I'm not a blocker.

@skjnldsv
Copy link
Member

skjnldsv commented Oct 23, 2021

! [remote rejected]       fork/bugfix/22077/default-encryption-module -> bugfix/22077/default-encryption-module
(refusing to allow a Personal Access Token to create or update workflow `.github/workflows/ftp.yml` without `workflow` scope)

Ah, got it! The issue is that rebasing contains a new workflow, which is against github security policy :)
Thanks for rebasing 😉

@skjnldsv skjnldsv merged commit 623ac8c into nextcloud:master Oct 23, 2021
@welcome
Copy link

welcome bot commented Oct 23, 2021

Thanks for your first pull request and welcome to the community! Feel free to keep them coming! If you are looking for issues to tackle then have a look at this selection: https://github.com/nextcloud/server/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants