-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid overflows in JNI allocations #639
Conversation
Themis Core API works with "size_t" for buffer size inputs and outputs, that is uint32_t on 32-bit systems or uint64_t on 64-bit ones. In most cases Themis data structures use uint32_t for data length fields, allowing input data to be up to 4 GB long, theoretically. On the other hand, JVM uses "int" type for its array indices, that is int32_t everywhere, regardless of the host system. Note that it is a *signed* integer, meaning that native JVM byte[] arrays cannot fit more than 2 GB of data, inclusive. There are hacks to overcome this limit, but with byte[] API -- as in Themis -- you are limited to 2 GB. JNI type "jsize" reflects this limitation, it is defined to be "jint" which is typically defined as "signed int", assuming 32-bit "int" types on most modern platforms. Thanks to C being very safe language, sizes bigger than 2^31-1 silently overflow into negative space and then it's up to JNI to handle this situation. Desktop Java systems typically throw a NegativeArraySizeException when trying to allocate an array with negative size, but Android systems typically kill the process due to an assertion failure. In order to have predictable behavior in this case, check all sizes before trying to allocate an array of that size, and exit with an error if the allocation would overflow. This way instead of crashing we will throw an appropriate Themis subsystem exception. Note that in some cases the array sizes do not depend on user input, but we still check just in case the Core library does something silly. In other cases the output can get that big due to input being sufficiently big -- slightly smaller than 2 GB, but enough for Themis data overhead to push that over the 2 GB limit. However, in most cases this situation can be triggered by corrupted input where the data length fields contain values inconsistent with actual input size.
With the JNI changes we can unignore a couple of tests for Token Protect mode which actually discovered the overflows. Now Secure Cell should correctly handle corrupted token values.
Since it affects mostly Android, put the note into that section.
Here we use encrypted data as token which is essentially random. Depending on how broken it turns out, Themis might genuinely believe that this is a valid authentication token and proceed with allocation. You get an exception if the allocation is bigger than JVM heap, which it very well might be on emulated devices. Allow OutOfMemoryError to happen in this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially this leads to incompatibility of decrypting > 2 GB messages on some platform then decrypting it on Android. Narrow case, i'm ok with this, but maybe worth mentioning somewhere in Android docs?
Not really. It was never possible to work with 2+ GB messages on Android in the first place, even before this change. In order to decrypt 2+ GB message you first have to read it in full into a |
Encrypting 2+GB file using pythemis, trying to decrypt using Android. It worth mentioning because we claim to be compatible across platforms, but every platform has caveats. I'd mention smth like "Note: Android has restrictions on file size, messages can't be longer than 2GB. If you need to work with large files, consider splitting them into smaller ones.". |
This restriction has the same nature as 32-bit platforms being fundamentally incapable of handling 2 GB messages. It is implied by the API which requires the entire buffer to be in memory. I believe we don't need to mention that explicitly in our docs. Notes like that belong to the platform guides (that is, “Effective Java”, etc.) The developers will learn there that they cannot allocate that much memory and how to handle that situation by chunking. |
Okay, makes sense |
Fix a possible crash on Android systems when handling corrupted input of Secure Cell (and possibly other APIs as well). Now instead of a crash you will get an appropriate exception.
Technical details
Themis Core API works with
size_t
for buffer size inputs and outputs, that isuint32_t
on 32-bit systems oruint64_t
on 64-bit ones. In most cases Themis data structures useuint32_t
for data length fields, allowing input data to be up to 4 GB long, theoretically.On the other hand, JVM uses
int
type for its array indices, that isint32_t
everywhere, regardless of the host system. Note that it is a signed integer, meaning that native JVMbyte[]
arrays cannot fit more than 2 GB of data, inclusive. There are hacks to overcome this limit, but withbyte[]
API – as in Themis – you are limited to 2 GB.JNI type
jsize
reflects this limitation, it is defined to bejint
which is typically defined assigned int
, assuming 32-bit int types on most modern platforms. Thanks to C being very safe language, sizes bigger than 231 – 1 silently overflow into negative space and then it's up to JNI to handle this situation. Desktop Java systems typically throw a NegativeArraySizeException when trying to allocate an array with negative size, but Android systems typically kill the process due to an assertion failure.In order to have predictable behavior in this case, check all sizes before trying to allocate an array of that size, and exit with an error if the allocation would overflow. This way instead of crashing we will throw an appropriate Themis subsystem exception.
Note that in some cases the array sizes do not depend on user input, but we still check just in case the Core library does something silly. In other cases the output can get that big due to input being sufficiently big — slightly smaller than 2 GB, but enough for Themis data overhead to push that over the 2 GB limit. However, in most cases this situation can be triggered by corrupted input where the data length fields contain values inconsistent with actual input size.
Checklist