Wiping data after use #152
Comments
Personally, I prefer to use instance-allocators, as opposed to the global allocators supported by old STL. An instance allocator can be completely reclaimed -- and wiped -- after the code using it has completed. It also leads to excellent data-locality. I always have an uphill battle explaining the benefits to people, even though to me the cost is zero if you are a rigorous programmer. It seems like you are someone who understands. However, it's not a trivial change in jsoncpp. I plan to do it someday for the efficiency of the internals (with security as a free side-effect) but not soon. I see the value (pardon the pun) of a What are your thoughts? |
@cdunn2001: Can you point me at some documentation on instance allocators? |
@jacobsa: Hmmm. I should write something up in my blog. I guess we're talking about multiple ideas at once:
Here is my order of preference for memory models:
Arena data require a Java-esque style of coding, but with a per-request configuration threaded down the call-stack. Here is how to set things up. Key ideas:
Drawbacks:
I generally avoid STL and I would generate protobuf to use Arenas too. But actually, msgpack is smarter than Google's protobuf:
So if we had a good system for JSON (which requires a header for number of bytes), we could decouple our |
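For readers unfamiliar with the term, the "instance allocator" (arena) idea described in this comment can be sketched roughly as follows. This is only an illustration, not code from jsoncpp: the class name `WipingArena` and its members are hypothetical, and the sketch assumes a single fixed-size buffer that is overwritten with zeros when the request that owns it finishes.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical per-request arena: a bump allocator over one fixed buffer.
// Everything handed out is reclaimed (and wiped) at once.
class WipingArena {
public:
    explicit WipingArena(std::size_t capacity) : buffer_(capacity), used_(0) {}
    ~WipingArena() { wipe(); }

    WipingArena(const WipingArena&) = delete;
    WipingArena& operator=(const WipingArena&) = delete;

    // Returns `size` bytes whose offset from the buffer start is a multiple
    // of `align` (assumed to be a power of two). Throws std::bad_alloc when
    // the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        const std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > buffer_.size() || size > buffer_.size() - start) {
            throw std::bad_alloc();
        }
        used_ = start + size;
        return buffer_.data() + start;
    }

    // Overwrites every byte handed out so far and makes the arena reusable.
    // The volatile pointer is meant to discourage the compiler from
    // eliding the stores.
    void wipe() {
        volatile unsigned char* p = buffer_.data();
        for (std::size_t i = 0; i < used_; ++i) {
            p[i] = 0;
        }
        used_ = 0;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t used_;
};
```

In this style, the arena would be created once per request and passed down the call stack; objects placed in it are never freed individually.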
Cool, thank you for the brain dump. |
Yes, I agree that stateful allocators are useful for specifying which arena to use for which objects. I have sometimes missed stateful allocator support.
But it is not wiping, is it? Wiping is more like memset(&str[0], 0, str.capacity()); or str.resize(str.capacity());
for (auto& c: str) c = (char)rand(); Not sure why you mentioned setting data to "".
Thus the dictionary keys must wipe themselves. That is what my suggestions above are about:
namespace Json
{
#ifdef SELF_WIPING_STRING
class string : public std::string
{
...
~string() { wipe(); } // the std::string destructor runs automatically afterwards; calling it explicitly would destroy the base twice
};
#else
typedef std::string string;
#endif
}
namespace Json
{
#ifdef SELF_WIPING
class WipingAllocator : public std::allocator <char>
{
...
};
typedef std::basic_string <char, std::char_traits <char>, WipingAllocator> string;
#else
typedef std::string string;
#endif
} |
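The `wipe()` call elided in the first sketch above could look something like the following. It is a minimal sketch, not jsoncpp code: the free function `wipe_string` is a hypothetical name, and the volatile writes are only a common way to discourage the optimizer from dropping the stores, not a guarantee. It also cannot reach copies left behind by earlier reallocations of the string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: overwrite the whole buffer owned by `s`,
// including capacity that is currently unused.
inline void wipe_string(std::string& s) {
    // Growing to capacity() does not reallocate, and it makes every
    // allocated character addressable through operator[].
    s.resize(s.capacity());
    if (s.empty()) {
        return;
    }
    volatile char* p = &s[0];
    for (std::size_t i = 0, n = s.size(); i < n; ++i) {
        p[i] = 0;
    }
    // The string is deliberately left at size() == capacity(), all zeros;
    // the caller may clear() it afterwards if the size matters.
}
```

Leaving the size untouched matches the point made in the thread that the size does not need to change once the data has been overwritten.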
If you wipe strings before destruction, then you can be sure by program logic that they are no longer needed. You're right: you don't technically need to change the size after overwriting the data. And it's probably OK to modify hash keys just before the hash table (or red-black tree) is destructed; the data structure would be in an illegal state, but the destructor probably does not rely on that. So something simple might work as a hack. But you're missing other issues with |
The I see one problem: |
Yes, I agree, "probably ok".
Agree.
I made a proof of concept:
#include <algorithm> // std::fill_n
#include <iostream>
#include <memory>
#include <string>
template <class T>
class WipingAllocator : public std::allocator <T>
{
public:
template <class U> struct rebind { typedef WipingAllocator <U> other; };
typedef typename std::allocator<T>::pointer pointer;
typedef typename std::allocator<T>::size_type size_type;
typedef typename std::allocator<T>::value_type value_type;
void deallocate(pointer p, size_type n)
{
std::fill_n((volatile char*)p, n * sizeof(value_type), 0);
std::allocator<value_type>::deallocate(p, n);
}
};
typedef std::basic_string <char, std::char_traits <char>, WipingAllocator <char> > SelfWipingString;
int main()
{
const char* p1 = NULL;
const char* p2 = NULL;
{
SelfWipingString s1 = "hello";
p1 = s1.c_str();
//std::string s2 = s1; // Compile error :(
std::string s2(s1.data(), s1.length());
p2 = s2.c_str();
}
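// Note: p1 and p2 now point to freed memory. Reading them is undefined behavior and is done here only to demonstrate the wipe.
std::cout << "After wiping: \"" << p1 << "\"" << std::endl;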
std::cout << "No wiping: \"" << p2 << "\"" << std::endl;
}
As you can see from the code above, SelfWipingString cannot simply be assigned to a std::string (or vice versa), but the conversion is easy. It will, of course, be the API caller's responsibility to wipe the std::string they receive. Extra copies of the std::string should not be made when it is returned from a function, thanks to C++11 move semantics. There is another small problem, though: some std::string implementations contain a small fixed buffer for storing short strings without dynamic memory allocation. So we may want to wipe (*this) in the string's destructor, i.e. use both a custom allocator and a custom destructor. |
On the other hand, that fixed buffer inside std::string will likely be on stack and will soon be overwritten by other data... Update: not in std::map scenario though. |
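Whether a given string currently lives in such a fixed internal buffer can be checked with a small diagnostic like the one below. This is only a sketch under assumptions: `uses_inline_buffer` is a hypothetical helper, and it relies on the implementation detail that an inline buffer sits inside the `std::string` object itself, which is how the mainstream standard libraries implement the small-string optimization but is not required by the C++ standard.

```cpp
#include <cstdint>
#include <string>

// Hypothetical diagnostic: returns true if s.data() points inside the
// std::string object itself, i.e. the characters are stored in an inline
// (small-string) buffer rather than in memory obtained from the allocator.
inline bool uses_inline_buffer(const std::string& s) {
    const std::uintptr_t obj = reinterpret_cast<std::uintptr_t>(&s);
    const std::uintptr_t buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

When this returns true, a wiping allocator never sees the characters, so only a wipe performed by the string's owner (for example in a destructor) can clear them.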
What if we used a typedef for our internal strings and provided the string-with-WipingAllocator as a compiler-time option? We could return that from |
Yes, it's exactly what I am suggesting.
I think we should still return std::string in public APIs like asString(). It's much more convenient for most users of the library to operate on std::strings, than on some Json::strings, and it's very important IMO. For example, in our project here we chose JsonCpp over other JSON libraries because JsonCpp's API is very nice. |
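The boundary discussed here, with a wiping string type used internally and plain std::string in the public API, might be sketched as follows. None of these names come from jsoncpp: `WipingAllocator`, `SecureString`, `to_std_string` and `to_secure_string` are hypothetical, and the allocator is written in the C++11 minimal-allocator style instead of deriving from std::allocator as the proof of concept above does.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical minimal C++11 allocator that zeroes memory before freeing it.
template <class T>
struct WipingAllocator {
    typedef T value_type;

    WipingAllocator() noexcept {}
    template <class U>
    WipingAllocator(const WipingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t n) noexcept {
        // Volatile stores are meant to discourage the compiler from
        // removing the wipe as a dead store.
        volatile unsigned char* bytes =
            reinterpret_cast<volatile unsigned char*>(p);
        for (std::size_t i = 0; i < n * sizeof(T); ++i) {
            bytes[i] = 0;
        }
        ::operator delete(p);
    }
};

// All instances are interchangeable because the allocator is stateless.
template <class T, class U>
bool operator==(const WipingAllocator<T>&, const WipingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const WipingAllocator<T>&, const WipingAllocator<U>&) noexcept {
    return false;
}

typedef std::basic_string<char, std::char_traits<char>, WipingAllocator<char> >
    SecureString;

// Conversions at the public API boundary. The returned std::string is an
// ordinary copy, so wiping it becomes the caller's responsibility.
inline std::string to_std_string(const SecureString& s) {
    return std::string(s.data(), s.size());
}

inline SecureString to_secure_string(const std::string& s) {
    return SecureString(s.data(), s.size());
}
```

An accessor in the style of asString() could then keep a `SecureString` internally and return `to_std_string(value)` to the caller.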
Hi all, did you get anywhere with this? We're happy to help with the development if we can reach consensus on the approach. Has anybody already branched this and successfully implemented the solution? Thanks, Christopher |
Hi all, just FYI we're going to try implementing this in https://github.com/EFTlab/jsoncpp/tree/secure_allocator and will contact you upon completion to see if it's worthy of inclusion. |
Hi all, this is now complete and can be pulled from https://github.com/EFTlab/jsoncpp. We have unit tests showing that it works as intended. However, given the number of changes required to template most of the code, I'm guessing it will need more thorough testing. We'll be pulling this into our larger code base ASAP, but would like to kick off a conversation about merging to master. |
Submit it as a pull-request so we can see the diffs. Then it's easier to discuss. I think this is a good differentiator for jsoncpp, so I lean toward this. But if we are going to templatize (breaking binary-compatibility, forcing a major version bump), maybe we should include a number type? People currently do not enjoy having to cast |
Hi Christophers, please see #412. |
I'm closing this, unless anyone wants to discuss this further. But first, please see my comments at #412. |
We are going to provide a kind of solution for this. See #442 and linked discussions. |
I want to parse sensitive data with JsonCpp and am interested in wiping the parsed data in memory after its use. I am willing to develop a patch for it myself. If I submit such a patch, will it be merged?
If yes, is there any suggestion on how to implement such wiping?
I can think of several ways to implement wiping:
As I am interested, first of all, in wiping strings, I can introduce a type Json::string and use it everywhere in the library (except APIs, of course) instead of std::string. By default, Json::string will be typedefed to just std::string. If some #define is active, Json::string will be a type derived from std::string that will wipe all the string characters in the destructor.
I can introduce a wiping allocator and supply it to every std::string and every other std:: container object being created. This seems like a more generic approach, but then it's easy to forget to supply the allocator when writing new code. And the patch will be quite large in this case.
Your way, dear library developers?