Skip to content

memory: Add memory-debug scribbling when tcmalloc is disabled and not compiled for optimization#5450

Closed
jmarantz wants to merge 22 commits intoenvoyproxy:masterfrom
jmarantz:mem_debug
Closed

memory: Add memory-debug scribbling when tcmalloc is disabled and not compiled for optimization#5450
jmarantz wants to merge 22 commits intoenvoyproxy:masterfrom
jmarantz:mem_debug

Conversation

@jmarantz
Copy link
Contributor

@jmarantz jmarantz commented Jan 2, 2019

Description: This is an alternate approach to #5424 which is broken due to bazel's inability to do a prioritized select(). In this version, we just slip in some alternate operator new/delete overrides whenever we are doing a debug build (#ifndef NDEBUG) without tcmalloc. The memory-scribbling code is new to Envoy but has a lot of mileage on it it in its original location: https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/kernel/base/mem_debug.cc
Risk Level: low
Testing: //test/... with --compilation_mode=dbg and --define=tcmalloc=disabled , but relying on CI to see what happens in other configurations, thus this is WiP.
Docs Changes: n/a
Release Notes: n/a

…bug mode. Either way, bazel seems to be a problem.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

jmarantz commented Jan 2, 2019

still having bazel issues turning this code off with tsan/asan. closing for now.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz reopened this Jan 3, 2019
@jmarantz
Copy link
Contributor Author

jmarantz commented Jan 3, 2019

re-opening to see if I can get this to pass CI. Still not ready for any sort of review.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still not really ready for review but thanks for the comments :)

However I think I am heading in this direction rather than tcmalloc's debugallocator, which I can't get to work with bazel.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…shSet.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…on* classes.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz changed the title WiP/RFC: memory: Add memory-debug scribbling when tcmalloc is disabled memory: Add memory-debug scribbling when tcmalloc is disabled and not compiled for optimization Jan 4, 2019
@jmarantz
Copy link
Contributor Author

jmarantz commented Jan 4, 2019

OK I think this is ready for review now. #5424 is looking very difficult due to bazel limitations.

…ging is enabled.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
inline uint64_t align(uint64_t size, uint64_t alignment) {
// Check that alignment is a power of 2:
// http://www.graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
assert((alignment > 0) && ((alignment & (alignment - 1)) == 0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASSERT or RELEASE_ASSERT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid assertion layers that might allocate memory in the context of a utility designed to help with memory allocation. Although actually the right thing here is to use static_assert, which I can do with a little template magic.


#include "common/memory/align.h"

static std::atomic<int64_t> bytes_allocated(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why signed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that a negative value was possible if a block of memory was subtly corrupted before it was freed. However it's better to assert when that happens than let this go negative. I'll fix.

} // namespace Memory
} // namespace Envoy

#ifdef MEMORY_DEBUG_ENABLED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ENVOY_MEMORY_DEBUG_ENABLED?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

namespace {

constexpr uint32_t LiveMarker1 = 0xfeedface; // first 4 bytes after alloc
constexpr uint64_t LiveMarker2 = 0xfeedfacefeedface; // first 4 bytes after alloc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment says 4 bytes, but this is 8 bytes long

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// Writes scribble_word over the block of memory starting at ptr and extending
// size bytes.
inline void scribble(void* ptr, uint64_t rounded_size, uint64_t scribble_word) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think inline does anything here; omit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done; I was thinking I might be able to convince the compiler to optimize out the assert on alignment; but it's likely not worth it, and the compiler may figure out how to do that anyway.

// malloced memory is 0 get data that, when interpreted as pointers, will SEGV,
// and that will be easily seen in the debugger (0xfeedface pattern).
void* debugMalloc(uint64_t size) {
assert(size <= 0xffffffff); // For now we store the original size in a uint32_t.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me nervous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't be hard to use all 8 bytes in the marker to store the size. But I'm curious why it makes you nervous? Do you think -- especially in a testing/debugging contest, that we are likely to see 4G requests to malloc? I would imagine that would not behave well, and it might not be a bad idea to get an assert fail while debugging or testing to become aware of this.

If that's OK with you I'll document the assert better. If you'd prefer to allow >=4G allocations I can change it.

assert(marker != NULL);
marker[0] = LiveMarker1;
marker[1] = size;
uint32_t* ret = marker + 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is labeled ret, but it is not the returned value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I made it the returned value, but also renamed to 'payload'; wdyt?

@ggreenway
Copy link
Member

It's disappointing that there's not a good way to use the debug tcmalloc. But given that isn't possible, I am in favor of this approach to improve memory checking.

@ggreenway ggreenway self-assigned this Jan 5, 2019

namespace {

constexpr uint32_t LiveMarker1 = 0xfeedface; // first 4 bytes after alloc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those names should all be uppercase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://github.com/envoyproxy/envoy/blob/master/STYLE.md

The Google C++ style guide points out that constant vars should be named kConstantVar. In the Envoy codebase we use ConstantVar or CONSTANT_VAR. If you pick CONSTANT_VAR, please be certain the name is globally significant to avoid potential conflicts with #defines, which are not namespace-scoped, and may appear in externally controlled header files.

assert(marker != NULL);
marker[0] = LiveMarker1;
marker[1] = size;
uint32_t* ret = marker + 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2/sizeof(Overhead)/sizeof(uint32_t)/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK though I'm not sure if that's easier to read. The '2' is the logical progression from assigning marker[0] and then marker[1]; now we are working on what begins at index 2, which is the payload.


} // namespace

void* operator new(uint64_t size) { return debugMalloc(size); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be size_t?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urr, yes :)

// and that will be easily seen in the debugger (0xfeedface pattern).
void* debugMalloc(uint64_t size) {
assert(size <= 0xffffffff); // For now we store the original size in a uint32_t.
uint64_t rounded = Envoy::Memory::align(size, Overhead);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this can now return 0 (it couldn't in the original code), so you might want to skip calls to malloc and scribble in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to add the overhead so we can do the inverse accounting on free. And why special-case the call to scribble, which should happily does nothing in the degenerate case?

uint64_t rounded = Envoy::Memory::align(size, Overhead);
bytes_allocated -= rounded;
scribble(ptr, rounded, DeadMarker2);
assert(LiveMarker1 == marker[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this check right after assigning marker and before accessing marker[1]? Since if this is wrong, then so will be marker[1].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to after the assignment of marker. I'm not sure what you meant in the second half of your suggestion.


constexpr uint32_t LiveMarker1 = 0xfeedface; // first 4 bytes after alloc
constexpr uint64_t LiveMarker2 = 0xfeedfacefeedface; // first 4 bytes after alloc
constexpr uint32_t DeadMarker1 = 0xabacabff; // first 4 bytes after free
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot comment on the original thread, so reviving it here.

What's the value of having 2 different dead markers? Is that useful in practice?

It has been occasionally useful, yes, to to distinguish between blobs of memories that are allocated but initialized vs freed.

note also that tcmalloc's debugallocation does the same thing (with different patterns).

Yes, the usefulness of Live and Dead is quite obvious... but my question was about having different Marker1 and Marker2 (i.e. marking first 4 bytes of the freshly allocated / freed memory differently from the rest).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must have been in some distant debug scenario this distinction became interesting. But I can't pull it out of history, and so I'll just make there be one dead marker.

// and that will be easily seen in the debugger (0xfeedface pattern).
void* debugMalloc(uint64_t size) {
assert(size <= 0xffffffff); // For now we store the original size in a uint32_t.
uint64_t rounded = Envoy::Memory::align(size, Overhead);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please rename rounded to aligned or aligned_size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


#include "common/memory/align.h"

static std::atomic<int64_t> bytes_allocated(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also track allocated_bytes_with_overhead to account for alignment and allocation headers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who would consume that?

The main reason I'm tracking this at all is so that the memory-constraining tests can run in debug mode.

namespace Memory {

uint64_t Stats::totalCurrentlyAllocated() { return 0; }
uint64_t Stats::totalCurrentlyAllocated() { return Debug::bytesUsed(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd slightly prefer to have this whole block duplicated, so that we have:

#ifdef TCMALLOC
    ...
#elif defined(MEMORY_DEBUG_ENABLED)
    ...
#else 
    ...
#endif

instead of adding code to the build without debug memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point...I reorganized it a bit; ptal.


// We don't run memory debugging for optimizd builds to avoid impacting
// production performance.
#ifndef NDEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, I'd prefer if users had to opt-in into this via --define debug_memory=enabled or something like that, instead of this being automatically injected to all non-optimized / non-sanitized builds with --define tcmalloc=disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a healthy discussion to have...my motivation was to help people find memory bugs when they aren't looking for them. If they are looking for them they can use valgrind which is superior but onerous.

However, as I explore using this in different compile modes, I am also running into trouble with what looks to me to be inconsistent usage of operator new/delete in protobufs. This results in segv before main in these contexts, whereas in others it works great.

At least until all the build/run modes are working, I think this suggestion makes sense and I'll have this off-by-default. I re-worked the build files around this strategy. WDYT?

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

jmarantz commented Jan 6, 2019

OK; I'm back to wanting to use the scribbling support built into tcmalloc :)

The reason I had backed off of that approach before is that I couldn't figure out how to have it just automatically work for debug builds due to the limitations of bazel select. But given that in this approach we are going to force the user to use bazel config overrides, we might as well use the support built into tcmalloc, which doesn't have this operator new/delete override messiness with protobufs.

Sorry about the churn, and thanks for all the comments along the way.

@jmarantz jmarantz closed this Jan 6, 2019
Copy link
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responding to these comments; though I've closed the PR.

inline uint64_t align(uint64_t size, uint64_t alignment) {
// Check that alignment is a power of 2:
// http://www.graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
assert((alignment > 0) && ((alignment & (alignment - 1)) == 0));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid assertion layers that might allocate memory in the context of a utility designed to help with memory allocation. Although actually the right thing here is to use static_assert, which I can do with a little template magic.


#include "common/memory/align.h"

static std::atomic<int64_t> bytes_allocated(0);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that a negative value was possible if a block of memory was subtly corrupted before it was freed. However it's better to assert when that happens than let this go negative. I'll fix.


#include "common/memory/align.h"

static std::atomic<int64_t> bytes_allocated(0);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who would consume that?

The main reason I'm tracking this at all is so that the memory-constraining tests can run in debug mode.

} // namespace Memory
} // namespace Envoy

#ifdef MEMORY_DEBUG_ENABLED
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


namespace {

constexpr uint32_t LiveMarker1 = 0xfeedface; // first 4 bytes after alloc
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://github.com/envoyproxy/envoy/blob/master/STYLE.md

The Google C++ style guide points out that constant vars should be named kConstantVar. In the Envoy codebase we use ConstantVar or CONSTANT_VAR. If you pick CONSTANT_VAR, please be certain the name is globally significant to avoid potential conflicts with #defines, which are not namespace-scoped, and may appear in externally controlled header files.

assert(marker != NULL);
marker[0] = LiveMarker1;
marker[1] = size;
uint32_t* ret = marker + 2;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK though I'm not sure if that's easier to read. The '2' is the logical progression from assigning marker[0] and then marker[1]; now we are working on what begins at index 2, which is the payload.

uint64_t rounded = Envoy::Memory::align(size, Overhead);
bytes_allocated -= rounded;
scribble(ptr, rounded, DeadMarker2);
assert(LiveMarker1 == marker[0]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to after the assignment of marker. I'm not sure what you meant in the second half of your suggestion.

namespace Memory {

uint64_t Stats::totalCurrentlyAllocated() { return 0; }
uint64_t Stats::totalCurrentlyAllocated() { return Debug::bytesUsed(); }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point...I reorganized it a bit; ptal.

// malloced memory is 0 get data that, when interpreted as pointers, will SEGV,
// and that will be easily seen in the debugger (0xfeedface pattern).
void* debugMalloc(uint64_t size) {
assert(size <= 0xffffffff); // For now we store the original size in a uint32_t.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't be hard to use all 8 bytes in the marker to store the size. But I'm curious why it makes you nervous? Do you think -- especially in a testing/debugging contest, that we are likely to see 4G requests to malloc? I would imagine that would not behave well, and it might not be a bad idea to get an assert fail while debugging or testing to become aware of this.

If that's OK with you I'll document the assert better. If you'd prefer to allow >=4G allocations I can change it.


constexpr uint32_t LiveMarker1 = 0xfeedface; // first 4 bytes after alloc
constexpr uint64_t LiveMarker2 = 0xfeedfacefeedface; // first 4 bytes after alloc
constexpr uint32_t DeadMarker1 = 0xabacabff; // first 4 bytes after free
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must have been in some distant debug scenario this distinction became interesting. But I can't pull it out of history, and so I'll just make there be one dead marker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants