
remove use of variable length arrays #4870

Merged: mattklein123 merged 3 commits into envoyproxy:master from greenhouse-org:stack-array on Oct 29, 2018

Conversation

@sesmith177 (Member):

Description:

After #4645 was merged, @jmarantz noticed that the STACK_ALLOC_ARRAY macro was incorrect on Windows -- it was not properly initializing the memory for the array and the destructors of any array elements would not get called. This PR fixes the macro to:

  1. Use alloca to allocate the memory on the stack
  2. Wrap the buffer with a wrapper template class that handles constructing / destructing the elements of the array

We also update the Linux build to use this alloca+wrapper class strategy, which lets us add "-Wvla" to the POSIX copts to catch further uses of variable length arrays.

Risk Level:
Low - should be no change in behavior
Testing:
bazel build //source/exe:envoy-static && bazel test //test/...
Docs Changes:
N/A
Release Notes:
N/A

Variable length arrays are not part of C++, so replace them with a macro
that:
1. Uses alloca to allocate the memory on the stack
2. Wraps the buffer with a wrapper template class that handles
constructing / destructing the elements of the array

Add "-Wvla" to the POSIX copts

Signed-off-by: Natalie Arellano <narellano@pivotal.io>
Signed-off-by: Sam Smith <sesmith177@gmail.com>
@jmarantz (Contributor):

Haven't looked at code yet, but if you are going to change the linux path we should probably get @mattklein123's buy-in before even reviewing, and likely we'll want some perf tests showing the impact, a la #4867

@mattklein123 mattklein123 self-assigned this Oct 27, 2018
@mattklein123 (Member):

> Haven't looked at code yet, but if you are going to change the linux path we should probably get @mattklein123's buy-in before even reviewing, and likely we'll want some perf tests showing the impact, a la #4867

Presumably what this code is doing with a wrapper around alloca() is essentially what I imagine gcc/clang are doing when generating VLA code, so I doubt there will be much/any perf difference. So at a high level it's fine with me. @jmarantz are you interested in taking a first pass on this?

template <typename T> class StackArray {
public:
  StackArray(void* buf, size_t num_items) : num_items_(num_items) {
    RELEASE_ASSERT(buf != nullptr, "StackArray received null pointer");
Member:
This can be a normal ASSERT IMO

@jmarantz (Contributor) left a comment:
This looks awesome. I just have some trivial optimization suggestions and test nits.


#if !defined(WIN32)
#include <alloca.h>

Contributor:
drop this blank line?

Member Author:
it looks like the tools/check_format.py fix script is adding this new line back

Contributor:
i see... np then.

StackArray(void* buf, size_t num_items) : num_items_(num_items) {
  RELEASE_ASSERT(buf != nullptr, "StackArray received null pointer");
  buf_ = static_cast<T*>(buf);
  for (size_t i = 0; i < num_items_; i++) {
@jmarantz (Contributor), Oct 28, 2018:
as a micro-optimization (and to look more like STL impls I've peeked into) you could store end_ rather than num_items_. Then you'd loop with a range loop here and in the dtor.

 for (T& ref : *this) {
   new (&ref) T;
 }

The micro-optimization is that there are now two arithmetic operations per loop (bumping the pointer and comparing to end) rather than three (incrementing the index, adding it to the base, and comparing to num_items_). Of course the compiler may very well be able to optimize both to the same instructions, but using the range-loop syntax seems like a benefit too :)

Contributor:
Sorry that was confusing. This is really two independent suggestions:

  1. Use a range-loop for iterating in the ctor and dtor
  2. Now there are no references to num_items_ anywhere in your impl except for the implementation of end(), so just store end_ rather than num_items_.

namespace Envoy {

// This macro is intended to be used as a replacement for variable-length arrays.
// Note that the StackArray wrapper object will be destructed when it leaves
Contributor:
also each element's dtor will be called.

~TestEntry() { destructor_(val_); }

int val_ = 0;
bool constructed_ = false;
@jmarantz (Contributor), Oct 28, 2018:
this seems strange as there won't be a time when the test-method has control where constructed_ will be false.

I thought for a couple of minutes how to gain confidence your ctor got run. What you are doing below would also work if StackArray failed to construct anything, but the stack-memory had random garbage in it (which would be interpreted as truthy).

Maybe have a member-var TestEntry* self_, and init it in the ctor to this. Then you can check that happened.

Signed-off-by: Amin Jamali <ajamali@pivotal.io>
Signed-off-by: Sam Smith <sesmith177@gmail.com>
@mattklein123 (Member) left a comment:
Thanks this is a great cleanup. One question.

// Note that the StackArray wrapper object will be destructed and each element's
// destructor will be called when it leaves scope. However, the memory containing
// the array won't be deallocated until the function containing the macro returns
#define STACK_ARRAY(var, type, num) StackArray<type> var(::alloca(sizeof(type) * num), num)
Member:
Thinking about this, do we actually need this macro? Couldn't we just specify the template instantiation directly with type and num for each var and have ::alloca be done as part of the constructor? Or does that not work because the alloca would be cleaned up on constructor exit? If that's the case, do you mind adding a small comment to that effect?

Member Author:
That's correct: if we called alloca in the constructor, the memory would be freed when the constructor returns.

Member:
OK makes sense thanks.

Signed-off-by: Sam Smith <sesmith177@gmail.com>
Signed-off-by: Amin Jamali <ajamali@pivotal.io>
@mattklein123 mattklein123 merged commit 8a95f0e into envoyproxy:master Oct 29, 2018
@sesmith177 sesmith177 deleted the stack-array branch November 28, 2018 20:56
yuval-k pushed a commit to solo-io/envoy-gloo that referenced this pull request Dec 6, 2018
1. Upgrade Envoy.
2. Remove use of variable length arrays.
   See envoyproxy/envoy#4870
3 participants