Commit 2897bde

Fix alignment calculation in XNNWeightsCache (#15090)
### Summary

We're seeing crashes on Android when running XNNPACK-delegated models. I tracked it down to a bug in the alignment calculation for weight cache memory. To compute the aligned address, the code casts the void* to a (signed) intptr_t. When the address is in the upper half of the address space, the cast value is negative, so the modulo returns a negative value and the address is incremented too far, leading to an out-of-bounds access.

https://github.com/pytorch/executorch/blob/cc6cb837d6ac92f52a2d30a405900caf115f0556/backends/xnnpack/runtime/XNNWeightsCache.cpp#L166-L168

Walking through the numbers I captured in #14831 (see the standalone sketch below):

* The raw (unaligned) address of the data buffer is 0xb40000763d4bfa90.
* The target alignment is 64 bytes.
* Casting the address to intptr_t gives -5476376639047992688.
* Mod 64 is -48.
* The total offset applied is 64 - (-48) = 112.
* Since the allocation size is N + 64, increasing the start by 112 means the new region extends 48 bytes past the end of the allocation.

To resolve this, I replaced the alignment code with a call to std::align. Casting to uintptr_t also resolves it, but using the standard implementation seems less error prone.

### Test plan

I've validated that the repro in #14831 does not crash with this change.

Co-authored-by: Gregory Comer <[email protected]>
1 parent a728efa commit 2897bde
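A minimal standalone sketch (not part of the commit) that reproduces the arithmetic from the walkthrough above; the address constant is the one reported in #14831, and int64_t stands in for intptr_t on a 64-bit target:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // Upper-half address from the crash report; interpreted as a signed
  // 64-bit value it is -5476376639047992688.
  const uint64_t raw = 0xb40000763d4bfa90ull;
  const uint64_t alignment = 64;

  // Old (buggy) arithmetic: the signed modulo is -48, so the computed
  // offset is 64 - (-48) = 112, which overshoots the 64 bytes of padding.
  const int64_t buggy_offset = static_cast<int64_t>(alignment) -
      static_cast<int64_t>(raw) % static_cast<int64_t>(alignment);

  // Unsigned arithmetic stays in range: the offset is (64 - 16) % 64 = 48.
  const uint64_t fixed_offset = (alignment - raw % alignment) % alignment;

  std::printf("buggy offset: %lld bytes\n",
              static_cast<long long>(buggy_offset));
  std::printf("fixed offset: %llu bytes\n",
              static_cast<unsigned long long>(fixed_offset));
  return 0;
}
```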

File tree

1 file changed (+36, -9 lines)

backends/xnnpack/runtime/XNNWeightsCache.cpp

Lines changed: 36 additions & 9 deletions
@@ -11,6 +11,9 @@
 #include <executorch/runtime/core/memory_allocator.h>
 #include <sys/stat.h>
 #include <xnnpack.h>
+#include <exception>
+#include <memory>
+#include <new>
 #include <string>
 #include <vector>

@@ -155,21 +158,45 @@ size_t XNNWeightsCache::look_up(
   return packed_weight_entry->second.offset;
 }

+/**
+ * Reserve space in the weight cache for n bytes of weight data, aligned to
+ * context->kPackedAllocationAlignment. This function will return nullptr if
+ * the allocation fails.
+ */
 void* XNNWeightsCache::reserve_space(XNNWeightsCache* context, size_t n) {
   // MemoryAllocator* allocator = context->runtime_allocator_;
   // void* reserved_pointer = allocator->allocate(n,
   //     context->kPackedAllocationAlignment);

   // return reserved_pointer;
-  std::string data_container;
-  data_container.resize(n + context->kPackedAllocationAlignment);
-  void* maybe_aligned_space = data_container.data();
-  void* aligned_space = (void*)((intptr_t)maybe_aligned_space + 64 -
-      (intptr_t)maybe_aligned_space % 64);
-
-  context->packed_pointer_to_container_[aligned_space] =
-      std::move(data_container);
-  return aligned_space;
+  try {
+    std::string data_container;
+    size_t raw_allocation_size = n + context->kPackedAllocationAlignment - 1;
+    data_container.resize(raw_allocation_size);
+
+    void* maybe_aligned_space = data_container.data();
+    void* aligned_space = std::align(
+        context->kPackedAllocationAlignment,
+        n,
+        maybe_aligned_space,
+        raw_allocation_size // Note that std::align mutates this value.
+    );
+    ET_CHECK_MSG(aligned_space != nullptr, "Memory alignment failed.");
+
+    context->packed_pointer_to_container_[aligned_space] =
+        std::move(data_container);
+    return aligned_space;
+  } catch (std::bad_alloc& e) {
+    // XNNPACK can gracefully handle allocation failures, so return nullptr.
+    // We want to be able to recover from a failed attempt to load a large
+    // model without a crash.
+    ET_LOG(
+        Error,
+        "XNN weight cache failed to allocate %zu bytes: %s.",
+        n,
+        e.what());
+    return nullptr;
+  }
 }

 size_t XNNWeightsCache::look_up_or_insert(

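For context, and not part of the commit itself, here is a minimal sketch of the std::align pattern the new code relies on, using hypothetical names: the function advances the pointer to the next aligned boundary, shrinks the remaining space by the bytes it skipped, and returns nullptr if fewer than the requested bytes remain. Over-allocating by alignment - 1 bytes, as the diff does, guarantees an aligned region of the requested size always fits.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>

int main() {
  constexpr std::size_t kAlignment = 64;  // stand-in for kPackedAllocationAlignment
  const std::size_t n = 1000;             // illustrative payload size in bytes

  // Over-allocate so an n-byte region at a 64-byte boundary always fits.
  // std::string mirrors the container used in the diff; non-const data()
  // requires C++17.
  std::string buffer;
  buffer.resize(n + kAlignment - 1);

  void* ptr = buffer.data();
  std::size_t space = buffer.size();

  // std::align bumps ptr forward to the next kAlignment boundary and
  // decrements space by the number of bytes skipped; it returns nullptr
  // if the aligned region would not fit.
  void* aligned = std::align(kAlignment, n, ptr, space);

  std::printf("aligned pointer: %p, space remaining: %zu\n", aligned, space);
  return aligned != nullptr ? 0 : 1;
}
```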