Commit 2897bde

Fix alignment calculation in XNNWeightsCache (#15090)
### Summary

We're seeing crashes on Android when running XNNPACK-delegated models. I tracked it down to a bug in the alignment calculation for weight cache memory. To compute the aligned address, the code casts the void* to a (signed) intptr_t. When the address is in the upper half of the address space, the cast value is negative, so the modulo returns a negative value and the address is incremented too far, leading to an out-of-bounds access.

https://github.com/pytorch/executorch/blob/cc6cb837d6ac92f52a2d30a405900caf115f0556/backends/xnnpack/runtime/XNNWeightsCache.cpp#L166-L168

Walking through the numbers I captured in #14831 (see the standalone sketch below):

* The raw (unaligned) address of the data buffer is 0xb40000763d4bfa90.
* The target alignment is 64 bytes.
* Casting the address to intptr_t gives -5476376639047992688.
* Mod 64 is -48.
* The total offset applied is 64 - (-48) = 112.
* Since the allocation size is N + 64, increasing the start by 112 means the new region extends 48 bytes past the end of the allocation.

To resolve this, I replaced the alignment code with a call to std::align. Casting to uintptr_t also resolves it, but using the standard implementation seems less error prone.

### Test plan

I've validated that the repro in #14831 does not crash with this change.

Co-authored-by: Gregory Comer <[email protected]>
1 parent a728efa commit 2897bde
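A minimal standalone sketch (not part of the commit) that reproduces the arithmetic from the walkthrough above; the address constant is the one reported in #14831, and int64_t stands in for intptr_t on a 64-bit target:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // Upper-half address from the crash report; interpreted as a signed
  // 64-bit value it is -5476376639047992688.
  const uint64_t raw = 0xb40000763d4bfa90ull;
  const uint64_t alignment = 64;

  // Old (buggy) arithmetic: the signed modulo is -48, so the computed
  // offset is 64 - (-48) = 112, which overshoots the 64 bytes of padding.
  const int64_t buggy_offset = static_cast<int64_t>(alignment) -
      static_cast<int64_t>(raw) % static_cast<int64_t>(alignment);

  // Unsigned arithmetic stays in range: the offset is (64 - 16) % 64 = 48.
  const uint64_t fixed_offset = (alignment - raw % alignment) % alignment;

  std::printf("buggy offset: %lld bytes\n",
              static_cast<long long>(buggy_offset));
  std::printf("fixed offset: %llu bytes\n",
              static_cast<unsigned long long>(fixed_offset));
  return 0;
}
```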

File tree

1 file changed (+36, -9 lines)

backends/xnnpack/runtime/XNNWeightsCache.cpp

Lines changed: 36 additions & 9 deletions
@@ -11,6 +11,9 @@
 #include <executorch/runtime/core/memory_allocator.h>
 #include <sys/stat.h>
 #include <xnnpack.h>
+#include <exception>
+#include <memory>
+#include <new>
 #include <string>
 #include <vector>

@@ -155,21 +158,45 @@ size_t XNNWeightsCache::look_up(
   return packed_weight_entry->second.offset;
 }

+/**
+ * Reserve space in the weight cache for n bytes of weight data, aligned to
+ * context->kPackedAllocationAlignment. This function will return nullptr if
+ * the allocation fails.
+ */
 void* XNNWeightsCache::reserve_space(XNNWeightsCache* context, size_t n) {
   // MemoryAllocator* allocator = context->runtime_allocator_;
   // void* reserved_pointer = allocator->allocate(n,
   //     context->kPackedAllocationAlignment);

   // return reserved_pointer;
-  std::string data_container;
-  data_container.resize(n + context->kPackedAllocationAlignment);
-  void* maybe_aligned_space = data_container.data();
-  void* aligned_space = (void*)((intptr_t)maybe_aligned_space + 64 -
-      (intptr_t)maybe_aligned_space % 64);
-
-  context->packed_pointer_to_container_[aligned_space] =
-      std::move(data_container);
-  return aligned_space;
+  try {
+    std::string data_container;
+    size_t raw_allocation_size = n + context->kPackedAllocationAlignment - 1;
+    data_container.resize(raw_allocation_size);
+
+    void* maybe_aligned_space = data_container.data();
+    void* aligned_space = std::align(
+        context->kPackedAllocationAlignment,
+        n,
+        maybe_aligned_space,
+        raw_allocation_size // Note that std::align mutates this value.
+    );
+    ET_CHECK_MSG(aligned_space != nullptr, "Memory alignment failed.");
+
+    context->packed_pointer_to_container_[aligned_space] =
+        std::move(data_container);
+    return aligned_space;
+  } catch (std::bad_alloc& e) {
+    // XNNPACK can gracefully handle allocation failures, so return nullptr.
+    // We want to be able to recover from a failed attempt to load a large
+    // model without a crash.
+    ET_LOG(
+        Error,
+        "XNN weight cache failed to allocate %zu bytes: %s.",
+        n,
+        e.what());
+    return nullptr;
+  }
 }

 size_t XNNWeightsCache::look_up_or_insert(

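For context, and not part of the commit itself, here is a minimal sketch of the std::align pattern the new code relies on, using hypothetical names: the function advances the pointer to the next aligned boundary, shrinks the remaining space by the bytes it skipped, and returns nullptr if fewer than the requested bytes remain. Over-allocating by alignment - 1 bytes, as the diff does, guarantees an aligned region of the requested size always fits.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>

int main() {
  constexpr std::size_t kAlignment = 64;  // stand-in for kPackedAllocationAlignment
  const std::size_t n = 1000;             // illustrative payload size in bytes

  // Over-allocate so an n-byte region at a 64-byte boundary always fits.
  // std::string mirrors the container used in the diff; non-const data()
  // requires C++17.
  std::string buffer;
  buffer.resize(n + kAlignment - 1);

  void* ptr = buffer.data();
  std::size_t space = buffer.size();

  // std::align bumps ptr forward to the next kAlignment boundary and
  // decrements space by the number of bytes skipped; it returns nullptr
  // if the aligned region would not fit.
  void* aligned = std::align(kAlignment, n, ptr, space);

  std::printf("aligned pointer: %p, space remaining: %zu\n", aligned, space);
  return aligned != nullptr ? 0 : 1;
}
```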