Skip to content

test/access_loggers/stats: replace tcmalloc consumedBytes assertions with sizeof structural bounds#44708

Closed
Copilot wants to merge 2 commits intomainfrom
copilot/fix-memory-footprint-tests
Closed

test/access_loggers/stats: replace tcmalloc consumedBytes assertions with sizeof structural bounds#44708
Copilot wants to merge 2 commits intomainfrom
copilot/fix-memory-footprint-tests

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 28, 2026

GaugeKeyTest.ExactMemoryFootprint and StatsAccessLoggerTest.AccessLogStateMemoryFootprint used Memory::TestUtil::MemoryTest::consumedBytes() with hand-tuned tcmalloc fudge factors ("384 gives a generous 1.5x buffer for allocator page alignment", "We allow 4096 bytes for allocator caches"). This is the same anti-pattern removed for StringMatcher in #44701 — it produces CI flakes and needs re-tuning on every tcmalloc/abseil/protobuf bump.

Changes

  • GaugeKeyTest::ExactMemoryFootprintGaugeKeyTest::SizeIsBounded: drop all three MemoryTest/consumedBytes() blocks; replace with a single portable structural bound:

    EXPECT_THAT(sizeof(GaugeKey), MemNotMoreThan(8 * sizeof(void*)));
  • StatsAccessLoggerTest::AccessLogStateMemoryFootprintStatsAccessLoggerTest::AccessLogStateSizeIsBounded: remove the 10 K-item insertion/removal loop and all consumedBytes() assertions (including the two that bypassed EXPECT_MEMORY_LE with raw EXPECT_LE); replace with:

    EXPECT_THAT(sizeof(AccessLogState), MemNotMoreThan(16 * sizeof(void*)));
    EXPECT_THAT(sizeof(GaugeKey),       MemNotMoreThan(8 * sizeof(void*)));

    Comments document the InflightGauge layout (private struct) and explain that leak-after-destruction is already covered by ASAN/MSAN/LSAN in CI.

  • Add MATCHER_P(MemNotMoreThan, sz, ...) (identical to test: replace flaky StringMatcher.Memory with deterministic sizeof bounds #44701): failure message prompts the reviewer to think carefully before raising the bound.

  • Remove #include "test/common/memory/memory_test_utility.h" — no remaining callers.

All bounds are expressed in sizeof(void*) units for portability across libc++/libstdc++ and 32/64-bit builds.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dl.google.com
    • Triggering command: /build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/bin/java bazel(envoy) --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.config.file=/build/bazel_root/base/javalog.properties -Dcom.google.devtools.build.lib.util.LogHandlerQuerier.class=com.google.devtools.build.lib.util.SimpleLogHandler$HandlerQuerier -XX:-MaxFDLimit -Djava.library.path=/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib/server:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/ -Dfile.encoding=ISO-8859-1 -Duser.country= -Duser.language= -Duser.variant= -Xmx3g -DBAZEL_TRACK_SOURCE_DIRECTORIES=1 -Djavax.net.ssl.trustStore=/tmp/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit -jar /build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/A-server.jar --max_idle_secs=10800 --noshutdown_on_low_sys_mem --connect_timeout_secs=30 (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

Background

PR #44701 replaced TEST_F(StringMatcher, Memory) with StringMatcher.SizeIsBounded, removing tcmalloc page-level consumedBytes() assertions with hand-tuned ceilings (which produced ~5/1000 CI flakes and had to be re-tuned on every tcmalloc/abseil/protobuf bump — see #37782, #43467, #44528).

The same anti-pattern exists in test/extensions/access_loggers/stats/stats_test.cc, in two places:

  1. A small EXPECT_MEMORY_LE(memory_test.consumedBytes(), 64); block (just before AccessLogStateMemoryFootprint).
  2. TEST_F(StatsAccessLoggerTest, AccessLogStateMemoryFootprint) — which mixes a deterministic EXPECT_LE(sizeof(AccessLogState), 128) static check with multiple consumedBytes()-based assertions whose comments explicitly call out allocator fudge factors:
    • "384 gives a generous 1.5x buffer for allocator page alignment."
    • "We allow 4096 bytes for allocator caches."
    • "approximately 1.6 Megabytes for 10,000 items ... threshold of 2 Megabytes here to account for this capacity and allocator page alignment."

It also bypasses EXPECT_MEMORY_LE in two spots with raw EXPECT_LE(static_cast<int64_t>(memory_test.consumedBytes()), …) and EXPECT_LE(memory_test.consumedBytes(), 4096), which skips the canonical-platform guard that EXPECT_MEMORY_LE provides — so on non-canonical platforms those lines are effectively asserting tcmalloc page math directly, which is exactly the flake source we want to remove.

The test is young (added Feb 2026 in #42226) so it hasn't been re-tuned yet, but it will start needing the same toolchain-bump tuning that #44701 retired.

Task

Apply the same approach used in #44701 to test/extensions/access_loggers/stats/stats_test.cc:

  • Replace the Memory::TestUtil::MemoryTest / consumedBytes() based assertions with deterministic sizeof-based structural bounds, expressed in sizeof(std::string) / sizeof(void*) / sizeof(some_type) for libc++ / libstdc++ / 32-bit / 64-bit portability.
  • Use a MATCHER_P(MemNotMoreThan, sz, ...) (or equivalent) similar to the one introduced in PR test: replace flaky StringMatcher.Memory with deterministic sizeof bounds #44701 so the failure message tells the maintainer to "think carefully before increasing this, and if you're sure, update the corresponding expectation."
  • Preserve the intent of each existing assertion — i.e. the bound should still describe the structural footprint the test was trying to gate on (e.g. that AccessLogState and the GaugeKey/InflightGauge slot types stay small). The existing comments documenting why the bounds exist (slot composition, what the per-item overhead is, etc.) should be kept and re-cast in terms of sizeof of the relevant types instead of measured tcmalloc bytes.
  • Where the existing test was checking "no leak after destruction" via consumedBytes(), drop the tcmalloc-based check entirely if it cannot be re-expressed deterministically. The destructor invariants the test was trying to assert are better covered by ASAN/MSAN/LSAN, which already run on this code in CI; preserving a flaky leak check here adds no real safety net. Add a brief comment explaining this.
  • Keep the test function name(s) descriptive of what they actually now check (e.g. rename AccessLogStateMemoryFootprint to AccessLogStateSizeIsBounded or similar, mirroring the rename in test: replace flaky StringMatcher.Memory with deterministic sizeof bounds #44701).
  • Remove #include "test/common/memory/memory_test_utility.h" if no other test in the file still needs it after the change.
  • If gmock/gmock.h is not already included, add it (needed for EXPECT_THAT / MATCHER_P).

Reference

The PR to mirror: #44701 — see the diff to test/common/common/matchers_test.cc for the exact style, the MemNotMoreThan matcher, and the bound expressions to follow.

Out of scope

  • Do not touch the legitimate regression-gating memory tests (test/integration/stats_integration_test.cc ClusterMemoryTest, test/common/stats/thread_local_store_test.cc MemoryWith*RealSymbolTable, test/common/stats/symbol_table_impl_test.cc SymbolTableTest.Memory). Those are intentional golden-value trackers and are correctly using EXPECT_MEMORY_EQ / EXPECT_MEMORY_LE against pinned values.
  • Do not touch test/common/common/packed_struct_test.cc in this PR.

Validation

  • bazel test //test/extensions/access_loggers/stats:stats_test should pass.
  • Confirm no remaining uses of Memory::TestUtil::MemoryTest / consumedBytes() / EXPECT_MEMORY_LE / EXPECT_MEMORY_EQ in the modified file.
  • Format/lint per repo conventions.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: we just landed this one - which i think is a great improvement

i think there is another similar (perhaps ~adjacent) test that does the same anti-pattern and should be fixed

Assistant: Looking at the PR you just landed, the anti-pattern was: **using Memory::TestUtil::MemoryTest::consumedBytes() (tcmalloc page-level accounting) with hand-tuned ceilings + alignment fu...

This pull request was created from Copilot chat.

@repokitteh-read-only
Copy link
Copy Markdown

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #44708 was opened by Copilot.

see: more, trace.

…ds in stats_test.cc

Agent-Logs-Url: https://github.com/envoyproxy/envoy/sessions/636ffcaf-9841-4c03-8ab2-af7842e7e936

Co-authored-by: phlax <454682+phlax@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix memory footprint tests in access logger test/access_loggers/stats: replace tcmalloc consumedBytes assertions with sizeof structural bounds Apr 28, 2026
Copilot AI requested a review from phlax April 28, 2026 18:04
@phlax phlax closed this Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants