Memory leak test: take fluctuations into account #1757

giampaolo · 2020-05-12T19:45:14Z

Preamble

We have a memory leak test suite which calls a function many times and fails if the process memory increased. We do this in order to detect missing free() or Py_DECREF calls in the C modules. When such a call is missing, we have a memory leak.

The problem

A problem we've been having for probably over 10 years is false positives. That's because memory fluctuates. Sometimes it may increase (or even decrease!) due to how the OS handles memory, Python's garbage collector, the fact that RSS is an approximation, and who knows what else. So far we have tried to compensate for that by using the following logic:

warmup (call fun 10 times)
call the function many times (1000)
if memory increased after those 1000 calls, keep calling the function for another 3 secs
if it still increased at all (> 0) then fail

This logic didn't really solve the problem, as we still had occasional false positives, especially lately on FreeBSD.
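
For reference, the old logic can be sketched roughly as follows. This is a minimal illustration and not the actual test suite code: `old_style_check` and its parameters are hypothetical names, `get_mem` stands for any callable returning the current process memory in bytes, and "still increased" is interpreted here as growth during the extra 3 seconds.

```python
import time


def old_style_check(fun, get_mem, times=1000, warmup_times=10, retry_for=3.0):
    """Return True if the old logic would report a leak (hypothetical sketch)."""
    # Warm up, so that one-off allocations (caches, lazy imports) don't count.
    for _ in range(warmup_times):
        fun()

    # Call the function many times and compare memory before/after.
    mem_before = get_mem()
    for _ in range(times):
        fun()
    mem_after = get_mem()
    if mem_after <= mem_before:
        return False  # no increase: pass

    # Memory increased: keep calling the function for `retry_for` seconds.
    stop_at = time.monotonic() + retry_for
    while time.monotonic() < stop_at:
        fun()

    # Any growth at all (> 0) during the extra period counts as a leak.
    return get_mem() > mem_after
```

Because the final check uses a threshold of 0, a single small fluctuation during the extra period is enough to fail the test, which is consistent with the false positives described above.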

The solution

This PR changes the internal algorithm so that in case of failure (mem > 0 after calling fun() N times) we retry the test up to 5 times, increasing N (repetitions) each time, and consider it a failure only if the memory keeps increasing between runs. For instance, here's a legitimate failure:

psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_disk_partitions ... 
Run #1: extra-mem=696.0K, per-call=3.5K, calls=200
Run #2: extra-mem=1.4M, per-call=3.5K, calls=400
Run #3: extra-mem=2.1M, per-call=3.5K, calls=600
Run #4: extra-mem=2.7M, per-call=3.5K, calls=800
Run #5: extra-mem=3.4M, per-call=3.5K, calls=1000
FAIL

If, on the other hand, the memory increased on one run (say 200 calls) but the increase was smaller on the next run (say 400 calls), then it's clearly a false positive. Memory consumption may still be > 0 on the second run, but if it's lower than on the previous run with fewer repetitions, it cannot possibly represent a leak (just a fluctuation):

psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_net_connections ... 
Run #1: extra-mem=568.0K, per-call=2.8K, calls=200
Run #2: extra-mem=24.0K, per-call=61.4B, calls=400
OK
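
The retry logic described above could look roughly like this. It is a minimal sketch under stated assumptions, not the code in this PR: `check_leak`, `get_mem` and the default values are illustrative names, chosen so that the number of calls grows 200, 400, ... 1000 as in the output shown above.

```python
def check_leak(fun, get_mem, times=200, retries=5, warmup_times=10):
    """Return (leaked, runs); runs is a list of (run #, extra mem, calls).

    Hypothetical sketch: `get_mem` is any callable returning the current
    process memory in bytes.
    """
    # Warm up, so that one-off allocations don't count as a leak.
    for _ in range(warmup_times):
        fun()

    runs = []
    increase = times  # add this many calls on every retry
    prev_diff = 0
    for idx in range(1, retries + 1):
        mem_before = get_mem()
        for _ in range(times):
            fun()
        diff = get_mem() - mem_before
        runs.append((idx, diff, times))

        # Success if memory did not grow, or if it grew less than on the
        # previous run despite more calls (a fluctuation, not a leak).
        if diff <= 0 or diff <= prev_diff:
            return False, runs

        prev_diff = diff
        times += increase

    # Memory kept increasing on every run: report a leak.
    return True, runs
```

With psutil, `get_mem` could be something like `lambda: psutil.Process().memory_info().rss`. A function that leaks a fixed amount per call fails all 5 runs, whereas a one-off bump on the first run is dismissed on the second.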

Note about mallinfo()

Aka #1275. mallinfo() on Linux is supposed to provide metrics about how many bytes get allocated on the heap by malloc(), so it's supposed to be way more precise than RSS and also USS. In another branch where I exposed it, I verified that fluctuations still occur even when using mallinfo(), although less often. So even mallinfo() would not guarantee 100% stability.

giampaolo added 13 commits May 10, 2020 20:25

get rid of retry_for

614ad1d

refactor

731e082

fix tests

c90be83

fix unclosed subprocess.Popen instance

5dcc531

fix imports

e98ff45

remove dead code

3542353

refactor GetAdapterAddresses

8383159

adjust logic

6d49aac

ajust logic

72ce371

adjust test

368380c

fix flake8

5484498

get rid of redirect_stderr

12353e9

adjust test

a78c1fd

giampaolo mentioned this pull request May 12, 2020

Expose malloc statistics #1275

Open

hack around terminate()/wait() on win

dae1d23

giampaolo added the tests label May 12, 2020

giampaolo added 2 commits May 12, 2020 23:48

enable memleak tests on appveyor/win

3fdd6c4

take into account class attributes

5e5f7f1

giampaolo merged commit 6adcca6 into master May 12, 2020

giampaolo deleted the memleak-adjust branch May 12, 2020 22:40

giampaolo added the memleak label Nov 16, 2020
