Trying to understand memory usage with php-fpm #18

Open
chregu opened this issue Nov 28, 2017 · 7 comments
Comments

@chregu
Contributor

chregu commented Nov 28, 2017

We're using the php-vips-ext together with php-fpm for image manipulation of pictures, mainly for web usage (meaning, they're not huge). All works fine and fast now, but I'm trying to understand the memory usage.

I have currently set vips_cache_set_max_mem to 50MB and vips_cache_set_max_files to 100 (just as a first start). What I see now is that some php-fpm processes are using way more than that after some time (the current max is 340MB); it was way less than that before.

I also see a value of 250MB in vips_tracked_get_mem_highwater (not sure that's the one corresponding to the 340MB of used memory, but good enough for getting the idea), while vips_tracked_get_mem is around 50MB.

Is something not releasing that highwater memory? Maybe PHP itself, for performance reasons? I can work around it by just killing php-fpm after a few hundred requests, that's totally fine; I'm just trying to understand what's happening.

btw, I use https://github.com/pixelb/ps_mem for getting the memory usage of a process, which seems pretty accurate.

@jcupitt
Member

jcupitt commented Nov 28, 2017

Hello Christian, the vips_cache_set_*() functions control how the operation cache is trimmed; they don't set the maximum amount of memory that libvips can use.

libvips images are immutable, meaning you can make new ones but you can't modify existing ones*. This makes caching and concurrency much simpler, and also means libvips can memoize operation calls. This can produce a very nice speedup in some circumstances.

libvips keeps references to the last 1,000 operations in a table. When you execute something like:

$result = $image->avg();

libvips will construct the call to the avg (find image average) operator, then search the cache for a previous call with the same arguments. If it finds one, the new avg call is discarded and it just returns a new reference to the old result.

This works well for things like newFromFile(): the operation cache will share images between different parts of your program behind your back. You also get common-subexpression elimination -- for example, in something like:

$a = $image->add(2);
$result = $image->add(2)->add($a);

The add(2) will only be computed once.

Most libvips operation calls are very lightweight (a few hundred bytes), so keeping the last 1,000 around is no problem, but some can be expensive. To try to give users control, the cache tracks the amount of memory (actually, the amount of memory in pixel buffers), the number of open files, and the number of cached operations.

When a new operation is added to the cache, if any of those three limits is exceeded, the least-recently-used operation is repeatedly discarded until libvips is back within limits, or the cache is empty.
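
The trimming rule described here can be sketched in plain PHP. This is a toy model, not libvips code: the class name ToyOperationCache, its methods and the limits passed to it are all made up for illustration, and each entry simply declares how much memory and how many files it holds.

```php
<?php
// Toy model of the trimming rule described above. This is NOT libvips code:
// the class, its methods and the limits are made up for illustration.
final class ToyOperationCache
{
    // key => ['mem' => bytes, 'files' => count], least recently used first
    private $entries = [];
    private $maxOps;
    private $maxMem;
    private $maxFiles;

    public function __construct(int $maxOps, int $maxMem, int $maxFiles)
    {
        $this->maxOps = $maxOps;
        $this->maxMem = $maxMem;
        $this->maxFiles = $maxFiles;
    }

    // Look up a previous call with the same arguments.
    // A hit is moved to the most-recently-used end.
    public function lookup(string $key): bool
    {
        if (!isset($this->entries[$key])) {
            return false;
        }
        $entry = $this->entries[$key];
        unset($this->entries[$key]);
        $this->entries[$key] = $entry;
        return true;
    }

    // Add an operation, then drop least-recently-used entries until all
    // three limits hold again or the cache is empty.
    public function add(string $key, int $mem, int $files): void
    {
        unset($this->entries[$key]);
        $this->entries[$key] = ['mem' => $mem, 'files' => $files];
        while ($this->entries && $this->overLimit()) {
            reset($this->entries);
            unset($this->entries[key($this->entries)]);
        }
    }

    public function size(): int
    {
        return count($this->entries);
    }

    private function overLimit(): bool
    {
        return count($this->entries) > $this->maxOps
            || array_sum(array_column($this->entries, 'mem')) > $this->maxMem
            || array_sum(array_column($this->entries, 'files')) > $this->maxFiles;
    }
}

// Limits: 1,000 operations, 50 MB of pixel buffers, 100 open files.
$cache = new ToyOperationCache(1000, 50 * 1024 * 1024, 100);
$cache->add('add(2)', 40 * 1024 * 1024, 0);
$cache->add('avg()', 20 * 1024 * 1024, 0); // 60 MB in total: 'add(2)' is evicted
var_dump($cache->lookup('add(2)')); // bool(false)
var_dump($cache->lookup('avg()'));  // bool(true)
```

Note that the model only ever drops cached entries. Memory held by images that are still in use is outside its reach, which is why the limits are not a cap on total memory.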

Obviously, if you are actively using 1GB of pixel buffers, emptying the operation cache is not going to help :-(

* you can modify images with the draw_* operations. They work like actions in a paint program: they map the image into memory, then directly modify it. If you call a draw operation on an image, it emits the "invalidate" signal and is automatically knocked out of the operation cache.

@chregu
Contributor Author

chregu commented Nov 29, 2017

Thanks for this extensive answer. It makes some things clearer, but not everything. What I observe is that some php-fpm instances allocate lots of memory after some time (300+ MB), which never seems to get freed. It doesn't increase constantly, so it's most certainly not a memory leak. But it may come from PHP not really releasing memory once it has been allocated, and keeping it for later reuse. Could that be it? I'm fishing in the dark here.

But let me try to find a concrete example. Maybe that makes it easier to illustrate.

@chregu
Contributor Author

chregu commented Nov 29, 2017

btw, nothing tragic right now. We let the php-fpm manager shut down a child after a few hundred requests, so it's not filling up our memory.

@jcupitt
Member

jcupitt commented Nov 29, 2017

You could try vips_cache_set_max(0); to turn off the operation cache and see if that changes memory behaviour significantly.

As long as memory use is not rising, I think I would be inclined to just leave it.
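
One way to run that experiment under php-fpm is to log libvips's own counters at the end of each request. The sketch below assumes php-vips-ext is loaded and provides vips_cache_set_max(), vips_tracked_get_mem() and vips_tracked_get_mem_highwater(), the functions named in this thread; vipsMemoryReport() is a hypothetical helper, not part of php-vips. The function_exists() checks let it run, with less output, when the extension is missing.

```php
<?php
// Sketch only: assumes the php-vips-ext functions named in this thread are
// available. vipsMemoryReport() is a hypothetical helper, not part of php-vips.
function vipsMemoryReport(): array
{
    $report = ['php_peak_bytes' => memory_get_peak_usage(true)];
    // Memory allocated by libvips does not go through PHP's own allocator,
    // so read the libvips counters separately when the extension is loaded.
    if (function_exists('vips_tracked_get_mem')) {
        $report['vips_tracked_bytes'] = vips_tracked_get_mem();
    }
    if (function_exists('vips_tracked_get_mem_highwater')) {
        $report['vips_highwater_bytes'] = vips_tracked_get_mem_highwater();
    }
    return $report;
}

if (function_exists('vips_cache_set_max')) {
    vips_cache_set_max(0); // turn the operation cache off for this test
}

// ... the image processing for the request goes here ...

// Write one log line per request, tagged with the worker's process id.
register_shutdown_function(function () {
    error_log('pid ' . getmypid() . ' ' . json_encode(vipsMemoryReport()));
});
```

Comparing these log lines per worker pid with the cache on and off should show whether the operation cache accounts for a meaningful share of the memory.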

@chregu
Contributor Author

chregu commented Nov 30, 2017

So, I tried to make it somehow reproducible, but it's not that easy. Here's an attempt.

I have this script:

$hashes = ["833ccf", "4e2e10", ... ];
vips_cache_set_max(0);
$file = $hashes[array_rand($hashes)]. ".jpg";
$vips = \Jcupitt\Vips\Image::newFromBuffer(file_get_contents($file));
$vips = $vips->resize(mt_rand(0, 1000) / 1000, ['vscale' => mt_rand(0, 1000) / 1000]);
$vips = $vips->similarity(['angle' => mt_rand(0, 359)]);
header("Content-Type: image/jpeg");
print $vips->jpegsave_buffer(['strip' => true, 'Q' => 80, 'interlace' => true]);

$hashes is an array of 200 picture hashes to be loaded, so that it's not always loading the same one. The pictures are 2000x2000 pixels. I resize them randomly and rotate them just to have some different operations each time.

We use newFromBuffer because that's what we use on our site; newFromFile doesn't really make a difference in memory consumption.

I then hit this script 5,000 times with Apache's ab and check the memory usage of the php-fpm processes.
Result (each value is the number of MiB one php-fpm process uses):

 273.6 | 230.9 | 256.3 | 264.9 |

After some time, those values are quite constant. And they seem pretty high to me, for a turned-off cache anyway.

When I do the same with imagick instead of vips (the same operations):

130.3 | 127.7 | 152.5 | 151.7 |

(The values were taken after I stopped the ab process, so the workers aren't doing anything at that moment.)

The 250MiB aren't a real problem, but on our "image server", with many more different images and operations, that value keeps increasing (the highest is currently 560MB), and I'm mainly wondering whether that's normal or whether something is suspicious.

@jcupitt
Member

jcupitt commented Dec 1, 2017

Hello Christian, I tried a soak test here:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips;

vips_cache_set_max(0);

foreach ($argv as $filename) {
    // skip $argv[0] (the script itself) and anything else that is not a .jpg
    if (pathinfo($filename, PATHINFO_EXTENSION) !== 'jpg') {
        continue;
    }

    echo "loop " . $filename . " ... \n";

    $data_in = file_get_contents($filename);

    $image = Vips\Image::newFromBuffer($data_in);
    $image = $image->resize(
        mt_rand(1, 1000) / 1000,
        ['vscale' => mt_rand(1, 1000) / 1000]
    );
    $image = $image->similarity(['angle' => mt_rand(0, 359)]);
    $data_out = $image->jpegsave_buffer(
        ['strip' => true, 'Q' => 80, 'interlace' => true]
    );
}

And ran it on 10,000 jpg images:

$ mkdir samples
$ for i in {1..10000}; do cp ~/pics/k2.jpg samples/$i.jpg; done
$ php soak.php samples/*
...

And in another window watched memory with:

$ for i in {1..10000}; do ps aux | grep php | grep -v grep | cut -c 1-75; sleep 1; done
...
john     16680  188  3.2 1115664 254688 pts/5  Sl+  17:23  48:14 php soak.p
john     16680  188  3.2 1115664 254664 pts/5  Sl+  17:23  48:16 php soak.p

The numbers wobbled up and down a bit, but finished on 254MB -- the max was 260 or so. I think you're seeing expected behaviour.
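
The resident-set figure that ps prints (the sixth column, in KiB) can also be read from inside the PHP process on Linux. This is a minimal sketch that assumes /proc/self/status is readable and uses the usual "VmRSS: N kB" line; parseVmRss() and residentSetKiB() are hypothetical helper names.

```php
<?php
// Hypothetical helper: extract the VmRSS value (in KiB) from the text of
// /proc/<pid>/status. Returns null when the line is missing.
function parseVmRss(string $status): ?int
{
    if (preg_match('/^VmRSS:\s+(\d+)\s+kB$/m', $status, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Resident set size of the current process, or null when /proc is not
// available (anything other than Linux).
function residentSetKiB(): ?int
{
    $status = @file_get_contents('/proc/self/status');
    return $status === false ? null : parseVmRss($status);
}

$rss = residentSetKiB();
echo $rss === null
    ? "VmRSS not available\n"
    : sprintf("RSS: %.1f MiB\n", $rss / 1024);
```

Unlike memory_get_usage(), this figure includes memory allocated by libvips, so it is the one to compare against the ps output.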

You can get memory use down in a few ways. First, you are opening the image for random access (the default), and in this mode, libvips will decompress the whole input image to memory. If you get rid of the free rotation, you can stream the images, like this:

foreach ($argv as $filename) {
    if (pathinfo($filename, PATHINFO_EXTENSION) !== 'jpg') {
        continue;
    }

    echo "loop " . $filename . " ... \n";

    $data_in = file_get_contents($filename);

    $image = Vips\Image::newFromBuffer(
        $data_in,
        '',
        ['access' => Vips\Access::SEQUENTIAL]
    );
    $image = $image->resize(
        mt_rand(1, 1000) / 1000,
        ['vscale' => mt_rand(1, 1000) / 1000]
    );
    //$image = $image->similarity(['angle' => mt_rand(0, 359)]);

    $data_out = $image->jpegsave_buffer(
        ['strip' => true, 'Q' => 80, 'interlace' => true]
    );
}

Now the image will be streamed from source to destination and never be completely decompressed. Memory use drops down quite a bit:

john     28963  169  1.1 811892 89360 pts/5    Sl+  20:01  23:39 php soak.p
john     28963  169  1.1 795500 89368 pts/5    R+   20:01  23:41 php soak.p

About 90MB.

If you use thumbnail_buffer instead (which is much faster and better quality than simple resize), like this:

    $image = Vips\Image::thumbnail_buffer(
        $data_in,
        mt_rand(1, 1000)
    );
    //$image = $image->similarity(['angle' => mt_rand(0, 359)]);

    $data_out = $image->jpegsave_buffer(
        ['strip' => true, 'Q' => 80, 'interlace' => true]
    );

It comes down to about 60MB. If you reduce the size of the threadpool (threading doesn't help that much with image resizing) like this:

$ VIPS_CONCURRENCY=1 php soak.php samples/*

It comes down to about 45MB.
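
When the script runs under php-fpm rather than from a shell, the same limit could be applied from PHP. This sketch assumes your php-vips-ext build provides vips_concurrency_set(), which is worth checking against the installed version; limitVipsThreads() is a hypothetical wrapper that does nothing if the function is missing.

```php
<?php
// Sketch: assumes php-vips-ext provides vips_concurrency_set(); check your
// installed version. limitVipsThreads() is a hypothetical wrapper.
function limitVipsThreads(int $threads): bool
{
    if (!function_exists('vips_concurrency_set')) {
        return false; // extension missing or too old: nothing was changed
    }
    vips_concurrency_set($threads);
    return true;
}

// Call this early in the request, before any images are processed.
if (!limitVipsThreads(1)) {
    error_log('vips_concurrency_set() not available; thread pool size unchanged');
}
```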

@jcupitt
Member

jcupitt commented Dec 1, 2017

Oh, I tried like this as well:

vips_cache_set_max(0);

$data_in = file_get_contents($argv[1]);

for ($i = 0; $i < 10000; $i++) {
    echo "loop " . $i . " ... \n";

    $image = Vips\Image::thumbnail_buffer(
        $data_in,
        mt_rand(1, 1000)
    );
    //$image = $image->similarity(['angle' => mt_rand(0, 359)]);

    $data_out = $image->jpegsave_buffer(
        ['strip' => true, 'Q' => 80, 'interlace' => true]
    );
}

And with a larger 10k x 10k pixel jpg I see a fairly steady 108MB.

libvips memory use scales with image width rather than number of pixels, so that should be a good saving over ImageMagick.
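
To put rough numbers on that, here is the size of a single fully decompressed copy of each test image. The sketch assumes 8-bit samples and three bands (RGB), which the thread does not state; decodedBytes() is a hypothetical helper.

```php
<?php
// Size of a fully decompressed image in bytes. Assumes 8-bit samples and
// three bands (RGB); the thread does not state the images' actual format.
function decodedBytes(int $width, int $height, int $bands = 3, int $bytesPerSample = 1): int
{
    return $width * $height * $bands * $bytesPerSample;
}

printf("2000 x 2000:   %.1f MiB\n", decodedBytes(2000, 2000) / 1048576);   // 11.4 MiB
printf("10000 x 10000: %.1f MiB\n", decodedBytes(10000, 10000) / 1048576); // 286.1 MiB
```

Under those assumptions, one decoded copy of the 10k x 10k image alone would be larger than the 108MB observed for the whole process when streaming.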
