Memory leakage after mapnik map.render ? #1004

Open
JaylanChen opened this issue Nov 14, 2024 · 15 comments
@JaylanChen

os: ubuntu:22.04
nodejs: v22
@mapnik/mapnik: v4.6.5
pm2 v5.4.2

Tested against a PostgreSQL (PostGIS) table: 182104 rows, geometry type multipolygon, 353MB.

When generating a thumbnail with the following code, the service's memory usage increases sharply and is not released; it keeps growing with each request.

map.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Map srs="+proj=longlat +ellps=GRS80 +no_defs +type=crs">
  <Style name="default">
    <Rule>
      <PolygonSymbolizer fill="#d5f001" fill-opacity="1"/>
      <LineSymbolizer stroke="rgb(76, 161, 45)" stroke-opacity="1" stroke-width="0.5" stroke-dasharray="5,0"/>
      <MaxScaleDenominator>279541132.0782576</MaxScaleDenominator>
      <MinScaleDenominator>33.303899743746626</MinScaleDenominator>
    </Rule>
  </Style>
  <Layer name="default" srs="+proj=longlat +datum=WGS84 +no_defs +type=crs">
    <StyleName>default</StyleName>
    <Datasource>
      <Parameter name="type">postgis</Parameter>
      <Parameter name="application_name">map-node</Parameter>
      <Parameter name="connect_timeout">30</Parameter>
      <Parameter name="host">127.0.0.1</Parameter>
      <Parameter name="port">5432</Parameter>
      <Parameter name="dbname">test</Parameter>
      <Parameter name="user">postgres</Parameter>
      <Parameter name="password">postgres</Parameter>
      <Parameter name="table">(select * from "geodata"."test_py") as mztable</Parameter>
    </Datasource>
  </Layer>
</Map>

thumbnail generation code

  const mapnik = require("@mapnik/mapnik");
  const fs = require("fs");
  const path = require("path"); // needed for path.resolve below

  mapnik.register_default_fonts();
  mapnik.register_default_input_plugins();

  const map = new mapnik.Map(256, 256);
  const xmlPath = path.resolve("map.xml");
  map.load(xmlPath, function (err, map) {
    if (err) throw err;
    map.zoomAll();
    const im = new mapnik.Image(256, 256);
    map.render(im, function (err, im) {
      if (err) throw err;
      im.encode("png", function (err, buffer) {
        if (err) throw err;
        fs.writeFileSync("map.png", buffer);
      });
    });
  });
  1. Start the service with PM2
    http://localhost:3000/
    MEM: 84MB
    image

  2. Generate the map thumbnail
    http://localhost:3000/generate
    MEM: 84MB → 391MB
    image
    image
    image

    After waiting 5 minutes, MEM: 391MB
    image

  3. Generate the map thumbnail again
    http://localhost:3000/generate
    MEM: 391MB → 690MB
    image
    image
    image
    image

Test code (without the pg table data).
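
The MEM figures above are whole-process numbers, so they cannot show whether the growth sits in the V8 heap or in native allocations. A minimal sketch, assuming only Node's built-in `process.memoryUsage()`, that records `rss`, `heapUsed` and `external` before and after a render. The helper names `memorySnapshot` and `memoryDelta` are hypothetical and not part of the test app:

```javascript
// Convert bytes to megabytes with one decimal place.
const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Snapshot of the counters Node.js exposes for the current process:
// rss = whole process, heapUsed = V8 heap, external = C++ memory bound to JS objects.
function memorySnapshot() {
  const { rss, heapUsed, external } = process.memoryUsage();
  return { rss: toMB(rss), heapUsed: toMB(heapUsed), external: toMB(external) };
}

// Difference between two snapshots, field by field (in MB).
function memoryDelta(before, after) {
  const delta = {};
  for (const key of Object.keys(before)) {
    delta[key] = Math.round((after[key] - before[key]) * 10) / 10;
  }
  return delta;
}

const before = memorySnapshot();
// ... run the render code here (hypothetical placement) ...
const after = memorySnapshot();
console.log("before:", before, "after:", after, "delta:", memoryDelta(before, after));
```

If `rss` grows while `heapUsed` stays flat, that would suggest the memory is held outside the JavaScript heap, for example by native code or the allocator, rather than by JavaScript objects.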

@artemp artemp self-assigned this Nov 14, 2024
@artemp
Member

artemp commented Nov 14, 2024

@JaylanChen - thanks for testing! To try to narrow down the issue, could you try replacing

im.encode("png", function (err, buffer) {
        fs.writeFileSync("map.png", buffer);
      });

with something like

im.save("map.png", function (err) {
        if (err) throw err;
      });

and let me know if the memory leak persists?

@JaylanChen
Author

test code

  map.load(xmlPath, function (err, map) {
    console.log(err);
    map.zoomAll();
    const im = new mapnik.Image(256, 256);
    map.render(im, function (err, im) {
      if (err) throw err;
      const imagePath = path.resolve("test/map.png");
      im.save(imagePath, function (err) {
        if (err) throw err;
      });
    });
  });

image
image
image

test again

image

memory leak persists.

@JaylanChen
Author

JaylanChen commented Nov 15, 2024

test gdal with natural_earth.tif

<Map srs="+proj=longlat +ellps=GRS80 +no_defs +type=crs">
    <Style name="raster">
        <Rule>
            <RasterSymbolizer/>
        </Rule>
    </Style>
    <Layer name="layer" srs="+proj=longlat +ellps=GRS80 +no_defs">
        <StyleName>raster</StyleName>
        <Datasource>
            <Parameter name="type">gdal</Parameter>
            <Parameter name="file">data/natural_earth.tif</Parameter>
        </Datasource>
    </Layer>
</Map>

service started:
image

http://localhost:3000/gengdal
once:
image
twice:
image

memory leak persists.

also test shp
http://localhost:3000/genshp

With the shapefile, memory does not leak, or at least not noticeably.

test code is updated (with tif and shp).

@artemp
Member

artemp commented Nov 15, 2024

@JaylanChen - Thanks for the great self-contained test cases 👍. I had a chance to run your app. I'm not yet convinced there is a memory leak, though. I'm seeing high but stable memory usage with http://localhost:3000/gengdal, and GC is reclaiming memory over time.
image

So far I only tested on macOS. I'll try running on Linux as well.

But I noticed your code can (should!) be improved. You're creating a new mapnik.Map and calling load(<xml>) per request. This is extremely inefficient and drains OS resources (db connections, file descriptors, etc.). To make your app scalable and improve performance, you should create a bunch of Map objects, load the XML once for each, and re-use them. You can achieve this with an object pool.
HTH.
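
To illustrate the object-pool idea, here is a minimal, dependency-free sketch in which a bounded number of objects is created once and handed out to requests. `SimplePool`, `handleRequest` and the stand-in factory are hypothetical names for illustration only. A real app would create a mapnik.Map and load the XML inside the factory, or use a pooling library instead:

```javascript
// Minimal promise-based object pool (illustrative; not the API of any library).
class SimplePool {
  constructor(create, max) {
    this.create = create; // factory, e.g. () => a loaded mapnik.Map
    this.max = max;       // upper bound on live resources
    this.idle = [];       // resources ready for reuse
    this.waiters = [];    // resolve callbacks of queued acquire() calls
    this.size = 0;        // resources created so far
  }

  acquire() {
    if (this.idle.length > 0) return Promise.resolve(this.idle.pop());
    if (this.size < this.max) {
      this.size += 1;
      return Promise.resolve(this.create());
    }
    // At capacity: wait until release() hands a resource over.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(resource) {
    const next = this.waiters.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }
}

// Hypothetical stand-in for "new mapnik.Map(...) + load XML".
let created = 0;
const pool = new SimplePool(() => ({ id: ++created }), 2);

async function handleRequest() {
  const map = await pool.acquire();
  try {
    return map.id; // render with the map here
  } finally {
    pool.release(map); // always return the map, even if rendering throws
  }
}

Promise.all([1, 2, 3, 4, 5].map(() => handleRequest())).then((ids) => {
  console.log(`served ${ids.length} requests with ${created} map objects`);
});
```

The `max` bound is the design point that matters here: it caps how many expensive objects (and their db connections) exist at once, and extra requests wait instead of creating more.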

@JaylanChen
Author

Thank you for your advice. In my real service I do use a pool to cache mapnik Map instances.

This code is only for testing.

@JaylanChen
Author

I also tested it in the Linux ARM64 environment.
Server: Ubuntu-22.04
Node: v20.18.0

The initial memory is about 67mb (measured 2 minutes after startup).
image

curl http://localhost:3000/gengdal
MEM: 67mb → 80.6mb
image

After waiting a few minutes, MEM: 80.8mb
image

curl http://localhost:3000/gengdal again.
MEM: 84.6mb → 92.3mb
image

After waiting a few minutes, MEM: 92.0mb
image

@artemp
Member

artemp commented Nov 22, 2024

@JaylanChen - I did some testing on both macOS and Linux (Ubuntu 24.04) and I'm seeing stable memory usage over time.

I made some modifications to your app.js and settings, see below.

I'm using OSM data and XML file (mapnik.xml) generated as per
https://github.com/gravitystorm/openstreetmap-carto
https://github.com/mapbox/carto

For load testing I'm using https://www.artillery.io/ e.g

DEBUG=http artillery run mapnik-load-test.yml

image

  • mapnik-load-test.yml
config:
   target: http://localhost:3000
   phases:
      - duration: 60
        arrivalRate: 1
        rampTo: 5
        maxVusers: 20
        name: Warm up
      - duration: 30m
        arrivalRate: 5
        maxVusers: 50
        name: Ramp up load

   processor: "./location_generator.js"

scenarios:
    - flow:
      - get:
          url: "/generate_map?easting={{ easting }}&northing={{ northing }}"
          beforeRequest: getRandomLocation
  • location_generator.js
const fs = require('fs');
const mapnik = require("@mapnik/mapnik");
const json = JSON.parse(fs.readFileSync('./data/Trees.geojson', 'utf8'));
const num_features = Object.keys(json.features).length;

const tr = new mapnik.ProjTransform(new mapnik.Projection("epsg:4326"),
                                    new mapnik.Projection("epsg:3857"));

const generateRandomKey = (length) => Math.floor(Math.random() * length);

const getRandomLocation= (requestParams, context, ee, next) => {
  var key = generateRandomKey(num_features);
  const lon = json.features[key].geometry.coordinates[0];
  const lat = json.features[key].geometry.coordinates[1];
  const coord = tr.forward([lon, lat]);
  context.vars.easting = Math.floor(coord[0]);
  context.vars.northing = Math.floor(coord[1]);
  next();
};

module.exports = {
  getRandomLocation,
};
  • pm2.config.js
module.exports = {
  apps: [
    {
      name: 'mapnik-test',
      script: './src/app.js',
      instances : "4",
      exec_mode : "cluster"
    },
  ],
};
  • app.js
const express = require("express");
const mapnik = require("@mapnik/mapnik");
const path = require("path");
const fs = require("fs");
const genericPool = require("generic-pool");
const { pid } = require('node:process');

mapnik.register_default_fonts();
mapnik.register_default_input_plugins();

const app = express();
const port = 3000;

const map_factory = {
  create: function() {
    const map = new mapnik.Map(4*256, 4*256);
    const xmlPath = path.resolve("../openstreetmap-carto/mapnik.xml");
    //const xmlPath = path.resolve("test/gdal_map.xml");
    map.loadSync(xmlPath);
    console.log(`--> Load Map pid:${pid} xml:${xmlPath}`);
    return map;
  },
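  destroy: function(map) {
    // Nothing to free explicitly: `delete map` is a no-op on a function
    // parameter. Dropping the reference lets GC reclaim the Map.
    console.log(`<-- Destroy Map ${pid}`);
  }
};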

const opts = {
  max: 6,
  min: 2
};

const pool = genericPool.createPool(map_factory, opts);

app.get("/", (req, res) => {
  res.send("Hello World, express & mapnik!");
});

app.get("/generate_map", (req, res) => {
  pool.acquire()
    .then(function(map) {
      var easting = +req.query.easting;
      var northing = +req.query.northing;
      var bbox = [easting - 1000, northing - 1000, easting + 1000, northing + 1000];
      // Return the map to the pool and report the error instead of throwing
      // from inside the async callbacks, which would leak the pooled map.
      const fail = function(err) {
        pool.release(map);
        res.status(500).send(`FAIL:${err}`);
      };
      map.zoomToBox(bbox);
      const im = new mapnik.Image(4*256,4*256);
      map.render(im, function(err, im) {
        if (err) return fail(err);
        im.encode('png256', function (err, buffer) {
          if (err) return fail(err);
          res.type('png');
          res.send(buffer);
          console.log(`==> pid:${pid} req(x:${easting} y:${northing}) res(size:${buffer.length})`);
          pool.release(map);
        });
      });
    })
    .catch(function(err) {
      res.send(`FAIL:${err}`);
    });
});

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`);
});


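process.on('SIGINT', function() {
  pool.drain().then(function() {
    return pool.clear();
  }).then(function() {
    // The listening server would otherwise keep the process alive.
    process.exit(0);
  });
});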
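
For readers following the coordinate handling in location_generator.js and /generate_map: the load test converts a lon/lat point to EPSG:3857 and the handler renders a 2000 m square around it. A minimal sketch of that arithmetic, assuming the standard spherical Mercator formulas. The thread's code uses mapnik.ProjTransform for the conversion; the function names and the sample point below are hypothetical:

```javascript
const EARTH_RADIUS = 6378137; // WGS84 semi-major axis in metres, as used by EPSG:3857

// Forward spherical Mercator: lon/lat in degrees -> easting/northing in metres.
function lonLatToWebMercator(lon, lat) {
  const x = (EARTH_RADIUS * lon * Math.PI) / 180;
  const y = EARTH_RADIUS * Math.log(Math.tan(Math.PI / 4 + (lat * Math.PI) / 360));
  return [x, y];
}

// Square bounding box of half-width `halfSize` metres around a point,
// in [minx, miny, maxx, maxy] order as passed to zoomToBox in app.js.
function bboxAround(easting, northing, halfSize = 1000) {
  return [easting - halfSize, northing - halfSize, easting + halfSize, northing + halfSize];
}

const [sampleX, sampleY] = lonLatToWebMercator(-122.4194, 37.7749); // hypothetical sample point
console.log(bboxAround(Math.floor(sampleX), Math.floor(sampleY)));
```

Because the box has a fixed size in metres, every request covers roughly the same ground area, so per-request data volume depends mostly on how dense the data is at that location.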

@JaylanChen
Author

Generally, with a small amount of data (the amount covered by a single tile image), memory does not keep increasing. When the amount of data is relatively large (the threshold is not clear), the memory is not released.

I render a thumbnail of the entire map to reproduce the case where a single tile image contains a large amount of data.

I downloaded the osm us-pacific data and imported it into the pg db using osm2pgsql.

osm2pgsql -d osm-us-pacific -U postgres -P 5332 -H 192.168.1.121 -U postgres -W -C 25000 us-pacific-latest.osm.pbf

I updated the test code to use a pool of map instances. (This is similar to your test code and matches real use.)

After the service starts, it uses about 95mb of memory.
image

http://localhost:3000/genpg
The first time: 447mb
image
If requested again, the memory increases again.
the second time: 862mb
image
the third time: 928.9mb
the fourth time: 931.6mb
image

If there is no concurrency (the pool stays at its minimum of 2 instances), the memory stabilizes but is not released. With concurrency, e.g. 4 simultaneous requests, two more instances are created and the memory increases again (1.7gb).
image
image

A single instance takes up this much memory. Is something wrong?
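
One way to tell "grows without bound" apart from "grows to a high-water mark and stays there" is to check whether the most recent measurements still climb. A minimal sketch, applied to the four /genpg figures reported above. `hasPlateaued` is a hypothetical helper, and the window size and the 5 MB tolerance are arbitrary assumptions:

```javascript
// True when each of the last `window` steps grew by less than `toleranceMB`.
function hasPlateaued(samples, window = 3, toleranceMB = 5) {
  if (samples.length < window + 1) return false; // not enough data to judge
  const tail = samples.slice(-(window + 1));
  return tail.every((value, i) => i === 0 || value - tail[i - 1] < toleranceMB);
}

// Memory (MB) after the first, second, third and fourth request.
const genpgSamples = [447, 862, 928.9, 931.6];

console.log(hasPlateaued(genpgSamples, 1)); // only the last step (928.9 -> 931.6)
console.log(hasPlateaued(genpgSamples, 3)); // all three steps
```

Four samples are too few to draw a conclusion either way. A longer series under steady load would be needed before calling it a leak or a plateau.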

@JaylanChen
Author

If the table condition in the map XML is changed so that the data volume decreases, and the above test steps are repeated, the memory increase is also smaller.

<Parameter name="table"><![CDATA[(select * from planet_osm_polygon where way_area < 50) as t1]]></Parameter>

image

3 map instances. mem 119.7mb.

image

@artemp
Member

artemp commented Nov 25, 2024

@JaylanChen - I'm wondering if "forking" processes might be related to this issue. Could you try "cluster" mode to see if memory usage pattern changes? e.g

module.exports = {
  apps: [
    {
      name: 'mapnik-test',
      script: './src/app.js',
      instances : "4",
      exec_mode : "cluster"
    },
  ],
};

@JaylanChen
Author

JaylanChen commented Nov 26, 2024

full table data

Initial state memory
image

the first request, the node with id 1 is responsible for processing the request:
image
image

the second request, the node with id 3 is responsible for processing the request:
image
image

the third request, the node with id 1 is responsible for processing the request:
image
image

pm2 restart pm2.config.js
image

concurrent with 20 requests

image
image
image
image

Five mapnik instances were created on each node (two were already created when the pool was initialized). The memory of each node is about the same.
image
image

20 concurrent requests again. The map instances are reused, so the memory increase is smaller.
image
image

table data with filter (way_area < 50)

(select * from planet_osm_polygon where way_area < 50) as t1

3 separate requests.

image

pm2 restart pm2.config.js

concurrent with 20 requests

Because the amount of data is small, each request finishes quickly, and only 3 nodes created an additional instance.
The node with id 1 created no additional map instance and uses slightly less memory than the other nodes.

image
image
image
image
image

HTH.

@artemp
Member

artemp commented Nov 26, 2024

@JaylanChen - thanks for trying ^ I'm going to investigate memory usage/leaks further.

artemp added a commit that referenced this issue Nov 27, 2024
@artemp
Member

artemp commented Nov 27, 2024

@JaylanChen - It looks like defining NAPI_EXPERIMENTAL improves memory management of a running node process. I'm going to do more testing and, if everything is OK, I'll release a development package to try.

I was using the following script (it runs out of memory without NAPI_EXPERIMENTAL):

'use strict'
const mapnik  = require("@mapnik/mapnik");

for (var i = 0; i < 10000000; ++i)
{
  var im = new mapnik.Image(256, 256);
  if (i % 10000 == 0)
  {
    const memoryUsage = process.memoryUsage();
    console.log('Memory Usage:', memoryUsage);
    // if (global.gc) {
    //   global.gc();
    // } else {
    //    console.log('Garbage collection unavailable.  Pass --expose-gc '
    //                + 'when launching node to enable forced garbage collection.');
    // }
  }
}

ref -> nodejs/node-addon-api#1213

@JaylanChen
Copy link
Author

When the new development package is released, please let me know so I can help test and validate it.

@artemp
Member

artemp commented Nov 28, 2024

@JaylanChen => npm install @mapnik/[email protected]

Give it a try and let me know ^
