HTTP connections - keep-alive / sessions #2056

Closed
iliakarmanov opened this issue Mar 4, 2016 · 8 comments

@iliakarmanov

I notice that when I submit HTTP requests to the graphhopper server I am able to get 2000 a second using an HTTPConnectionPool and multiprocessing in Python. However, if I use the exact same code but connect to the OSRM server, for some reason the requests per second become extremely slow (perhaps 10 a second):

import json
from multiprocessing import Pool, cpu_count
from urllib3 import HTTPConnectionPool  # connection pool shared by the workers

conn_pool = HTTPConnectionPool(host='localhost', port=5000, maxsize=cpu_count())

def ReqOsrm(data):
    url, qid = data
    try:
        response = conn_pool.request('GET', url)
        json_geocode = json.loads(response.data.decode('utf-8'))
        ...

# Run:
url_routes = CreateUrls(routes_csv)
pool = Pool(cpu_count())
calc_routes = pool.map(ReqOsrm, url_routes)

I'm not too sure why this may be. If I don't use a connection pool then I run out of ports, as it's very fast and creates a new TCP connection for each request.

This leads me to believe it has something to do with the server binary (osrm-routed) itself.

@danpat
Member

danpat commented Mar 4, 2016

osrm-routed doesn't support keep-alive, so each request happens over a new connection and incurs connection-setup overhead.

It sounds like it takes some time for your connection-pool code to add/remove connections from the pool.
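
You can actually observe this from the client side. A rough sketch (the query path below is only a placeholder and depends on your OSRM version and dataset, and it assumes urllib3): fire a few requests through a pool of size one and watch how many TCP connections it has to open:

import urllib3

pool = urllib3.HTTPConnectionPool(host='localhost', port=5000, maxsize=1)

# placeholder query; substitute a real viaroute/route request for your dataset
url = '/viaroute?loc=52.5,13.4&loc=52.5,13.5'

for _ in range(5):
    r = pool.request('GET', url)
    print(r.status, r.headers.get('Connection'))

# without keep-alive the pool opens a fresh connection per request,
# so num_connections tracks num_requests instead of staying at 1
print(pool.num_connections, pool.num_requests)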

@iliakarmanov
Author

Thank you danpat! That's very useful to know as it was driving me mad (I thought I wasn't establishing a session correctly). Sticking with Python, would you be able to recommend a way of issuing HTTP GET requests? E.g. asyncio, requests-futures, gevent, grequests, threading, multiprocessing, etc.

@danpat
Member

danpat commented Mar 4, 2016

@iliakarmanov If you're aiming for maximum performance, you can skip the HTTP overhead and use libosrm.a directly in-process. There's an example in C++ here:

https://github.com/Project-OSRM/osrm-backend/tree/develop/example

although it is not multi-threaded, and each request is processed single-threaded.

If that's not an option, then you'll just have to experiment with the various frameworks - whatever keeps your per-request overhead to a minimum. I suspect you'll get pretty good results with multiprocessing and a worker queue, but it depends on what work you need to do with each response.

Each HTTP response should only take a few ms (roughly 4-10 ms per request).
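
Something along these lines might be a starting point for the worker-queue approach (just a sketch, not tested against your data: the URL, host/port and input list are placeholders, and error handling is left out):

import json
from multiprocessing import Process, Queue, cpu_count

import urllib3

def worker(task_queue, result_queue):
    # each worker opens its own connection pool after the fork,
    # so sockets are never shared between processes
    pool = urllib3.HTTPConnectionPool(host='localhost', port=5000, maxsize=1)
    while True:
        task = task_queue.get()
        if task is None:               # sentinel: no more work
            break
        qid, url = task
        response = pool.request('GET', url)
        result_queue.put((qid, json.loads(response.data.decode('utf-8'))))

if __name__ == '__main__':
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results))
               for _ in range(cpu_count())]
    for w in workers:
        w.start()

    # placeholder input; in practice this would come from something like CreateUrls(routes_csv)
    jobs = [(0, '/viaroute?loc=52.5,13.4&loc=52.5,13.5')]
    for job in jobs:
        tasks.put(job)
    for _ in workers:
        tasks.put(None)

    answers = [results.get() for _ in range(len(jobs))]
    for w in workers:
        w.join()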

@iliakarmanov
Author

To be honest I've never touched C, but if it's just a case of inserting from-to coordinates line by line (e.g. generating the source as a string in Python) I may experiment. The example you posted is great; ideally I would have a loop over all the OD pairs given, but perhaps I can make something work with a bit of play!

On your last point: multiprocessing and a worker queue is similar to something I've already tried (basically my first post, no?), and it was too fast for HTTP (I was running out of sockets). So I guess I can't push it too fast as long as I'm using HTTP; however, if I move away from HTTP then I'm no longer limited by the lack of keep-alive. Does that sound about right? Thanks again for all your help!

@TheMarex
Member

TheMarex commented Mar 5, 2016

@iliakarmanov if you are more comfortable with node, try something like:

> npm install osrm async
> node
var async = require('async');
var OSRM = require('osrm');
var osrm = new OSRM("dataset.osrm");
var start_destinations = [[[lat, lon], [lat, lon]], ....];
async.map(start_destinations, osrm.route, function(err, results) {
    console.log(results.map(function(r) { return r.route_summary.total_time; }));
});

Note this example works with 4.9.1 and below.

@iliakarmanov
Author

Thanks to you both for the suggestions. I am more comfortable with JS. However, it would be nice to incorporate keep-alive in the future (if possible).

@TheMarex
Member

TheMarex commented Mar 8, 2016

Keep-alive is probably not going to be supported in the C++ version of the server; maybe once we move the code to node. Thanks for the update! 👍

@TheMarex TheMarex closed this as completed Mar 8, 2016
@daniel-j-h
Member

For the record, @systemed asked on IRC for libosrm integration into other HTTP servers. I wanted to look into Microsoft's Casablanca and Facebook's Proxygen anyway, to learn about their architecture and API.

I started an experimental Casablanca integration here: https://github.com/daniel-j-h/libosrm-http-casablanca

which is quite small and limited. As of writing, it only accepts route requests and only returns distance and duration as JSON. I haven't had the time and/or use case to work on it more. What it already gives you is a high-performance, concurrent HTTP server with keep-alive and all that on top of libosrm.

Maybe it helps as an example.
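
From a Python client, a server like that is where keep-alive pays off, because a single session can reuse one TCP connection for all requests. A minimal sketch (the endpoint, path and parameter names below are made up for illustration; they are not the actual API of that repository):

import requests

# hypothetical endpoint of a keep-alive capable libosrm HTTP server;
# path and parameter names are placeholders, not the real API
URL = 'http://localhost:8080/route'

session = requests.Session()  # one TCP connection, reused across requests
pairs = [((52.50, 13.40), (52.51, 13.45))]  # placeholder origin/destination pairs

for (from_lat, from_lon), (to_lat, to_lon) in pairs:
    r = session.get(URL, params={'from': '%f,%f' % (from_lat, from_lon),
                                 'to': '%f,%f' % (to_lat, to_lon)})
    print(r.json())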

@akashihi akashihi mentioned this issue Aug 19, 2019