HTTP connections - keep-alive / sessions #2056

Closed
iliakarmanov opened this issue Mar 4, 2016 · 8 comments

@iliakarmanov

I notice that when I submit HTTP requests to the graphhopper server I am able to get 2000 a second using an HTTPConnectionPool and multiprocessing in Python. However, if I use the exact same code but connect to the OSRM server, for some reason the requests per second become extremely slow (perhaps 10 a second):

import json
from multiprocessing import Pool, cpu_count
from urllib3 import HTTPConnectionPool  # connection pool shared by the workers

conn_pool = HTTPConnectionPool(host='localhost', port=5000, maxsize=cpu_count())

def ReqOsrm(data):
    url, qid = data
    try:
        response = conn_pool.request('GET', url)
        json_geocode = json.loads(response.data.decode('utf-8'))
        ...

# Run:
url_routes = CreateUrls(routes_csv)
pool = Pool(cpu_count())
calc_routes = pool.map(ReqOsrm, url_routes)

I'm not too sure why this may be. If I don't use a connection pool then I run out of ports, as it's very fast and creates a new TCP connection for each request.

This leads me to believe it has something to do with the server binary (osrm-routed) itself.

@danpat
Member

danpat commented Mar 4, 2016

osrm-routed doesn't support keep-alive, so each request happens over a new connection and incurs connection-setup overhead.

It sounds like it takes some time for your connection-pool code to add/remove connections from the pool.
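
You can actually observe this from the client side. A rough sketch (the query path below is only a placeholder and depends on your OSRM version and dataset, and it assumes urllib3): fire a few requests through a pool of size one and watch how many TCP connections it has to open:

import urllib3

pool = urllib3.HTTPConnectionPool(host='localhost', port=5000, maxsize=1)

# placeholder query; substitute a real viaroute/route request for your dataset
url = '/viaroute?loc=52.5,13.4&loc=52.5,13.5'

for _ in range(5):
    r = pool.request('GET', url)
    print(r.status, r.headers.get('Connection'))

# without keep-alive the pool opens a fresh connection per request,
# so num_connections tracks num_requests instead of staying at 1
print(pool.num_connections, pool.num_requests)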

@iliakarmanov
Author

Thank you danpat! That's very useful to know as it was driving me mad (I thought I wasn't establishing a session correctly). Sticking with Python, would you be able to recommend a way of issuing HTTP GET requests? E.g. asyncio, requests-futures, gevent, grequests, threading, multiprocessing, etc.

@danpat
Member

danpat commented Mar 4, 2016

@iliakarmanov If you're aiming for maximum performance, you can skip the HTTP overhead and use libosrm.a directly in-process. There's an example in C++ here:

https://github.com/Project-OSRM/osrm-backend/tree/develop/example

although it is not multi-threaded, and each request is processed single-threaded.

If that's not an option, then you'll just have to experiment with the various frameworks - whatever keeps your per-request overhead to a minimum. I suspect you'll get pretty good results with multiprocessing and a worker queue, but it depends on what work you need to do with each response.

Each HTTP response should only take a few ms (roughly 4-10 ms per request).
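
Something along these lines might be a starting point for the worker-queue approach (just a sketch, not tested against your data: the URL, host/port and input list are placeholders, and error handling is left out):

import json
from multiprocessing import Process, Queue, cpu_count

import urllib3

def worker(task_queue, result_queue):
    # each worker opens its own connection pool after the fork,
    # so sockets are never shared between processes
    pool = urllib3.HTTPConnectionPool(host='localhost', port=5000, maxsize=1)
    while True:
        task = task_queue.get()
        if task is None:               # sentinel: no more work
            break
        qid, url = task
        response = pool.request('GET', url)
        result_queue.put((qid, json.loads(response.data.decode('utf-8'))))

if __name__ == '__main__':
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results))
               for _ in range(cpu_count())]
    for w in workers:
        w.start()

    # placeholder input; in practice this would come from something like CreateUrls(routes_csv)
    jobs = [(0, '/viaroute?loc=52.5,13.4&loc=52.5,13.5')]
    for job in jobs:
        tasks.put(job)
    for _ in workers:
        tasks.put(None)

    answers = [results.get() for _ in range(len(jobs))]
    for w in workers:
        w.join()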

@iliakarmanov
Author

To be honest I've never touched C, but if it's just a case of inserting from-to coordinates line by line (e.g. generating the source as a string in Python) I may experiment. The example you posted is great; ideally I would have a loop over all the OD pairs given, but perhaps I can make something work with a bit of play!

On your last point: multiprocessing and a worker queue is similar to something I've already tried (basically my first post, no?), and it was too fast for HTTP (I was running out of sockets). So I guess I can't push it too fast as long as I'm using HTTP; however, if I move away from HTTP then I'm no longer limited by the lack of keep-alive. Does that sound about right? Thanks again for all your help!

@TheMarex
Member

TheMarex commented Mar 5, 2016

@iliakarmanov if you are more comfortable with node, try something like:

> npm install osrm async
> node
var async = require('async');
var OSRM = require('osrm');
var osrm = new OSRM("dataset.osrm");
var start_destinations = [[[lat, lon], [lat, lon]], ....];
async.map(start_destinations, osrm.route, function(err, results) {
    console.log(results.map(function(r) { return r.route_summary.total_time; }));
});

Note this example works with 4.9.1 and below.

@iliakarmanov
Author

Thanks to you both for the suggestions. I am more comfortable with JS. However, it would be nice to incorporate keep-alive in the future (if possible).

@TheMarex
Member

TheMarex commented Mar 8, 2016

Keep-alive is probably not going to be supported in the C++ version of the server; maybe once we move the code to node. Thanks for the update! 👍

@TheMarex TheMarex closed this as completed Mar 8, 2016
@daniel-j-h
Member

For the record, @systemed asked on IRC for libosrm integration into other HTTP servers. I wanted to look into Microsoft's Casablanca and Facebook's Proxygen anyway, to learn about their architecture and API.

I started an experimental Casablanca integration here: https://github.com/daniel-j-h/libosrm-http-casablanca

which is quite small and limited. As of writing, it only accepts route requests and only returns distance and duration as JSON. I haven't had the time and/or use case to work on it more. What it already gives you is a high-performance, concurrent HTTP server with keep-alive and all that on top of libosrm.

Maybe it helps as an example.
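
From a Python client, a server like that is where keep-alive pays off, because a single session can reuse one TCP connection for all requests. A minimal sketch (the endpoint, path and parameter names below are made up for illustration; they are not the actual API of that repository):

import requests

# hypothetical endpoint of a keep-alive capable libosrm HTTP server;
# path and parameter names are placeholders, not the real API
URL = 'http://localhost:8080/route'

session = requests.Session()  # one TCP connection, reused across requests
pairs = [((52.50, 13.40), (52.51, 13.45))]  # placeholder origin/destination pairs

for (from_lat, from_lon), (to_lat, to_lon) in pairs:
    r = session.get(URL, params={'from': '%f,%f' % (from_lat, from_lon),
                                 'to': '%f,%f' % (to_lat, to_lon)})
    print(r.json())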

@akashihi akashihi mentioned this issue Aug 19, 2019