Implement a model serving framework #1873

As @tqchen suggested in soumith/convnet-benchmarks#101 (comment), MXNet should implement a model serving framework to compete with https://github.com/tensorflow/serving.

Comments
@futurely @piiswrong a straightforward way to deploy mxnet models into production environments would indeed be highly welcome.
@futurely @revilokeb @piiswrong Hey guys,
No one is doing it yet. An easy solution is to use AWS Lambda, but it doesn't support GPUs and doesn't do batching. You are welcome to work on it. Please propose a design and we can discuss it.
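For context, a Lambda-based deployment would look roughly like the sketch below: a handler that loads a checkpoint once per container and answers prediction requests on CPU (since, as noted, Lambda offers no GPUs). The checkpoint prefix, input shape, and request format are all assumptions for illustration, not a fixed API.

```python
# Hypothetical AWS Lambda handler serving an MXNet image model on CPU.
# Assumes a checkpoint (model-symbol.json, model-0000.params) is bundled
# with the function; names and shapes here are illustrative.
import json
import numpy as np
import mxnet as mx

MODEL_PREFIX = 'model'          # assumed checkpoint prefix
INPUT_SHAPE = (1, 3, 224, 224)  # assumed input shape

# Load once per container so warm invocations skip model setup.
sym, arg_params, aux_params = mx.model.load_checkpoint(MODEL_PREFIX, 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(data_shapes=[('data', INPUT_SHAPE)], for_training=False)
mod.set_params(arg_params, aux_params)

def handler(event, context):
    # Expect a flattened float array in the JSON request body.
    data = np.asarray(json.loads(event['body']), dtype=np.float32)
    batch = mx.io.DataBatch([mx.nd.array(data.reshape(INPUT_SHAPE))])
    mod.forward(batch)
    probs = mod.get_outputs()[0].asnumpy()[0]
    return {'statusCode': 200,
            'body': json.dumps({'top1': int(np.argmax(probs))})}
```

Note that without batching, each invocation runs a single forward pass; concurrent requests would each pay the full per-sample cost, which is one of the limitations mentioned above.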
@jordan-green you may be interested in opening an issue for mxnet prediction support with https://github.com/beniz/deepdetect, as it already supports Caffe, XGBoost, and TensorFlow. It may not happen immediately, though I believe it wouldn't be too difficult. If you can help a bit with it, even better: it will happen faster.
Excited to see this!! |
Hi all, my current gut feeling is that this piece of functionality may be best provided as a standalone project, under a compatible and permissive license (most likely Apache), so as to benefit other frameworks as well. It would seem that outside of TF Serving, there's not a lot out there. DeepDetect looks interesting @beniz, however it appears to be under the GPL license: can you please confirm?

Lambda / OpenWhisk

Lambda would almost certainly be a great option if it had GPU support, and Amazon will almost certainly provide this in the near future, whether via a different class of Lambda or via their new Elastic GPU offering (which may be slightly less suited here than the former). This is of course not an open source solution, and as such may not be ideal. This had me thinking about other options for implementing a simple, serverless method for hosting inference models, and I think OpenWhisk may suit here.

GPU Compatibility

I can't find validation that OpenWhisk works on GPUs; however, its generic action invocation appears to run an arbitrary binary via Alpine Linux, which I've used with CUDA in the past with some success. I'll spin up an OpenWhisk VM on my GPU box and report back as to whether or not GPUs are accessible; it's not immediately obvious to me why they shouldn't be.

Simplification

From there, I think making use of the amalgamation script(s) within MXNet to provide a simple 'runnable' object may be a good approach to giving users a simple deployment process. This will obviously need performance testing.

MXNet Integration

I think this could prove to be a powerful tool for many ML frameworks, with MXNet serving as the foundation in places. Perhaps this would best be its own project/repository, mirrored within and closely integrating with MXNet? Thoughts on this are much appreciated.

Please let me know your thoughts, and once I've validated some of the moving pieces, particularly GPU support on OpenWhisk, I'll knock together a design proposal for further discussion.
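To make the OpenWhisk idea above concrete, here is a minimal sketch of what a Python action wrapping MXNet inference might look like. Everything here is an assumption for illustration: the checkpoint prefix, the input shape, and the availability of mxnet inside the action's container image. OpenWhisk Python actions expose a `main(params)` entry point that receives the invocation parameters as a dict and returns a dict.

```python
# Sketch of an Apache OpenWhisk Python action wrapping MXNet inference.
# Assumes the action runs in a custom container image bundling mxnet and
# the checkpoint files; GPU availability inside OpenWhisk is unverified
# (per the discussion above), so this falls back to CPU.
import numpy as np
import mxnet as mx

INPUT_SHAPE = (1, 3, 224, 224)  # assumed input shape
_mod = None  # cached across warm invocations of the same action container

def _load(prefix='model'):
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, 0)
    mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
    mod.bind(data_shapes=[('data', INPUT_SHAPE)], for_training=False)
    mod.set_params(arg_params, aux_params)
    return mod

def main(params):
    # OpenWhisk invokes main() with the action's JSON parameters as a dict.
    global _mod
    if _mod is None:
        _mod = _load()
    data = np.asarray(params['input'], dtype=np.float32).reshape(INPUT_SHAPE)
    _mod.forward(mx.io.DataBatch([mx.nd.array(data)]))
    probs = _mod.get_outputs()[0].asnumpy()[0]
    return {'top1': int(np.argmax(probs))}
```

Caching the module in a global relies on OpenWhisk reusing warm containers between invocations; a cold start would still pay the full model-load cost, which is worth measuring in the performance testing mentioned above.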
DD is under the LGPL; please see https://github.com/beniz/deepdetect/blob/master/COPYING.
@yuruofeifei and I are working on MXNet model serving. It's still at an early stage. In the current phase, it creates an HTTP endpoint and lets developers fully customize their preprocessing and post-processing functions for inference. More powerful features will be added in future stages.
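As a rough illustration of that design (not the actual code of the project mentioned above), a minimal HTTP endpoint with swappable preprocess/postprocess callables might look like the Flask sketch below; the model files, shapes, and function names are all hypothetical.

```python
# Illustrative sketch of an HTTP inference endpoint where preprocessing and
# post-processing are plain Python callables the developer can replace.
# Names and shapes are assumptions, not the serving project's real API.
import json
import numpy as np
import mxnet as mx
from flask import Flask, request, jsonify

app = Flask(__name__)
INPUT_SHAPE = (1, 3, 224, 224)  # assumed input shape

def default_preprocess(raw_body):
    # Turn the raw request body (a JSON float array) into an NDArray batch.
    arr = np.asarray(json.loads(raw_body), dtype=np.float32).reshape(INPUT_SHAPE)
    return mx.io.DataBatch([mx.nd.array(arr)])

def default_postprocess(outputs):
    # Map raw softmax scores to a JSON-friendly result.
    probs = outputs[0].asnumpy()[0]
    return {'top1': int(np.argmax(probs)), 'prob': float(probs.max())}

# Load the (assumed) checkpoint once at startup.
sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(data_shapes=[('data', INPUT_SHAPE)], for_training=False)
mod.set_params(arg_params, aux_params)

@app.route('/predict', methods=['POST'])
def predict():
    batch = default_preprocess(request.get_data(as_text=True))
    mod.forward(batch)
    return jsonify(default_postprocess(mod.get_outputs()))

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
```

Swapping `default_preprocess` / `default_postprocess` for user-supplied functions is the customization point the comment above describes.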