Support both Theano and Tensorflow backend #611
Comments
We discussed this as well when TensorFlow came out, and decided to wait until it's more mature and/or faster than Theano on GPU. One of the problems is that unlike Keras, Lasagne does not (and does not want to) abstract away from Theano, but accepts and returns Theano expressions. This makes it a bit harder to transparently switch backends. PS: I've seen that your "odin" project is a potpourri of Keras, Lasagne and your own (even including the README). Please make sure to adhere to the licenses of Keras and Lasagne, i.e., include their license texts where appropriate ("The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."). Thanks!
Thanks for the reminder; the fact is I just started the code and don't have a clear direction yet. Re-implementing all Lasagne layers to be compatible with both backends would be a huge amount of work, so I would prefer to integrate it directly into Lasagne if possible. Since TensorFlow and Theano expressions both represent a computational graph, it is also possible to manually manipulate the expressions from Keras networks, and I think this would be a perfect feature for Lasagne. However, it seems that the advantage of faster CPU execution does not outweigh the drawback of the huge amount of work right now.
I think a better strategy would be to create a sister library for TensorFlow, that provides largely the same API as Lasagne (where possible), but returns TensorFlow objects where Lasagne would return Theano objects. That would make it much easier for users and devs who only care about one or the other. Of course we'd need a few devs involved in both of them, to keep the APIs roughly in sync. But it would of course be fine if they didn't support exactly the same stuff. The main thing I would want to avoid is for it to slow things down. If we commit to supporting everything on both backends, suddenly implementing a feature is twice as much work. I think that would be a bad idea.
I think that would be too much work. What about only creating a wrapper that gives TensorFlow an interface similar to Theano's? Basically, all the code in Lasagne could be kept unchanged (though some operators like x**2 might have to be changed to T.pow(x, 2)). At the highest level of abstraction, from theano import tensor as T would be identical to from theano_tf import tensor as T. I haven't thought this one through, but I think it is highly possible. In that case, even if the TensorFlow wrapper messes up, it doesn't affect the Theano part anyway. By the way, TensorFlow just released a new version: it allows installed CUDA >= 7.0 and cuDNN >= R2, and adds support for cuDNN R4. It should be faster, but the key question is how much.
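To make the wrapper idea above concrete, here is a minimal, purely hypothetical sketch of what a "theano_tf" shim could look like. NumPy stands in for the TensorFlow ops so the sketch is self-contained; a real shim would delegate to the corresponding tf.* calls. The names `_TensorShim` and `tensor` are illustrative, not part of any existing library.

```python
# Hypothetical sketch: a module-level API mirroring theano.tensor, so
# `from theano_tf import tensor as T` could work unchanged. NumPy is a
# stand-in backend here; a real shim would call into TensorFlow.
import numpy as np


class _TensorShim:
    """Wraps a backend value and overloads operators like ** so that
    Lasagne code written as `x ** 2` keeps working without edits."""

    def __init__(self, value):
        self.value = np.asarray(value)

    def __pow__(self, exponent):
        # A real shim would call tf.pow(self.value, exponent) here.
        return _TensorShim(np.power(self.value, exponent))


class tensor:
    """Stands in for `theano.tensor`; only `pow` is sketched."""

    @staticmethod
    def pow(x, exponent):
        x = x if isinstance(x, _TensorShim) else _TensorShim(x)
        return x ** exponent


T = tensor
x = _TensorShim([1.0, 2.0, 3.0])
# Both spellings go through the same overload, so Lasagne code using
# either `x ** 2` or `T.pow(x, 2)` would behave identically.
assert np.allclose((x ** 2).value, T.pow(x, 2).value)
```

The point of the operator overload is exactly the `x**2` vs `T.pow(x, 2)` concern raised above: if the wrapped type implements `__pow__`, existing expressions need not be rewritten at all.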
Possibly relevant thread on r/machinelearning yesterday. I haven't looked at Keras' code myself in any detail, but some commenters felt that the dual backend made the source harder to understand. Personally I think it sounds like a nightmare trying to keep everything working consistently and efficiently - we've had plenty of issues just with different Theano optimizations/implementations creating corner cases.
I'm with @ebenolson here, which is why I think in the long term, having separate libraries is the way to go. The design principles underlying Lasagne, and even most of the interface choices we have made, are not closely tied to Theano. They could just as well be applied in a library built on top of TensorFlow instead (or even MXNet, CGT, ...). I don't think this would be too much work -- for one, other people could work on it (i.e., TensorFlow users), and there would be no extra burden for either "side" of the divide to keep up with the other. They would just follow the same design principles and make similar interface choices. An added benefit would be that it would be really easy to switch from one to the other (although this would not be a primary goal, more of a side effect).
Agree with @ebenolson and @benanne. |
Will Lasagne team members still be involved in its development, or oversee the major design decisions somehow? IMO it's important that the Lasagne team has a say in what happens: to have some sort of quality assurance if this sister library carries the Lasagne label, and (to a lesser extent) to ensure that credit is given where it is due. Anyone can copy Lasagne and replace the theano operations by their tensorflow-equivalents, but high test-coverage and active discussion on issues and PRs is just as important -- if not more.
Absolutely. I think most of the Lasagne core team is still using Theano almost exclusively, so it might take a while before such an undertaking even becomes feasible.
With the current number of active developers we won't be able to develop two libraries in parallel. Even on Lasagne alone we're not exactly progressing fast (partly because we're thinking things through extensively before making design decisions, but also because participation in those discussions is... mhm... choppy). Maybe the most likely scenario is that we'll switch over to TensorFlow if and when it becomes more mature and faster than Theano (for now it has more disadvantages than advantages, at least for what I'm using Lasagne for). I.e., at some point we might create a sister library and focus on that, and just oversee porting things back to the Theano side as needed. Who knows!
What about using something like: https://github.com/dementrock/tensorfuse ?
Sounds good in theory, but that project looks quite limited and has no tests... also we would probably miss out on some features that TF offers over Theano. I think whatever we end up doing for the TF case, it should be as 'vanilla' as possible, because one of the key ideas behind Lasagne is not to hide the underlying library behind abstraction layers.
+1 -- a Lasagne for TensorFlow will probably look a bit different in some respects, as far as needed to support multi-GPU and distributed models. I haven't looked at the TensorFlow API in detail, but I doubt a generic theano-to-tensorflow wrapper would suffice.
This just got open-sourced: https://github.com/tensorflow/models/blob/master/inception/slim/README.md Some of the design principles are quite similar to those of Lasagne, although there are a few things that are not as flexible as they could be, imo (most notably weight initialization). Still, worth taking a look at :)
Link in the previous comment is now broken. Here's a working one: https://github.com/tensorflow/models/blob/master/inception/inception/slim/README.md
One more cross-breed: https://github.com/openai/rllab/blob/master/sandbox/rocky/tf/core/layers.py#L377
Interesting, thanks for the heads-up!
The idea comes from Keras: https://github.com/fchollet/keras/tree/master/keras/backend
Some functions have the same functionality, but the two backends can return results in different formats. However, that is not a big issue if we abstract them carefully.
The biggest advantage of adding a TensorFlow backend is that it significantly speeds up the build-and-run cycle on CPU, which helps a lot during development.
I ported updates.py and objectives.py from Lasagne to support multiple backends. Calculating the objective itself can be 3 times faster than with Theano on CPU.
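As a rough illustration of what a backend-agnostic objectives function could look like, here is a minimal sketch under stated assumptions: the abstract ops `mean` and `square` are faked with NumPy (a real port would resolve them to Theano or TensorFlow at import time), and the `squared_error` signature shown here is illustrative, not a claim about the actual ported code.

```python
# Illustrative sketch: an objective written only against abstract
# backend ops, so the same source runs under either backend. NumPy
# stands in for the resolved backend here.
import numpy as np


def mean(x):
    # Stand-in for the backend's mean op (e.g. T.mean / tf.reduce_mean).
    return np.mean(x)


def square(x):
    # Stand-in for the backend's square op (e.g. T.sqr / tf.square).
    return np.square(x)


def squared_error(predictions, targets):
    """Mean squared error expressed purely in terms of backend ops."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return mean(square(predictions - targets))
```

Because the objective never touches a backend-specific type directly, swapping the `mean`/`square` bindings is the only change needed to move it between frameworks.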
I am aware that some Lasagne modules are specialised for Theano only; however, I think it is possible to adapt all common layers to both frameworks.