fixes for support of new backend #836
Conversation
@f0k , the tests are now failing due to this error:

```python
if theano.config.device == 'cuda' and theano.gpuarray.pygpu_activated:
    ...
elif theano.config.device == 'gpu' and theano.sandbox.cuda.cuda_enabled:
    ...
```

Hopefully this would remove the issue. Is there a case where […]
The CPU tests passed. How do you run the gpu and cuda tests?
So I checked the mock that you use for Sphinx. It seems that since we are supporting both backends, we will not need to change that code and it should still work. Although if you prefer, I can change it to support […]
lasagne/layers/corrmm.py (Outdated)

```python
except:
    raise ImportError('sandbox.cuda is not available in your\
 version of theano.')
else:
```
I think here, we need to also take into account the case where device == 'cpu', but either back-end has manually been activated by a call to `use(...)`.

Also, you may need to try to import `theano.gpuarray` (in the first case) or `theano.sandbox.cuda` (in the other case).
Yes, gpuarray was still in sandbox in that version. It'd be great if the check didn't depend on this, so we stay compatible with Theano 0.8.2 for the next Lasagne release (I'm not 100% sure how useful this is, but I didn't want to drop backwards compatibility too early -- I'd bump the requirement to 0.9 immediately after).
Yes, see #606 for the latter case, and I guess the former case will be similar. So instead, we should try to import gpuarray from both locations. Something like:

```python
try:
    from theano import gpuarray
except ImportError:
    from theano.sandbox import gpuarray
gpu = gpuarray.pygpu_activated
if not gpu:
    try:
        from theano.sandbox import cuda
    except ImportError:
        gpu = False
    else:
        gpu = cuda.cuda_enabled
```

This would tell whether a GPU is available and activated for use.
This can only be done locally, using pytest: http://lasagne.readthedocs.io/en/latest/user/development.html#testing
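For reference, selecting a backend when running the suite locally is done through Theano's standard `THEANO_FLAGS` environment variable; a typical set of invocations (the exact flag combinations below are illustrative, not something mandated by this PR) would be:

```shell
# CPU only
THEANO_FLAGS=device=cpu,floatX=float32 py.test

# old CUDA backend (theano.sandbox.cuda)
THEANO_FLAGS=device=gpu,floatX=float32 py.test

# new libgpuarray backend (theano.gpuarray)
THEANO_FLAGS=device=cuda,floatX=float32 py.test
```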
Oh yes, that's true. Let's leave it as it is then; I can change it when dropping support for the old backend.
To also mingle in the different imports:

```python
try:
    from theano import gpuarray
except ImportError:
    from theano.sandbox import gpuarray
gpu = gpuarray.pygpu_activated
if gpu:
    dnn = gpuarray.dnn  # or whatever module/Op is needed
else:
    try:
        from theano.sandbox import cuda
    except ImportError:
        gpu = False
    else:
        gpu = cuda.cuda_enabled
        dnn = cuda.dnn  # or whatever module/Op is needed
```

After this preparation, we can throw the necessary errors depending on […]
@f0k , thanks. I'm having problems running the tests. Could you let me know which versions of Theano this should work on?
It should work on Theano 0.8.2, Theano 0.9.0 and the current bleeding-edge version. (The […])

What exactly? Let me know if I can help!
@f0k , thanks for your reply. I don't know why part of your tests have failed. Do you suggest any fixes? I have a rather weird problem. Also, with Theano 0.9.X, when I set the flag to use gpu, this import can be done: `from theano.sandbox import gpuarray as gpu`, but `gpu.pygpu_activated` is False. I manually double-checked `theano.config.device` and it was gpu. Do you have any idea why `pygpu_activated` doesn't work?
The Theano flag set to "gpu" means that it is sandbox/cuda that is used, so it is normal that pygpu (the new back-end) is not activated.
The tests were fine; the coverage decreased. As before, the parts that require a GPU must be marked as […] And thanks for separating out the GPU check, but we cannot place it in […]
Just added some more comments on the code. But I'd really like the implementation in #836 (comment) instead of a centralized one.
lasagne/layers/dnn.py (Outdated)

```diff
@@ -1,6 +1,4 @@
 import theano
-from theano.sandbox.cuda import dnn
-
```
The empty line was on purpose; PEP 8 suggests to group imports into standard library, third-party, and internal (https://www.python.org/dev/peps/pep-0008/#imports).
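For illustration, the grouping PEP 8 suggests looks like this (the `numpy` import is just a stand-in for any third-party dependency; the local import is shown commented out so the sketch stays self-contained):

```python
# Standard-library imports come first
import os
import sys

# Third-party imports follow, after a blank line
import numpy as np

# Local application imports come last, after another blank line;
# in dnn.py this group would hold Lasagne-internal imports, e.g.:
# from .. import init
```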
ok, I will revert it.
lasagne/layers/dnn.py (Outdated)

```python
else:
    raise ImportError("requires GPU support -- see http://lasagne.\
readthedocs.org/en/latest/user/installation.html#gpu-support")
if not dnn_available:
```
You sometimes call it `dnn_avail` and sometimes `dnn_available`.
Thanks, done.
lasagne/layers/dnn.py (Outdated)

```diff
-                "latest/user/installation.html#cudnn" %
-                dnn.dnn_available.msg)  # pragma: no cover
+                "cuDNN not available: %s\nSee http://lasagne.readthedocs.org\
+/en/latest/user/installation.html#cudnn")  # pragma: no cover
```
You removed the helpful message Theano puts into `dnn_available.msg` when the import failed, but you left in the `%s`.
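The pattern being asked for, sketched with a dummy stand-in for Theano's `dnn_available` (the real one is a callable carrying a `.msg` attribute; the function body and message text here are made up so the sketch runs without Theano):

```python
# Dummy stand-in: the real theano dnn_available is a callable whose
# .msg attribute explains why cuDNN is unavailable.
def dnn_available():
    return False

dnn_available.msg = "hypothetical reason: cuDNN library not found"

# Keep both the %s placeholder and the value fed into it:
error_text = ("cuDNN not available: %s\nSee http://lasagne.readthedocs.org/"
              "en/latest/user/installation.html#cudnn" % dnn_available.msg)

if not dnn_available():
    exc = ImportError(error_text)  # dnn.py would `raise` this
```

Keeping the `%s` while dropping the `% dnn_available.msg` (as in the diff above) leaves a literal `%s` in the user-facing message.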
I wasn't sure that there is such a message for `dnn_present()`. I will double check and try to keep the message.
@f0k , thanks. I didn't know about […]
I ran the tests on the GPU and there are two failing tests that I will work on tomorrow. It is […]
lasagne/layers/dnn.py (Outdated)

```diff
@@ -31,8 +32,7 @@
 if not dnn_avail:
     raise ImportError(
         "cuDNN not available: %s\nSee http://lasagne.readthedocs.org\
-/en/latest/user/installation.html#cudnn"
-        % dnn.dnn_available.msg)  # pragma: no cover
+/en/latest/user/installation.html#cudnn")  # pragma: no cover
```
I removed it for now so that the tests would pass, and will fix it later.
lasagne/tests/layers/test_noise.py (Outdated)

```diff
@@ -138,7 +138,7 @@ def test_get_output_for_deterministic(self, layer):
     input = theano.shared(numpy.ones((3, 3)))
     result = layer.get_output_for(input, deterministic=True)
     result_eval = result.eval()
-    assert (result_eval == input.eval()).all()
+    assert (np.array(result_eval) == np.array(input.eval())).all()
```
Ah... so in the new backend, `eval()` does not always return an ndarray?
It returned a gpuarray ndarray, and I think the problem is that the equal method is not defined properly. So I had to convert it to an `np.array`.
I see, it returns a `GpuArray` for which `==` only checks for identity. A shorter variant would be `np.allclose(result.eval(), input.eval())`, which handles the conversion. The tests above will need to be changed as well then -- they assert that the comparison is false, but if the comparison only checks for identity, this is trivially fulfilled.

Also, for some reason the input is not floatX; I think this was not intended. It's not terribly important for the test; we can rely on Theano that CPU and GPU versions are the same (we still might want to run the test suite with float64 warnings enabled and see if there are more glitches like this). But why is the result still a gpuarray then? Ah, I see, it's a float64 GpuArray!

Phew. I think we may need to go through all the tests and see if they're using `==` or `!=` where they should use `np.allclose` instead.
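The distinction is easy to demonstrate with plain NumPy (this can't reproduce GpuArray's identity-only `==` without a GPU, but it shows why `np.allclose` is the safer assertion: it converts its inputs to arrays before comparing, and it accepts tolerances):

```python
import numpy as np

a = np.ones((3, 3))
b = np.ones((3, 3))

# For ndarrays, == compares element-wise, so both assertions hold here;
# for a wrapper type whose == only checks object identity (as reported
# for GpuArray above), the first form would silently test the wrong thing.
assert (a == b).all()
assert np.allclose(a, b)

# allclose also absorbs tiny float discrepancies, tunable via atol/rtol:
assert np.allclose(1e-9, 0.0, atol=1e-8)       # within tolerance
assert not np.allclose(1e-6, 0.0, atol=1e-8)   # outside tolerance
```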
I did actually check `np.allclose` and it was true. So I was really surprised at first that `==` gave false. I think changing all the tests could be another PR, but for now I will change this one to `np.allclose` and test it with float64 and float32.
> I think changing all the tests could be another PR

Yes, this doesn't have to happen on the spot.

> I will change this one to `np.allclose`

👍

> and test it with float64 and float32

Please only test with floatX -- we can run the suite with different floatX settings if we want to.
Thanks, yes that's what I meant. I will use floatX in the code and test it with the Theano flags floatX=float32 and floatX=float64.
lasagne/tests/layers/test_special.py (Outdated)

```diff
-    return lambda shape: (np.arange(np.prod(shape)).reshape(shape)) \
-        / np.prod(shape)
+    return lambda shape: ((np.arange(np.prod(shape)).reshape(shape)) /
+                          np.prod(shape)).astype(theano.config.floatX)
```
Uh! I hadn't noticed when reviewing #446. This is wrong -- it will always return an array of zeros in Python 2. We could add a `__future__` import, but I'd prefer to fix this as:

```python
from lasagne.utils import floatX
return lambda shp: (floatX(np.arange(np.prod(shp)).reshape(shp)) /
                    floatX(np.prod(shp)))
```

In theory, I should do a separate PR for this, but then you'd need to undo your change and rebase -- can you just edit it with my fix?
Yes, I made this change, but there still seemed to be a small difference in the values for some tests. So I changed the `atol` of `np.allclose` from the default value of 1e-08 to 1e-07 in two cases.
I saw that you had two imports for numpy in this file: lasagne/tests/layers/test_noise.py
I don't know why the coverage test failed and what should be done. I will wait for @f0k to review the PR completely, and then I will run the tests for device = cpu, gpu, cuda with float32 and float64 using Theano dev, 0.8 and 0.9 versions.
Can you make an issue in libgpuarray about == and !=? They should behave
like numpy.
@nouiz I created an issue here: Theano/libgpuarray#443
Okay, thanks... many tests have […]

Hmm, this is indeed strange. It says https://coveralls.io/builds/11709461/source?filename=lasagne%2Futils.py#L371 is not covered although it was covered before. It must have been accidentally covered by one of the tests that you changed, maybe via the […]

Sorry for all the unrelated problems this triggered, and sorry for being so slow with reviewing; I'm writing my thesis. If you don't mind, I can try to take over this PR, rearrange some commits, and add the missing test -- this might be easier than going back and forth all the time. Let me know if that's okay with you. Otherwise, the next step would be to add a test that covers the line that's currently missing.
No problem at all. There is no rush. I created a PR for the missing tests; I will rebase after that one gets merged. Regarding numpy, I can get rid of that commit if you'd prefer and create a separate PR for it.
Great, thank you!

Yes, that'd be better, so that the history is easier to understand afterwards. It'd be best to address all […]
Force-pushed 6abb084 to b379c5d
@f0k , so I rebased, removed the commit regarding numpy, and squashed some other commits. Please let me know if the rest is good. Then I will test it with different devices and different versions.
It's close now. I have some comments, and regarding the commits, I'd prefer:
- changed equality to np.allclose and change in np.atol
- everything else squashed in a single commit that introduces the backend support and updates the directly related tests
lasagne/layers/corrmm.py (Outdated)

```python
        "latest/user/installation.html#gpu-support")  # pragma: no cover
if enabled:
    from gpu.basic_ops import gpu_contiguous
    from gpu.blas import GpuCorrMM
```
Looks nice except for two things:

- Burying the second `if not enabled` within the first `if not enabled` is a bit confusing; I'd place it as the `else` part.
- Calling everything `gpu` is a nice idea, but as we will remove support for `sandbox.gpuarray` and `sandbox.cuda` afterwards, I'm not sure we should invent a new name for the `gpuarray` module. Hmm. But `from theano.sandbox import cuda as gpuarray` could be confusing as well.
- I think it won't work this way. `from gpu.basic_ops import gpu_contiguous` will try to literally import something from a "gpu.basic_ops" module, and not care that `gpu` was assigned to a module instance before:

```
>>> import scipy as sc
>>> from sc import sinh
ImportError: No module named sc
>>> from scipy import sinh
```

So I would suggest the following variant:

```python
# check if Theano's new GPU backend is available and in use
try:
    from theano import gpuarray as gpu
except ImportError:
    from theano.sandbox import gpuarray as gpu
gpu_enabled = gpu.pygpu_activated
# if not, try to fall back to Theano's old GPU backend
if not gpu_enabled:
    try:
        from theano.sandbox import cuda as gpu
    except ImportError:
        gpu_enabled = False  # [I know it's False already but it is easier to follow]
    else:
        gpu_enabled = gpu.cuda_enabled
# if either of the backends is available, use it, otherwise bail out
if gpu_enabled:
    gpu_contiguous = gpu.basic_ops.gpu_contiguous
    GpuCorrMM = gpu.blas.GpuCorrMM
else:
    raise ImportError(...)
```
Yes this is nicer.
lasagne/layers/dnn.py (Outdated)

```python
if not dnn_avail:
    raise ImportError(
        "cuDNN not available:See http://lasagne.readthedocs.org\
/en/latest/user/installation.html#cudnn")  # pragma: no cover
```
You're still missing the message here. In line with the suggestion above, I would suggest:

```python
# check if Theano's new GPU backend is available and in use
try:
    from theano import gpuarray as gpu
except ImportError:
    from theano.sandbox import gpuarray as gpu
gpu_enabled = gpu.pygpu_activated
dnn_enabled = gpu.dnn.dnn_present
# if not, try to fall back to Theano's old GPU backend
if not gpu_enabled:
    try:
        from theano.sandbox import cuda as gpu
    except ImportError:
        gpu_enabled = False  # [I know it's False already but it is easier to follow]
    else:
        gpu_enabled = gpu.cuda_enabled
        dnn_enabled = gpu.dnn.dnn_available
# if either of the backends is available, use it, otherwise bail out
if gpu_enabled:
    if dnn_enabled():
        dnn = gpu.dnn
    else:
        raise ImportError("cuDNN not ..." % dnn_enabled.msg)
else:
    raise ImportError("requires GPU ...")
```
I will change this one as well.
lasagne/tests/layers/test_conv.py (Outdated)

```diff
@@ -7,6 +7,25 @@
 import lasagne
 from lasagne.utils import floatX, as_tuple

+theano_backend = "cpu"
```
You can remove this line.
lasagne/tests/layers/test_conv.py (Outdated)

```python
theano_backend = "pygpu_sandbox"
gpu = gpuarray.pygpu_activated
if not gpu:
    theano_backend = "cpu"
```
You can remove this line.
Force-pushed 87c9471 to f88d104
I don't get these errors locally when I run […]

This is with Theano 0.8.0, with no GPU back-end present.
More comments. Also, I'd prefer to have the `np.allclose` commit first. This way every commit is a fully working version of Lasagne.
```python
else:
    raise ImportError(
        "requires GPU support -- see http://lasagne.readthedocs.org/en/"
        "latest/user/installation.html#gpu-support")  # pragma: no cover
```
Can you please also include the comments I had in my suggestion? (Except for the one in square brackets.)
lasagne/layers/dnn.py (Outdated)

```python
dnn_enabled = gpu.dnn.dnn_present
if not gpu_enabled:
    try:
        from theano.sandbox import cuda as gpu
```
As per @lamblin's comment, we need to explicitly import dnn for the old backend. The simplest fix would be to add a line here (before the `except`): `import theano.sandbox.cuda.dnn`. This will also make dnn available as `gpu.dnn`.

A cleaner version would be to assign `dnn = gpu.dnn` above, after `dnn_enabled = ...`, and add `from theano.sandbox.cuda import dnn` here, and change the `if/else` construct below, but I'd be fine with the simpler solution.
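The reason `import theano.sandbox.cuda.dnn` makes dnn reachable as `gpu.dnn` is a general Python import rule: importing a dotted submodule binds it as an attribute of its (already imported) parent package and registers it in `sys.modules`. A quick standard-library demonstration, using `xml`, whose submodules are not loaded by a bare `import xml`:

```python
import sys

# Importing the dotted submodule binds `dom` on the parent `xml` package,
# so any existing reference to that package object sees it too.
import xml.dom

assert hasattr(xml, "dom")       # reachable as an attribute of the parent
assert "xml.dom" in sys.modules  # and registered in sys.modules
```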
```python
        dnn.dnn_available.msg)  # pragma: no cover

        "requires GPU support -- see http://lasagne.readthedocs.org/en/"
        "latest/user/installation.html#gpu-support")  # pragma: no cover
```
Again, please include the comments I had in my suggestion.
lasagne/layers/dnn.py (Outdated)

```python
raise ImportError(
    "cuDNN not available:See http://lasagne.readthedocs.org\
/en/latest/user/installation.html#cudnn\
" % dnn_enabled.msg)  # pragma: no cover
```
Be careful: you included the `% dnn_enabled.msg` but not the `%s`.
Force-pushed f88d104 to c31b5f4
@f0k , thanks. Unfortunately I don't have more time for this ticket this week. There is a failing test for the coverage. Also, I think we didn't really need to change the order of the commits. People usually update their repo with a merge commit, and not every commit needs to be a working version on a dev branch. I had to resolve 3 conflicts to change the order of these two commits.
Having a clean history is not needed for updating -- otherwise we could also leave all the separate fixes in there. It's for tracing back potential problems later. E.g., with a clean history, you can use […]

Sorry, something must have gone wrong then. The two commits change different files, so it should be possible to change their order freely with a […]

This is weird; it means the code has changed. I'll look into it. Thanks for all the work!
Interesting. There's a line that's only covered in Python 2. Usually coveralls compiles its report from both the Python 2 and Python 3 runs: https://coveralls.io/builds/11789199
It did!

There isn't. We can either exclude it from coverage completely, write a different .coveragerc for Python 3, or rely on coveralls to merge the Python 2 and 3 reports as it should. I guess I'll leave it as it is for now. I'll merge this PR in the evening if there are no further comments.
@f0k , perfect! Thanks a lot!
@f0k , I wanted to test this with device = cpu, gpu, cuda and floatX=float32,float64, and Theano versions 0.8.X, 0.9.X and master. I'll do the tests over the weekend and update you.
Okay, thanks! I'll hold off on merging then, just in case. When testing 0.8.x, use 0.8.2 if in doubt. The requirement is currently set to 0.8.0, but I'll update it for releasing Lasagne 0.2.
Hey @f0k , so I tested everything and removed the .pyc files and Theano cache after each test:

1. On theano/master, with device=gpu, we have this error, which is expected: `ValueError: You are tring to use the old GPU back-end. It was removed from Theano. Use device=cuda* now.`
2. On theano/0.8.X with device=gpu and floatX=float64, we have many errors. I don't know if it was working before and whether these are expected. The log is attached here: https://github.com/Lasagne/Lasagne/files/1052332/8_gpu_64.txt. It works fine with float32.

The rest were all fine.
The old back-end (device=gpu) only supports float32, so testing it with float64 isn't a good thing to do, as many things will fail. So it seems all is good.
Thank you for the extensive test! So the errors you had with float64 and device=gpu were all […] Merging this, thanks again!
Hi @f0k , we are willing to contribute more to Lasagne, especially on the issues that would help with the release of Lasagne so that the old backend of Theano won't be used anymore. Do you have any suggestions where I can help?
That's great! I'll also be able to contribute more myself when I've finished my thesis (probably in a month), but help for releasing 0.2 would be greatly appreciated. It's taking much longer than anticipated.
In general, the issues/PRs to be addressed for 0.2 are listed in this milestone: […]
Great! For the future: with "just leave a post" I meant "leave a post at the corresponding issue that you plan to work on" :) Just so nobody else takes it (which is unlikely given past experience, but anyway).
This PR is a first attempt to fix #698, I have not added the fix for sphinx yet.