Fixed colab dependency versions #76
## Detailed changes

- Minor updates on instructions to run the colab file
- Made the colab file install torch 1.4.0+cu100, torchvision 0.5.0+cu100 and scipy 1.1.0 (thanks @CyFeng16)
- Made the colab file clone the code from the official repo (this can now be done since the main changes are merged in it)
- Removed warning messages when running interpolation (thanks @CyFeng16)
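As a rough illustration of the version pins listed above, the following sketch compares the pinned versions with whatever is installed in the current runtime. It is not part of the notebook: the helper names `installed_versions` and `mismatches` are hypothetical, and it assumes Python 3.8+ (for `importlib.metadata`) and that the installed package metadata reports the `+cu100` suffix verbatim.

```python
from importlib import metadata

# Versions named in the PR description.
PINNED = {
    "torch": "1.4.0+cu100",
    "torchvision": "0.5.0+cu100",
    "scipy": "1.1.0",
}


def installed_versions(names):
    """Look up the installed version of each package (None if missing)."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(pinned, installed):
    """Return {package: (expected, found)} for every pin that is not met."""
    return {
        name: (want, installed.get(name))
        for name, want in pinned.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    report = mismatches(PINNED, installed_versions(PINNED))
    for name, (want, got) in report.items():
        print(f"{name}: expected {want}, found {got}")
```

Running a check like this right after the install cell would show whether the runtime picked up the intended versions before the CUDA extensions are built.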
Hi @AlphaGit The error is obviously an out of memory error:
Is there a way to make it run given that hardware? Is it possible to change any parameter of the network to make it fit? Can it run on the CPU, even if it takes a long time? The same question applies to running it at 1080p. Thanks a lot for your work and PR. Kind regards.
Hey there @betegon! No, unfortunately I'm not aware of a good way to make that happen. I know that some pieces of software based on DAIN (like GRisk/DainApp) just split the image into smaller ones and perform multiple interpolations for each frame. That works, but it mainly circumvents the problem. I guess you could simulate the same approach with ffmpeg commands. Aside from that, I would look into performing some network pruning on DAIN's stored models, but that seems like a sizable effort on its own. If you actually did this, not only would you solve the memory problem, but it would also run significantly faster.
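The tiling workaround described above can be sketched as follows. This is only an illustration of how a frame could be divided into overlapping tiles, not how DainApp actually does it: the function name `tile_boxes`, the tile size and the overlap are all assumptions.

```python
def tile_boxes(width, height, tile, overlap=0):
    """Return (left, top, right, bottom) boxes that cover a width x height frame.

    Neighbouring tiles share `overlap` pixels so that seams can be blended
    when the interpolated tiles are stitched back together.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    step = tile - overlap

    def starts(size):
        # A frame smaller than one tile needs a single tile starting at 0.
        if size <= tile:
            return [0]
        # Regular grid, then one final tile pinned to the far edge.
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    # Hypothetical numbers: a 1280x720 frame, 512-pixel tiles, 32-pixel overlap.
    for box in tile_boxes(1280, 720, tile=512, overlap=32):
        print(box)
```

Each box could then be cut out of both input frames (for example with ffmpeg's `crop` filter), interpolated on its own, and pasted back at the same coordinates. Smaller tiles lower peak GPU memory at the cost of more interpolation passes and possible visible seams.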
Hi @AlphaGit I have used your colab with GPU Tesla K80, 418.67, 11441 MiB on a video in 480p (https://www.youtube.com/watch?v=gWemAUjHo4U). Everything works correctly up to the cell:
where I get the following error:
Do you think this means that there are things to fix, or that it comes from the GPU (the Colab says that it has not been tested on the K80)? Thanks a lot for your work
Hi @lbourdois! I have been investigating a bit and I suspect that the error you mention might not be related to the changes in this PR. Unfortunately, I cannot be entirely sure because Google Colab won't give me a K80, so I cannot verify my hypothesis. However, I'll tell you how you can test it yourself, and you can tell me whether this approach worked. If it works, feel free to submit a new PR with the changes. If it doesn't, I think you could open a new issue and we can investigate further.

The error seems to be related to the NVIDIA drivers and the actual hardware that is running the compiled CUDA code. According to this thread, this issue happens on a K80 when the code being compiled doesn't target that GPU's compute capability: NVIDIA/flownet2-pytorch#86 (comment). If that's the case, you can find the exact compute capability for K80s in this page (it's 3.7), so you would need to modify the nvcc arguments to include `'-gencode', 'arch=compute_37,code=sm_37',`.

This is how I would do it in your case, so you don't have to deal with cloning the repo and modifying the Colab notebook:
If it works, that was the cause and you have the solution. If it doesn't, then this isn't it and we'll need to investigate further.
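For illustration only, here is a small sketch of how `-gencode` entries like the one quoted above could be generated for a list of compute capabilities. The helper `gencode_flags` is hypothetical and does not exist in the DAIN repository; `3.7` is the K80 value from this thread, while `7.5` (Tesla T4) is added purely as a second example.

```python
def gencode_flags(capabilities):
    """Build nvcc '-gencode' arguments for the given compute capabilities.

    Each capability is a string such as "3.7"; the dot is dropped to form
    the compute_XX / sm_XX names that nvcc expects.
    """
    flags = []
    for capability in capabilities:
        arch = capability.replace(".", "")
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags


if __name__ == "__main__":
    # "3.7" is the K80 value discussed above; "7.5" is only an example.
    nvcc_args = gencode_flags(["3.7", "7.5"])
    print(nvcc_args)
```

The resulting list has the same shape as the entries in an `nvcc_args` list, so the new pair is appended alongside the existing ones instead of replacing them.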
@AlphaGit Regarding the K80 problem, I kept restarting the Colab for an hour to follow your instructions, but I was not assigned a K80 again.
@AlphaGit hello, you speak Spanish, right? Nice to meet you! And thank you very much for your contributions. Thanks a lot for your help, best regards.
@alphayome Hi! Yes, I speak Spanish. Glad to be able to help. :) Instead of using a link, I recommend downloading the .ipynb file and uploading it to Google Colab. The problem with links is that they go stale very easily when more than one person is working on the file, since we are not all working on the same copy. What you say makes me think we should put some kind of version marker in that same file. I will probably do that along with some other change.
@AlphaGit thanks for the reply.
@alphayome The copy on this repository's master branch should always be the authoritative version. https://github.com/baowenbo/DAIN/blob/master/Colab_DAIN.ipynb
@AlphaGit The short answer is that nobody knows, haha; there is no clear way to determine what Google is going to give you. I have no problem giving you a hand with whatever you need, but I would rather not do it in this repository unless it is a problem with the code. That is to avoid generating noise for the original owners. I am afraid that too many updates would push them to stop paying attention here, which would be unfortunate for everyone. You can contact me privately at [email protected]; I will gladly give you a hand with the project you are working on. Cheers!
Understood @AlphaGit, I will write to you on Gmail. Regarding this thread, I ran the CUDA step and the streaming output was truncated to the last 5000 lines.
@AlphaGit I was looking at the solution you gave lbourdois about modifying the setup in my_package\DepthFlowProjection\setup.py. Do I have to change all the lines so that they are the same, as I attach below? `#!/usr/bin/env python3 from setuptools import setup, find_packages cxx_args = ['-std=c++11'] nvcc_args = [ setup(` Thanks for your help. (I replied here because it is related to this thread.)
No, you just need to add them. Make sure you use the right version of the |
@lbourdois Hey there! I actually got a Tesla T4 and was able to test our hypothesis. Yes, it works! I will soon be sending a patch to address the missing Colab GPU model kernels.
Fixes #44.