
Add stable version of Brain Observatory Toolbox and Deep Interpolation and update from MATLAB 2023a to MATLAB 2023b #150

Merged
satra merged 2 commits into dandi:dandi on Apr 26, 2024

Conversation

@aranega (Contributor) commented Apr 25, 2024

This PR adds the Deep Interpolation Toolbox, moves from the bleeding-edge version of the Brain Observatory Toolbox to the stable version 0.9.3.5 (compatible with MATLAB 2023), and upgrades the image from MATLAB 2023a to MATLAB 2023b.

@vijayiyer05 commented Apr 25, 2024

Thank you @aranega for this PR.

I investigated the error we saw: it arises because BOT v0.9.4 uses a dictionary method introduced in MATLAB R2023b.
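
For illustration, a minimal way to surface that kind of incompatibility (assuming a R2023b-only function such as configureDictionary; the actual method BOT v0.9.4 calls may be a different one):

```sh
# Hypothetical repro: this errors on R2023a but runs on R2023b, because
# configureDictionary was introduced in R2023b (the exact dictionary
# method BOT uses may differ).
matlab -batch "d = configureDictionary('string', 'double'); disp(numEntries(d))"
```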

Based on this, I suggest updating this PR ASAP to:

  1. Go forward to R2023b (rather than deferring this decision, as we discussed today)
  2. Try again w/ BOT v0.9.4 & return to that version if it works in your hands w/ R2023b

(only after testing by @SimaoBolota-MetaCell of course!)

@aranega (Contributor, Author) commented Apr 25, 2024

Got it! I'll make the modifications and run first tests later today. I'll update this PR with the R2023b version and the new version of BOT :)

@aranega (Contributor, Author) commented Apr 26, 2024

@vijayiyer05 @satra The pull request is ready to go. As discussed by email, we will stick with version 0.9.3.5 for the moment, as 0.9.4 introduces constraints on the folder naming convention that make BOT misinterpret some paths.

This pull request also includes the upgrade from MATLAB 2023a to 2023b.

@aranega aranega changed the title Add stable version of Brain Observatory Toolbox and Deep Interpolation Add stable version of Brain Observatory Toolbox and Deep Interpolation and update from MATLAB 2023a to MATLAB 2023b Apr 26, 2024
@vijayiyer05 commented
@satra You must be very inspiring! The upcoming visit on Monday inspired several refinements, in the hope of the best possible demo status for the base container. If you're able to process a second PR this week, it would be much appreciated. See you Monday.

@satra satra merged commit 3093267 into dandi:dandi Apr 26, 2024
@kabilar (Member) commented May 1, 2024

Hi @aranega, thanks for these contributions. I am not sure why, but the Docker images (GPU, MATLAB, GPU+MATLAB) have been failing to build since April 25. I have pasted the logs in #152. Feel free to investigate. We will also look into the issues, but may not get to them for a little while since we are currently focused on refactoring our deployment of JupyterHub. Thanks.

@aranega (Contributor, Author) commented May 1, 2024

Hi @kabilar, thanks a lot for the details and your message. I checked the logs for all the Dockerfiles, and they all fail for different reasons:

  • The MATLAB Dockerfile fails because of mpm, the MATLAB package manager. This is new; I'll see if I can reproduce the issue locally (a repro sketch follows below). If I can, it might be due to an update of mpm, in which case I'm not sure what I can do. I'll tell you more as soon as possible.
  • The MATLAB GPU Dockerfile fails because of a lack of disk space on the machine that builds the Docker image, which is odd. That said, if there were an issue with mpm, we should hit the same failure as in the MATLAB Dockerfile even without the disk-space problem.
  • The GPU image build apparently fails due to issues with the version of the NVIDIA runtime libs installed from conda. For those, I guess it's worth a shot to try newer versions, such as the ones in the MATLAB GPU Dockerfile.
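
A minimal local repro attempt along those lines, using the Dockerfile names that appear later in this thread (Dockerfile.matlab, Dockerfile.gpu; tag names are placeholders):

```sh
# Rebuild each failing image locally to separate mpm or conda problems
# from infrastructure problems (network, disk space) on the autobuild machine.
docker build -f Dockerfile.matlab -t dandi-matlab:test .
docker build -f Dockerfile.gpu -t dandi-gpu:test .
```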

If you have ideas about the disk space, or if you know of recent mpm updates, please let me know; that would help the investigation 🙂

@satra (Member) commented May 2, 2024

@aranega - if you build locally and push to your own dockerhub, docker will reuse the layers when building with the autobuild for dandi.
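
A sketch of that flow, reusing the repository and tag names mentioned later in this thread (the Dockerfile name Dockerfile.matlab is taken from the size table below):

```sh
# Build the MATLAB image locally and push it to a personal Docker Hub
# repository; the dandi autobuild can then reuse the cached layers.
docker build -f Dockerfile.matlab -t vincentaranega/dandi:matlab .
docker push vincentaranega/dandi:matlab
```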

@aranega (Contributor, Author) commented May 8, 2024

@satra Thanks for the information :) and sorry for the delay; with my unstable connection, it was hard to push an image to my Docker Hub repo. I just finished pushing the MATLAB image. The repository is vincentaranega/dandi, and the base MATLAB image is tagged vincentaranega/dandi:matlab. How can I retrigger a build?

Checking the logs of the previous failed builds for MATLAB and MATLAB-GPU, I get the impression that the failures are not related to the Dockerfiles but to other factors (a network issue for the MATLAB image and a free-disk-space issue for the MATLAB-GPU image). When that happens, what is the easiest way to retrigger a build?

@satra (Member) commented May 9, 2024

unfortunately that's something we have to trigger. i just did. let's see what it does.

@kabilar (Member) commented May 10, 2024

Hi @aranega, the image builds seem to have failed again due to the runners not having enough disk space.

  1. The MATLAB Docker image build results in "Error: Unable to install software. Check that your device has enough free disk space."
  2. The GPU+MATLAB Docker image build results in "...no space left on device..."

I wasn't able to find a solution. There are several posts on the Docker Community Forums, but none have an answer. I have emailed Docker support to ask how we can increase the disk space for the runners.

@kabilar (Member) commented May 10, 2024

Hi @aranega @vijayiyer05,

Response from Docker Support:

  "Unfortunately, it isn't possible to increase the disk space for runners; you will need to try and downsize the image any way you can, as this is likely the cause of the error."

Current sizes:

Image                   Size
Dockerfile              3.6 GB
Dockerfile.matlab       7.3 GB
Dockerfile.gpu          7.7 GB
Dockerfile.gpu.matlab   13.8 GB
  1. The DANDI team will need to work on reducing the GPU image size and fixing the NVIDIA libraries.
  2. Perhaps you would be able to make the MATLAB images smaller with the following steps (see the sketch below):
    1. Remove some of the Python packages? Are there any dandi/example-notebooks or example-live-scripts that run MATLAB scripts and would also need any of the above Python packages?
    2. Further optimize the MATLAB dependencies, as they appear to add 3.7 GB for the non-GPU image and 6.1 GB for the GPU image.
  3. Although we may move to GitHub Actions to build the images, the standard runners won't be able to handle the MATLAB+GPU image. GitHub offers larger runners, but at a cost.

Thanks.
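
A hedged sketch of the MATLAB slimming idea, assuming the images install MATLAB via mpm as in the MathWorks reference Dockerfiles; the product list is a placeholder to be matched to what the notebooks actually need:

```dockerfile
# Install only the required products, and clean up the installer and logs
# in the same layer so they don't persist in the image.
RUN wget -q https://www.mathworks.com/mpm/glnxa64/mpm && chmod +x mpm \
    && ./mpm install --release=R2023b --destination=/opt/matlab \
        --products MATLAB Deep_Learning_Toolbox \
    && rm -f mpm /tmp/mathworks_root.log
```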

@vijayiyer05 commented Jul 25, 2024

Hi @kabilar, I've just had a good discussion w/ @aranega & others on the MetaCell team. They've been working to characterize & somewhat streamline the compute requirements for building images from the Dockerfiles. Where things stand now, we believe the basic MATLAB image (which has a nice, comprehensive set of tools) should fit on a standard GitHub Actions runner.

We're not fully sure where things stand with your system upgrades. Have you moved to GitHub Actions?

@asmacdo (Member) commented Jul 25, 2024

Our image builds are still done via Docker directly, but I'm +1 on image builds with GitHub Actions.

I think we should be fine to make that switch even before the rest of our infrastructure is automated with Actions.
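
For when that switch happens: a common workaround for the standard runners' disk limits is to free space before the build. A minimal sketch, assuming the usual preinstalled toolchains on an ubuntu-latest runner:

```sh
# Reclaim tens of GB on a standard GitHub-hosted runner by removing
# preinstalled toolchains the image build doesn't need.
sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc
docker system prune -af
```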

@kabilar (Member) commented Jul 25, 2024

Hi @vijayiyer05, thanks for reaching out. That's great to hear.

On our end, yesterday Austin finished the work we have been doing for the past few months to swap out the "engine" of our JupyterHub deployment, so we will now be able to add features more easily and keep in sync with upstream JupyterHub development. Next, we will work on reducing costs where we know resources are being wasted, and on setting up tooling to monitor usage and costs. As Austin mentioned, we can prioritize setting up the image builds with GH Actions after that. Thanks.

@vijayiyer05 commented Aug 13, 2024

Thank you @asmacdo & @kabilar for these insights and (exciting!) updates.

I'm not fully clear on what the 'early adoption' of GitHub Actions looks like on the downstream end. I'm curious whether it's all clear to @aranega, or if we should clarify further?

@aranega (Contributor, Author) commented Aug 19, 2024

Thanks a lot @asmacdo and @kabilar for the update 🙂.
Sorry for the late answer @vijayiyer05; the week coming back from vacation was quite overwhelming.

Correct me if I'm wrong, but the early adoption of GitHub Actions should be transparent from a user's point of view, and even from the image maintainers' point of view. Hopefully, with the machines provided by GitHub having more disk space, the default base MATLAB image will build fine, and the GPU MATLAB image will build with a little bit of shrinking.

@kabilar (Member) commented Aug 19, 2024

Hi @vijayiyer05 @aranega, yes, once we put together the GH Action for image building, you will be able to see the image build runs in the Actions tab. We will let you know once this is set up. Please let me know if I misunderstood the discussion. Thanks.

@vijayiyer05 commented
Thanks @aranega & @kabilar for your helpful clarifications! I'll follow up further offline with Vincent to map this information to their ongoing project.

@vijayiyer05 commented
Looping in @ramnarak (a MathWorks colleague) just so they're aware of this PR. They'll be at the INCF assembly & will (we hope!) speak to MATLAB on DandiHub there in Austin. Between now & then, they can help with testing from the MathWorks end.
