Add support for conda lock file #642

Merged
merged 59 commits into from
Oct 7, 2024

Conversation

munishchouhan
Member

@munishchouhan munishchouhan commented Sep 12, 2024

depends upon seqeralabs/libseqera#25
the above PR needs 'git revert d088604' before merging

This PR will add the following

  1. upload lockfile to bucket
  2. download lockfile from bucket
  3. link in build page

Signed-off-by: munishchouhan <[email protected]>
Signed-off-by: munishchouhan <[email protected]>
@munishchouhan munishchouhan linked an issue Sep 12, 2024 that may be closed by this pull request
@munishchouhan munishchouhan self-assigned this Sep 12, 2024
@munishchouhan munishchouhan marked this pull request as draft September 12, 2024 14:46
@pditommaso
Collaborator

Good start. To tell the truth, I'm still not sure whether we should go ahead with this approach or just store the lock file in SurrealDB, as we are doing for the conda env, even though that could run into the same problem as #559

@munishchouhan
Member Author

Good start. To tell the truth, I'm still not sure whether we should go ahead with this approach or just store the lock file in SurrealDB, as we are doing for the conda env, even though that could run into the same problem as #559

#559 will be solved in surrealdb version 2.0.0

@pditommaso
Collaborator

I'm a bit confused by this comment in the issue:

SURREAL_HTTP_MAX_ML_BODY_SIZE (defaults to 4 GiB)
SURREAL_HTTP_MAX_SQL_BODY_SIZE (defaults to 1 MiB)
SURREAL_HTTP_MAX_RPC_BODY_SIZE (defaults to 4 MiB)
SURREAL_HTTP_MAX_KEY_BODY_SIZE (defaults to 16 KiB)
SURREAL_HTTP_MAX_SIGNUP_BODY_SIZE (defaults to 1 KiB)
SURREAL_HTTP_MAX_SIGNIN_BODY_SIZE (defaults to 1 KiB)
SURREAL_HTTP_MAX_IMPORT_BODY_SIZE (defaults to 4 GiB)

It seems to suggest the default SQL body size is 1 MiB, but the limit we are hitting is much smaller.

@munishchouhan
Member Author

It seems to suggest the default SQL body size is 1 MiB, but the limit we are hitting is much smaller.

We are using the /key routes, which are constrained to 16 KiB:
SURREAL_HTTP_MAX_KEY_BODY_SIZE (defaults to 16 KiB)

@munishchouhan
Member Author

If we use SQL to store it, then we can bypass this limit.
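
To make the size argument concrete, here is a minimal sketch that checks a payload against the two default limits quoted above. The `fits` helper, the route labels, and the 40 KiB example size are hypothetical. Nothing here talks to SurrealDB.

```python
# Default request body limits quoted earlier in this thread, in bytes.
KIB = 1024
MIB = 1024 * KIB
DEFAULT_LIMITS = {
    "key": 16 * KIB,  # SURREAL_HTTP_MAX_KEY_BODY_SIZE
    "sql": 1 * MIB,   # SURREAL_HTTP_MAX_SQL_BODY_SIZE
}


def fits(route: str, payload: bytes) -> bool:
    """Return True if the payload is within the default limit for the route."""
    return len(payload) <= DEFAULT_LIMITS[route]


# Hypothetical 40 KiB lock file: too large for /key, fine for /sql.
lock_file = b"x" * (40 * KIB)
print(fits("key", lock_file))  # False
print(fits("sql", lock_file))  # True
```

With these defaults, going through SQL raises the ceiling from 16 KiB to 1 MiB rather than removing it.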

@pditommaso
Collaborator

This sounds like a plan. please give it a try

@munishchouhan
Member Author

This sounds like a plan. please give it a try

ok sure

@munishchouhan
Member Author

munishchouhan commented Sep 13, 2024

There is an issue accessing the conda lock file: it is present in the generated image, not in the BuildKit container we are running in Wave.
So we either need to pull the image in another pod and extract the file, or generate the conda lock file from the conda.yml file.

cc @ewels @pditommaso

@munishchouhan
Member Author

In the latter case, generating the conda lock file from the conda file, we would still need another job to do it.

@ewels
Member

ewels commented Sep 13, 2024

Better to get the file from the container - I was trying to avoid generating the lock file separately because then there's no absolute guarantee that it'll end up the same as the actual environment. If it comes from the environment itself it's certain.

@ewels
Member

ewels commented Sep 13, 2024

Can we print the lock file to stdout and then capture that from the build?

@munishchouhan
Member Author

munishchouhan commented Sep 13, 2024

Can we print the lock file to stdout and then capture that from the build?

we can do this:

FROM {{base_image}}
COPY --chown=$MAMBA_USER:$MAMBA_USER conda.yml /tmp/conda.yml
RUN micromamba install -y -n base -f /tmp/conda.yml \
    {{base_packages}}
    && micromamba env export --explicit > environment.lock \
    && cat environment.lock \
    && micromamba clean -a -y
USER root
ENV PATH="$MAMBA_ROOT_PREFIX/bin:$PATH"

I ran it for the conda package 'bwa' and got this on stdout:

#10 12.24 # This file may be used to create an environment using:
#10 12.24 # $ conda create --name <env> --file <this file>
#10 12.24 # platform: linux-aarch64
#10 12.24 @EXPLICIT

Signed-off-by: munishchouhan <[email protected]>
@ewels
Member

ewels commented Sep 15, 2024

Exactly - that works!

There were a load of lines after the @EXPLICIT, yeah?

We'd need to remove the line prefixes but that's all I think..
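
A minimal sketch of that prefix removal, assuming each captured line starts with a step number and a timestamp in the form shown in the sample output above (`#10 12.24 `). The `strip_prefixes` name and the regex are illustrative and are not taken from the Wave code base.

```python
import re

# Matches a BuildKit progress prefix such as "#10 12.24 " at the start of a line.
PREFIX = re.compile(r"^#\d+ \d+\.\d+ ")


def strip_prefixes(log: str) -> str:
    """Remove the step/timestamp prefix from each prefixed line; skip other lines."""
    lines = []
    for line in log.splitlines():
        match = PREFIX.match(line)
        if match:
            lines.append(line[match.end():])
    return "\n".join(lines)


sample = """\
#10 12.24 # This file may be used to create an environment using:
#10 12.24 # $ conda create --name <env> --file <this file>
#10 12.24 # platform: linux-aarch64
#10 12.24 @EXPLICIT"""

print(strip_prefixes(sample))
```

On its own this keeps every line printed by the same build step, so the lock file would still need to be delimited in some way to separate it from the other install output.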

@pditommaso
Collaborator

A better approach (maybe) could be: 1) creating the container "locally"; 2) copy the lock file from the built container via buildkit; 3) uploading it to the registry.

Something similar is done for singularity, here.

@munishchouhan
Member Author

A better approach (maybe) could be: 1) creating the container "locally"; 2) copy the lock file from the built container via buildkit; 3) uploading it to the registry.

Something similar is done for singularity, here.

Ok sure, I will try this one

final query = """\
INSERT into wave_conda_lock {
buildId: '$buildId',
condaLock = '$condaLock'
Collaborator

what if the lock file contains a ' ?

Member Author

Good point.
I have not tested it with SurrealDB yet.
I will use the bytes datatype to save it.

Member Author

changed to byte[]
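
The quoting concern raised above can be illustrated with a small sketch. The PR itself switched the field to `byte[]`. The Base64 encoding below is a swapped-in stand-in to show why a binary-safe representation avoids the problem. The function names, the query text, and the `bd-123` build ID are hypothetical, and the strings are never sent to a database.

```python
import base64


def naive_query(build_id: str, conda_lock: str) -> str:
    # Hypothetical string-interpolated query, mirroring the pattern under review.
    return f"INSERT INTO wave_conda_lock {{ buildId: '{build_id}', condaLock: '{conda_lock}' }}"


def encoded_query(build_id: str, conda_lock: str) -> str:
    # Base64 output only uses A-Z, a-z, 0-9, '+', '/' and '=', so it never contains a quote.
    payload = base64.b64encode(conda_lock.encode("utf-8")).decode("ascii")
    return f"INSERT INTO wave_conda_lock {{ buildId: '{build_id}', condaLock: '{payload}' }}"


lock = "# it's a comment\n@EXPLICIT"
print(naive_query("bd-123", lock).count("'"))    # 5: odd, so the quoting is unbalanced
print(encoded_query("bd-123", lock).count("'"))  # 4: balanced
```

The sketch only covers the lock content. The build ID is still interpolated unescaped.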

@pditommaso
Collaborator

Added [email protected] and shortened the lock markers. Tried a build but cannot see any lock file. @munishchouhan what am i missing?

@munishchouhan
Member Author

Added [email protected] and shortened the lock markers. Tried a build but cannot see any lock file. @munishchouhan what am i missing?

maybe wave.build.conda-lock-prefix

@pditommaso
Collaborator

maybe wave.build.conda-lock-prefix

That was missing indeed, now I see the upload but it's not rendering in the view. checking

@pditommaso
Collaborator

ok works

pditommaso and others added 2 commits October 4, 2024 10:59
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: munishchouhan <[email protected]>
@munishchouhan
Member Author

munishchouhan commented Oct 4, 2024

@pditommaso is this integration test enough, or were you referring to some other test?

def 'should extract conda lockfile from s3' (){

@pditommaso
Collaborator

Ideally it would be nice to have a test passing through the controller, building a conda container and checking that the lock file exists. The S3 upload is not so important and could be mocked. What matters more is that the lock file is grabbed.

@munishchouhan
Member Author

munishchouhan commented Oct 4, 2024

Ideally it would be nice to have a test passing through the controller, building a conda container and checking that the lock file exists. The S3 upload is not so important and could be mocked. What matters more is that the lock file is grabbed.

I gave it a try and the full flow succeeds. However, when the test runs a build and the image is uploaded to the registry, the next run of the same test finds that image and returns the cached result, and the cached result does not include the conda lock file.

I will try to mock the build and send a mocked event to test this.

Signed-off-by: munishchouhan <[email protected]>
Signed-off-by: munishchouhan <[email protected]>
@munishchouhan
Member Author

I overcame the caching issue by providing an invalid repo, so the image push fails and every test run creates a fresh build.

Signed-off-by: munishchouhan <[email protected]>
Signed-off-by: munishchouhan <[email protected]>
@munishchouhan
Member Author

This e2e test takes a long time to complete. What we can do is add another module for e2e tests, which runs on demand and when merging a PR to master.

@pditommaso
Collaborator

Ok, let's keep the e2e in a separate PR and merge this one

@munishchouhan
Member Author

Ok, let's keep the e2e in a separate PR and merge this one

yes, working on that,
Ok I will remove the e2e from this PR

@pditommaso
Collaborator

this still requires the update in libseqera right?

@pditommaso
Collaborator

no, already merged 👍

Signed-off-by: munishchouhan <[email protected]>
@munishchouhan
Member Author

Tested on local:
Screenshot 2024-10-07 at 15 17 29

@pditommaso pditommaso merged commit 497185e into master Oct 7, 2024
4 checks passed
@pditommaso pditommaso deleted the 172-add-support-for-conda-lock-file branch October 7, 2024 13:21
Development

Successfully merging this pull request may close these issues.

Add support for Conda lock file
4 participants