feat: Export conda explicit specification file from project #1873

Merged: 8 commits merged into prefix-dev:main on Sep 10, 2024

Conversation

synapticarbors
Contributor

@synapticarbors synapticarbors commented Aug 21, 2024

This builds on the basic export framework designed in #1427, but focuses on exporting a Conda Explicit Specification. It's based on the prototype that I wrote in pixi2ces.

The idea is to extract the package URLs from the lock file, and those can be used to create a conda env. The output file can be used directly:

mamba create -n test --file <spec file>

This offers a bridge from pixi to conda so that environments and locking can be done by pixi, but then the solved environments can be used directly in conda. I know there have been discussions about pixi as a replacement for conda-lock and this is a personal interest since there are still situations where tooling is built around conda/mamba but it's easier to manage environment specifications with pixi and then use the output to deploy in conda.

This also includes an option to dump a side requirements.txt file for pypi-dependencies. You could then (1) create the conda env, and then (2) use pip or uv to install pypi reqs using this additional file. What can go into a requirements.txt file is a lot more varied, so I'm not 100% sure that I've covered all of the representations that pixi handles.

Feedback is appreciated since I'm not super experienced in rust and this is my first contribution to pixi. I'm also not sure about testing, since I don't know if there is a way to round-trip the output. There are round-trip tests of the ExplicitEnvironmentSpec output to string functionality that I wrote in rattler https://github.com/conda/rattler/blob/main/crates/rattler_conda_types/src/explicit_environment_spec.rs#L275-L324

Also thank you in advance to @abkfenris for providing the interface design for the export command.

Fixes #1216


/// The platform to render. Defaults to the current platform.
#[arg(long)]
pub platform: Option<Platform>,
Contributor

A high-value property of conda-lock is being able to get all the locks for a single env. For pixi, I'd probably also want multiple environments, by default all. This explosion is handled with the --filename-template, where there are a few known tokens available. Also constructor really prefers .txt as an extension, but who knows.

Putting these together:

[tasks.generate-flat-locks]
cmd = """
pixi project export conda_explicit_spec \
  --environment "*" \
  --filename-template "locks/{environment}_{platform}.{suffix}" \
  --conda-suffix "conda.lock.txt" \
  --pip-suffix "pip-requirements.txt"
"""
inputs = ["pixi.lock"]
outputs = ["locks"]

Might yield:

build_linux-64.conda.lock.txt
build_linux-64.pip-requirements.txt
build_win-64.conda.lock.txt
build_win-64.pip-requirements.txt
test_linux-64.conda.lock.txt
test_linux-64.pip-requirements.txt
test_win-64.conda.lock.txt
test_win-64.pip-requirements.txt

Or:

cmd = """
pixi project export conda_explicit_spec \
  --environment "*" \
  --filename-template "locks/{environment}/{platform}/{suffix}" \
  --conda-suffix "conda.lock.txt" \
  --pip-suffix "pip-requirements.txt"
"""

Would generate something more like:

build/
  linux-64/
    conda.lock.txt
    pip-requirements.txt
  win-64/
    conda.lock.txt
    pip-requirements.txt
test/
  linux-64/
    conda.lock.txt
    pip-requirements.txt
  win-64/
    conda.lock.txt
    pip-requirements.txt

Contributor Author

I'm happy to add a way to dump all env/platform combinations if that's what the pixi team thinks is the way to go. I'm not sure about --filename-template. I think that's non-trivial to do in rust, or at least I'm not sure how best to implement that. Indicatif does something similar with progress bar templating, so that could give some ideas. Would it be sufficient to have an option to dump all, but the file naming convention is not flexible? Currently it's just conda-{platform}-{env}.lock and requirements-{platform}-{env}.txt.
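
For reference, token substitution of the kind proposed with --filename-template can be sketched with the standard library alone. This is only an illustration and not what the PR implements: the function name `render_template` and the error strings are hypothetical, and the token names are the ones from the proposal above.

```rust
use std::collections::HashMap;

/// Expand `{token}` placeholders in `template` using `values`.
/// Unknown tokens and unclosed braces are reported as errors instead of
/// being left in the output. (Hypothetical helper, not part of pixi.)
fn render_template(template: &str, values: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let close = after
            .find('}')
            .ok_or_else(|| format!("unclosed '{{' in template: {template}"))?;
        let token = &after[..close];
        let value = values
            .get(token)
            .ok_or_else(|| format!("unknown token: {token}"))?;
        out.push_str(value);
        rest = &after[close + 1..];
    }
    // Copy whatever literal text follows the last placeholder.
    out.push_str(rest);
    Ok(out)
}
```

With `environment`, `platform` and `suffix` set to `build`, `linux-64` and `conda.lock.txt`, the template `locks/{environment}_{platform}.{suffix}` expands to `locks/build_linux-64.conda.lock.txt`. A fixed naming scheme avoids the error handling entirely, which is part of the case for not making the file names configurable.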

Contributor

I love that. Would love to see that either in this PR or a followup!

Contributor

Sure, the minimal thing for e.g. miniforge-style repos using constructor would be:

  • mostly-fixed (but sufficiently verbose) file names
    • a .txt ending for the conda-lock so that constructor knows what to do with it without invoking the solver again
  • ideally an --output-folder (which it would ensure exists)

I would still want to check these in, as they are pretty much optimal inputs to inputs/depends-on and, by extension, CI caches, while pixi.lock is generally too broad (e.g. don't rebuild the whole product just because you changed the test procedure).

Contributor Author

@bollwyvl Ok, so from what you're saying as well as the suggestions in #1873 (comment), we want to create an output directory that will contain the exported files. The linked comment just has it as a positional argument, while you're suggesting a flag. I'm not sure which would be preferred. Other questions/comments:

  • Do we need to handle warnings or errors for overwriting existing outputs, or just let it overwrite?
  • Makes sense to use the .txt suffix since that's in the CEP standard. I was just basing it on the naming convention from the explicit render in conda-lock which used .lock

Contributor

positional argument, while you're suggesting a flag

Positional, but defaults to . seems reasonable as...

warnings or errors for overwriting existing outputs

...while I haven't kicked the tires, I imagine the runtime of this is probably so small it won't matter.

naming convention from the explicit render in conda-lock

conda-lock lets a user emit whatever they want, and doesn't generally talk about a specific named environment, so always needed something in anything but the most trivial case. Or, put differently, it never bothered me enough to PR to change it.

@ruben-arts
Contributor

Hi @synapticarbors,
@Hofer-Julian and I had a conversation to propose a cli format. Here is our conclusion. Hope it helps!

The pixi project export

The simplest command

environment.yml

pixi project export --format conda-environment output

Results in:

.
├── output
│   ├── environment.yml # default env
│   └── <name_of_environment>-environment.yml # non default envs are prefixed with name
└── pixi.toml

conda-lock

pixi project export --format conda-lock output

Results in:

.
├── output
│   ├── conda-lock.yml # default env
│   └── <name_of_environment>-conda-lock.yml # non default envs are prefixed with name
└── pixi.toml

Specify which environment to output

pixi project export --environment test --environment lint --format conda-environment output

Results in:

.
├── output
│   ├── test-environment.yml
│   └── lint-environment.yml
└── pixi.toml

Differences between formats

  • conda-lock: full "explicit" lockfile
  • conda-environment: manifest like, specs defined as in the manifest

Future notes

  • We are considering the requirements.txt export, with pip-requirements as the --format
  • --platform could be implemented but it doesn't have to be in the first version
  • --filename-template is something we don't want to get started with as good defaults make it simpler for the users and lower maintenance load.


/// This flag allows creating the spec file even if PyPI dependencies are present.
/// Alternatively see --write-pypi-requirements
#[arg(long, default_value = "false")]
ignore_pypi_errors: bool,
Contributor

Couldn't we just rename this to --skip-pypi?

Contributor Author

I just want to clarify something, since the comment

--platform could be implemented but it doesn't have to be in the first version

leads me to believe that there might be some confusion. My understanding is that there are 3 conda formats in the wild currently:

  • environment.yml This mirrors the manifest in terms of how package versions can be specified and can mix conda and pypi packages. It is platform independent, but supports selectors (at least via conda-lock). This can be used by conda/mamba, but requires a full solve.
  • conda-lock Produced by conda-lock. It contains all platform and env information. You can create environments using conda-lock or micromamba, but not with conda itself.
  • explicit specification file This is bound to a single environment/platform combination since it basically just contains raw urls. It is installable directly by conda/mamba and cannot contain pypi packages.

What this PR contains is an implementation of the explicit specification file, so it requires both an environment and platform provided by the user, unless we want to dump a separate file for every combination found in the project. Additionally, since it can't contain pypi packages, a side file must be included or those packages have to be ignored. Making it its own format would require users to make similar choices about what they want in the requirements.txt since it can contain loose version bounds or explicit urls (as I'm doing here), and then they would have to run two different export commands.
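
To make the third format concrete, here is a minimal sketch of rendering an explicit specification for one environment/platform pair. It is not the PR's code (which goes through rattler's ExplicitEnvironmentSpec): the `LockedPackage` struct and `render_explicit_spec` function are hypothetical, and the header lines simply mirror the style of output this command produces.

```rust
/// One locked conda package: its download URL and, if known, its MD5 digest.
/// (Hypothetical type for illustration; not the rattler or pixi data model.)
struct LockedPackage {
    url: String,
    md5: Option<String>,
}

/// Render the packages of a single (environment, platform) pair as a conda
/// explicit specification: comment header, `@EXPLICIT` marker, one URL per line.
fn render_explicit_spec(platform: &str, packages: &[LockedPackage]) -> String {
    let mut out = String::new();
    out.push_str("# Generated by `pixi project export`\n");
    out.push_str(&format!("# platform: {platform}\n"));
    out.push_str("@EXPLICIT\n");
    for pkg in packages {
        match &pkg.md5 {
            // The digest is appended to the URL after a `#`.
            Some(md5) => out.push_str(&format!("{}#{}\n", pkg.url, md5)),
            None => out.push_str(&format!("{}\n", pkg.url)),
        }
    }
    out
}
```

Because each line is a raw package URL, the file is tied to exactly one platform and has no place for PyPI packages, which is why those need a side file or have to be skipped.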

I thought to make each format its own subcommand since they will have potentially independent flags, rather than a --format or --kind (like conda-lock uses to switch between formats).

@baszalmstra -- I could change it to --skip-pypi, but I originally based my prototype off of pixi-pack and --ignore-pypi-errors was the flag it was using.

Contributor

Right: ideally at least the docs/interactive --help would point at a spec, e.g

@synapticarbors
Contributor Author

I refactored the code with the following changes:

  • no arguments dumps all files for every (env, platform) pair to a named output directory
  • using the -e/--environment and -p/--platform flags (which can be repeated for multiple envs and platforms), you can dump a subset of (env, platform) combinations.

So for example using the polarify example:

$ pixi project export ces -p osx-64 -p linux-64 -e default output
$ eza --tree output
output
├── default-linux-64-conda_spec.txt
└── default-osx-64-conda_spec.txt

or

$ pixi project export ces -p linux-64 output
$ eza --tree output
output
├── default-linux-64-conda_spec.txt
├── lint-linux-64-conda_spec.txt
├── pl017-linux-64-conda_spec.txt
├── pl018-linux-64-conda_spec.txt
├── pl019-linux-64-conda_spec.txt
├── pl020-linux-64-conda_spec.txt
├── py39-linux-64-conda_spec.txt
├── py310-linux-64-conda_spec.txt
├── py311-linux-64-conda_spec.txt
└── py312-linux-64-conda_spec.txt
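
The selection behaviour in the two invocations above (an empty filter means "everything") can be sketched as follows. This is a simplified illustration, not the PR's code: `select_pairs` and `spec_file_name` are hypothetical names, the project is modelled as a plain list of environments with their platforms, and the `-` separator is the one shown in these listings.

```rust
/// Return every (environment, platform) pair to export.
/// An empty filter places no restriction on that axis.
fn select_pairs(
    project: &[(&str, Vec<&str>)], // each environment with the platforms it supports
    env_filter: &[&str],
    platform_filter: &[&str],
) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    for (env, platforms) in project {
        if !env_filter.is_empty() && !env_filter.contains(env) {
            continue;
        }
        for platform in platforms {
            if !platform_filter.is_empty() && !platform_filter.contains(platform) {
                continue;
            }
            pairs.push((env.to_string(), platform.to_string()));
        }
    }
    pairs
}

/// Output file name for one pair, using the `-` separator from the listings.
fn spec_file_name(env: &str, platform: &str) -> String {
    format!("{env}-{platform}-conda_spec.txt")
}
```

Passing `["default"]` and `["osx-64", "linux-64"]` as filters yields only the default environment's two files, while passing an empty environment filter with `["linux-64"]` yields one file per environment.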

For the pypi-source-deps example:

$ pixi project export ces --write-pypi-dependencies output
$ eza --tree output
output
├── default-linux-64-conda_spec.txt
├── default-linux-64-requirements.txt
├── default-osx-64-conda_spec.txt
├── default-osx-64-requirements.txt
├── default-osx-arm64-conda_spec.txt
├── default-osx-arm64-requirements.txt
├── default-win-64-conda_spec.txt
└── default-win-64-requirements.txt
$ cat output/default-linux-64-conda_spec.txt
# Generated by `pixi project export`
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d
https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hd590300_5.conda#69b8b6202a07720f448be700e300ccf4
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2024.2.2-hbcca054_0.conda#2f4327a1cbe7f022401b236e915a5fef
https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.40-hf3520f5_1.conda#33b7851c39c25da14f6a233a8ccbeeca
https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.6.2-h59595ed_0.conda#e7ba12deb7020dd080c6c70e7b6f6a3d
https://conda.anaconda.org/conda-forge/linux-64/libffi-3.4.2-h7f98852_5.tar.bz2#d645c6d2ac96843a2bfaccd2d62b3ac3
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-13.2.0-h77fa898_7.conda#72ec1b1b04c4d15d4204ece1ecea5978
https://conda.anaconda.org/conda-forge/linux-64/libgomp-13.2.0-h77fa898_7.conda#abf3fec87c2563697defa759dec3d639
https://conda.anaconda.org/conda-forge/linux-64/libnsl-2.0.1-hd590300_0.conda#30fd6e37fe21f86f4bd26d6ee73eeec7
https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.45.3-h2797004_0.conda#b3316cbe90249da4f8e84cd66e1cc55b
https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.38.1-h0b41bf4_0.conda#40b61aab5c7ba9ff276c41cfffe6b80b
https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda#5aa797f8787fe7a17d1b0821485b5adc
https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.2.13-hd590300_5.conda#f36c115f1ee199da648e0597ec2047ad
https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h59595ed_0.conda#fcea371545eda051b6deafb24889fc69
https://conda.anaconda.org/conda-forge/linux-64/openssl-3.3.0-h4ab18f5_3.conda#12ea6d0d4ed54530eaed18e4835c1f7c
https://conda.anaconda.org/conda-forge/linux-64/python-3.12.3-hab00c5b_0_cpython.conda#2540b74d304f71d3e89c81209db4db84
https://conda.anaconda.org/conda-forge/linux-64/readline-8.2-h8228510_1.conda#47d31b792659ce70f470b5c82fdfb7a4
https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_h4845f30_101.conda#d453b98d9c83e71da0741bb0ff4d76bc
https://conda.anaconda.org/conda-forge/noarch/tzdata-2024a-h0c530f3_0.conda#161081fc7cec0bfda0d86d7cb595f8d8
https://conda.anaconda.org/conda-forge/linux-64/xz-5.2.6-h166bdaf_0.tar.bz2#2161070d867d1b1204ea749c8eec4ef0

and

$ cat output/default-linux-64-requirements.txt
https://files.pythonhosted.org/packages/bb/2a/10164ed1f31196a2f7f3799368a821765c62851ead0e630ab52b8e14b4d0/blinker-1.8.2-py3-none-any.whl --hash=sha256:1779309f71bf239144b9399d06ae925637cf6634cf6bd131104184531bf67c01
https://files.pythonhosted.org/packages/ba/06/a07f096c664aeb9f01624f858c3add0a4e913d6c96257acb4fce61e7de14/certifi-2024.2.2-py3-none-any.whl --hash=sha256:dc383c07b76109f368f6106eee2b593b04a011ea4d55f652c6ca24a754d1cdd1
https://files.pythonhosted.org/packages/28/76/e6222113b83e3622caa4bb41032d0b1bf785250607392e1b778aca0b8a7d/charset_normalizer-3.3.2-py3-none-any.whl --hash=sha256:3e4d1f6587322d2788836a99c69062fbb091331ec940e02d12d179c1d53e25fc
https://github.com/pallets/click/releases/download/8.1.7/click-8.1.7-py3-none-any.whl
git+https://github.com/pallets/flask@f93dd6e826a9bf00bf9e08d9bb3a03abcb1e974c
https://files.pythonhosted.org/packages/e5/3e/741d8c82801c347547f8a2a06aa57dbb1992be9e948df2ea0eda2c8b79e8/idna-3.7-py3-none-any.whl --hash=sha256:82fee1fc78add43492d3a1898bfa6d8a904cc97d8427f683ed8e798d07761aa0
https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl --hash=sha256:b6a85871a79d2e3b22d2d1b94ac2824226a63c6b741c88f7ae975f18b6778374
https://files.pythonhosted.org/packages/04/96/92447566d16df59b2a776c0fb82dbc4d9e07cd95062562af01e408583fc4/itsdangerous-2.2.0-py3-none-any.whl --hash=sha256:c6242fc49e35958c8b15141343aa660db5fc54d4f13a1db01a3f5891b98700ef
https://files.pythonhosted.org/packages/31/80/3a54838c3fb461f6fec263ebf3a3a41771bd05190238de3486aae8540c36/jinja2-3.1.4-py3-none-any.whl --hash=sha256:bc5dd2abb727a5319567b7a813e6a2e7318c39f4f487cfe6c89c6f9c7d25197d
https://files.pythonhosted.org/packages/42/d7/1ec15b46af6af88f19b8e5ffea08fa375d433c998b8a7639e76935c14f1f/markdown_it_py-3.0.0-py3-none-any.whl --hash=sha256:355216845c60bd96232cd8d8c40e8f9765cc86f46880e43a8fd22dc1a1a8cab1
https://files.pythonhosted.org/packages/87/5b/aae44c6655f3801e81aa3eef09dbbf012431987ba564d7231722f68df02d/MarkupSafe-2.1.5.tar.gz --hash=sha256:d283d37a890ba4c1ae73ffadf8046435c76e7bc2247bbb63c00bd1a709c6544b
https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl --hash=sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8
-e ./minimal-project
https://files.pythonhosted.org/packages/49/df/1fceb2f8900f8639e278b056416d49134fb8d84c5942ffaa01ad34782422/packaging-24.0-py3-none-any.whl --hash=sha256:2ddfb553fdf02fb784c234c7ba6ccc288296ceabec964ad2eae3777778130bc5
https://files.pythonhosted.org/packages/88/5f/e351af9a41f866ac3f1fac4ca0613908d9a41741cfcf2228f4ad853b697d/pluggy-1.5.0-py3-none-any.whl --hash=sha256:44e1ad92c8ca002de6377e165f3e0f1be63266ab4d554740532335b9d75ea669
https://files.pythonhosted.org/packages/f7/3f/01c8b82017c199075f8f788d0d906b9ffbbc5a47dc9918a945e13d5a2bda/pygments-2.18.0-py3-none-any.whl --hash=sha256:b8e6aca0523f3ab76fee51799c488e38782ac06eafcf95e7ba832985c8e7b13a
git+https://github.com/pytest-dev/pytest.git@51845fc70dba0fba27387e21e2db39d583892dec
git+https://github.com/psf/requests.git@0106aced5faa299e6ede89d1230bd6784f2c3660
https://files.pythonhosted.org/packages/87/67/a37f6214d0e9fe57f6ae54b2956d550ca8365857f42a1ce0392bb21d9410/rich-13.7.1-py3-none-any.whl --hash=sha256:4edbae314f59eb482f54e9e30bf00d33350aaa94f4bfcd4e9e3110e64d0d7222
https://files.pythonhosted.org/packages/a2/73/a68704750a7679d0b6d3ad7aa8d4da8e14e151ae82e6fee774e6e0d05ec8/urllib3-2.2.1-py3-none-any.whl --hash=sha256:450b20ec296a467077128bff42b73080516e71b56ff59a60a02bef2232c4fa9d
https://files.pythonhosted.org/packages/9d/6e/e792999e816d19d7fcbfa94c730936750036d65656a76a5a688b57a656c4/werkzeug-3.0.3-py3-none-any.whl --hash=sha256:fc9645dc43e03e4d630d23143a04a7f947a9a3b5727cd535fdfe155a17cc48c8

@bollwyvl
Contributor

bollwyvl commented Aug 25, 2024 via email

@synapticarbors
Contributor Author

Kind of pedantic, but: if not configurable, perhaps using _ as the separator would be less ambiguous, as that can't appear in the platform or env name.

Sure, if that's what the pixi devs want, I can change it after the next round of comments.

@synapticarbors
Contributor Author

I wanted to add a note about the generated requirements files that I noticed during testing. pip does not like it when you have a mix of urls with and without hashes and will not install from the requirements.txt since the presence of a hash flips on --require-hashes. The solution mentioned in some pip issues is to split the requirements into two files where in one, all listed packages have hashes and the other has no hashes. uv handles this situation from a single file though.
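
The two-file workaround mentioned here could look roughly like this. It is a sketch under the assumption that a requirement carries a hash exactly when its line contains `--hash=`; the function name `split_by_hash` is hypothetical and this is not part of the PR.

```rust
/// Split requirement lines into (hashed, unhashed) groups so that each
/// resulting file is either fully hashed or contains no hashes at all.
/// Blank lines and `#` comments are dropped.
fn split_by_hash(requirements: &str) -> (Vec<String>, Vec<String>) {
    let mut hashed = Vec::new();
    let mut unhashed = Vec::new();
    for line in requirements.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if line.contains("--hash=") {
            hashed.push(line.to_string());
        } else {
            unhashed.push(line.to_string());
        }
    }
    (hashed, unhashed)
}
```

Each group would then be written to its own requirements file and installed with a separate pip invocation; with uv the single mixed file works as-is.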

@synapticarbors
Contributor Author

@ruben-arts I just wanted to reach out and get your feedback on the changes I've made and what steps still need to be completed to get this accepted and merged.

@ruben-arts
Contributor

Hey @synapticarbors Sorry for the late reaction.

I would like the pypi-dependencies to be a separate command from the conda explicit spec. And a small amount of feedback written to the terminal when the command succeeds would be nice.

If the documentation is then updated this looks good to me!

@synapticarbors
Contributor Author

@ruben-arts Thanks. I'll remove the pypi functionality from here shortly and update the documentation. Are there any tests that I should add?

@ruben-arts
Contributor

@synapticarbors, great! If you could add some regression tests to see if the output is stable for the export command that would be great.

@synapticarbors
Contributor Author

@ruben-arts Is there a recommended way to build a lockfile for the purpose of testing? I was looking at the builder pattern, but there are a lot of details that would need to be filled in manually. There's also some lock files in rattler/test-data, but I'm not sure about how best to access them from a test in pixi and if that's even desirable.

Or is there another strategy you'd recommend for testing?

@ruben-arts
Contributor

For this test, I would be fine with generating the ces file from a prebuilt multi-platform, multi-env lockfile with PyPI dependencies, and comparing the results with a known version to verify we're not breaking it along the way. We usually use insta snapshots for these tests.

@synapticarbors
Contributor Author

In the latest changes I've:

  • removed all functionality related to pypi dependencies/requirements.txt
  • changed the delimiter in the output files to _ (e.g. default_linux-64_conda_spec.txt).
  • Added a test using insta that snapshots a multi-env/multi-platform project that's included in the same directory (I wasn't sure if I should put it in the top-level tests).

Let me know if there are any other changes you'd like to see. I'd also appreciate any feedback on code quality since I still consider myself a bit of a rust novice.

Contributor

@baszalmstra baszalmstra left a comment

I have two nitpicks, but other than these this looks very good to me!

src/cli/project/export/conda_explicit_spec.rs (resolved)
src/cli/project/export/conda_explicit_spec.rs (outdated, resolved)
abkfenris added a commit to abkfenris/pixi that referenced this pull request Sep 7, 2024
Builds upon prefix-dev#1873 to add exporting of `environment.yml` files.

Since prefix-dev#1873 provides the conda-lock style matrix of exports, I focused on exporting an environment.yml that matches the manifest.

Tests against a bunch of the different example manifests for the various pypi-options, and I tested locally that at least some of those were installable with micromamba.

Replaces prefix-dev#1427
@baszalmstra baszalmstra merged commit 6188c69 into prefix-dev:main Sep 10, 2024
36 checks passed
@baszalmstra
Contributor

@synapticarbors Thanks for this great PR! My apologies for taking so long to get it merged!

baszalmstra pushed a commit that referenced this pull request Sep 13, 2024
This fixes a reference to a flag that was removed from `pixi project
export conda-explicit-spec` during the review of PR #1873.
@matthewfeickert
Contributor

@bollwyvl
Contributor

ruh-roh: I believe the .txt file (and the examples above, which I somehow overlooked) should be sorted topologically by import order, not alphabetically by package name... I could certainly see how this would confuse $CONDA_EXE, as per the draft spec a solver won't be invoked.

When an explicit input file is processed, the conda client SHOULD NOT invoke a solver. Because of this, the lines SHOULD be sorted topologically; e.g. if a package A depends on B, then the URL of B should come first.

@wolfv
Member

wolfv commented Sep 19, 2024

yep, I think that's correct! Thanks for dropping by @bollwyvl

@synapticarbors
Contributor Author

Maybe I've just lucked out that all of the envs I've tested this on have worked despite not being topologically sorted? It's currently just pulling from the ordering in the lock file and I didn't put in any logic to order them in a particular way. Is there a simple way of getting the topological sort order from the lockfile?

@synapticarbors
Contributor Author

Would the solution be to get an environment from lockfile.environment(env_name), then get the repo data records using conda_repodata_records_for_platform, and then from each RepoDataRecord, use the depends field of the PackageRecord to get the topological sort order?

@synapticarbors
Contributor Author

Also, I want to apologize for overlooking this. I'm a bit embarrassed that this slipped through and is out in the wild. I'm not sure what is the fastest path to getting this fixed. The implementation of toposort that conda-lock is using is in the vendored conda code: https://github.com/conda/conda-lock/blob/main/conda_lock/_vendor/conda/common/toposort.py

and then it's used in conda-lock here: https://github.com/conda/conda-lock/blob/main/conda_lock/lockfile/v2prelim/models.py#L104

I think I could re-implement the toposort from conda in rust and then build up the dependencies from the PackageRecord, but if there's a better strategy, please let me know.

@bollwyvl
Contributor

bollwyvl commented Sep 20, 2024 via email

@baszalmstra
Contributor

If you get the RepoDataRecords from the lockfile you can use this function to sort them topologically: https://docs.rs/rattler_conda_types/0.27.6/rattler_conda_types/struct.PackageRecord.html#method.sort_topologically

@maresb
Contributor

maresb commented Sep 20, 2024

The most complicated thing about "topologically sorting" is the name. Probably it's best to use the Rattler function, but rolling your own is easy, at least in Python. The algorithm is dead-simple:

Proceed in multiple rounds.

The first round consists of all packages with no dependencies. Toss these all into the result in no particular order. For good measure, sort alphabetically to get a deterministic order. (The result of topologically sorting isn't unique!)

The second round consists of all remaining packages whose dependencies are contained in the first round. Again, toss these in as a batch at the end of your result, preferably sorted alphabetically.

Repeat until there are no remaining packages. The number of rounds required is equal to the depth of the dependency DAG. (In case of a dependency cycle this algorithm will get stuck having rounds with outstanding packages but no new ones to add.)

Now your result is topologically ordered.

(Side note: as a topologist, I have no idea why the computer scientists decided to call this "topological".)
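
The round-based procedure described above translates to Rust fairly directly. The following is a sketch using only the standard library and plain package names; it is not the fix that went into pixi (for that, the rattler function linked earlier is the suggested route). The function name `sort_in_rounds` is hypothetical, and the choice to ignore dependencies that are not themselves packages in the map (for example virtual packages) is an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Sort package names so that every package appears after its dependencies.
/// `deps` maps each package to the names it depends on. Dependencies that are
/// not keys of `deps` are ignored. On a dependency cycle, returns `Err` with
/// the packages that could not be placed.
fn sort_in_rounds(deps: &BTreeMap<String, Vec<String>>) -> Result<Vec<String>, Vec<String>> {
    let mut placed: BTreeSet<&str> = BTreeSet::new();
    let mut remaining: BTreeSet<&str> = deps.keys().map(String::as_str).collect();
    let mut result = Vec::new();

    while !remaining.is_empty() {
        // One round: every remaining package whose dependencies were all
        // placed in earlier rounds. BTreeSet iteration is alphabetical,
        // which keeps the output deterministic.
        let ready: Vec<&str> = remaining
            .iter()
            .copied()
            .filter(|name| {
                deps[*name]
                    .iter()
                    .all(|d| placed.contains(d.as_str()) || !deps.contains_key(d))
            })
            .collect();

        // No progress means the remaining packages form a cycle.
        if ready.is_empty() {
            return Err(remaining.iter().map(|s| s.to_string()).collect());
        }
        for name in ready {
            remaining.remove(name);
            placed.insert(name);
            result.push(name.to_string());
        }
    }
    Ok(result)
}
```

For a map where python depends on openssl and libzlib, and openssl depends on ca-certificates, the rounds are {ca-certificates, libzlib}, then {openssl}, then {python}. Real `depends` entries are match specs such as a name followed by a version constraint, so the package name would have to be extracted before building the map.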

@synapticarbors
Contributor Author

@baszalmstra -- thanks. I have a fix that I just need to do a bit more testing on and will submit a PR in a couple of hours hopefully. The weird thing is that the breaking pixi project that @matthewfeickert supplied works fine on osx-64. Could there be a difference in how mamba works on linux-64 vs osx-64? If so, that would explain why I didn't catch it with many of the local tests I did.

@fecet
Contributor

fecet commented Sep 24, 2024

Has anyone compared performance with conda-lock? I found this significantly slower than conda-lock (~8 min vs. ~2 min).

@synapticarbors
Contributor Author

@fecet -- what exactly are you comparing? For a locked pixi env, the export should be almost instantaneous since there is no solve involved; it's just extracting metadata out of pixi.lock and reformatting it into a text file. If you provide a reproducer, I'm happy to take a look.

@bollwyvl
Contributor

(but don't go looking just yet, as the lack of the as-yet-unreleased dependency-order sort fix will break most downstream use cases, a la #1873 (comment))

@fecet
Contributor

fecet commented Sep 24, 2024

Will do it tomorrow. I'm trying to use pixi to replace conda, so my comparison starts from an environment.yml (no lock file here). For pixi, I run

pixi init --import environment.yml
pixi project export conda-explicit-spec . -vvvv

for conda-lock, I do

conda-lock --kind explicit -f environment.yml -p linux-64 --log-level DEBUG

@synapticarbors
Contributor Author

@fecet If there isn't a lock file, the export triggers an initial solve of all of the platforms/envs to generate the lock file, so any performance issues should be due to machinery outside of the actual export. The same thing would happen if you ran pixi list (or any other command) on an unlocked project. Checking on my local machine, pixi init --import doesn't kick off any solve, so the solve runs the first time you try to export or run any other pixi command.

pixi should generally be at least as fast as conda-lock (which is using mamba under the hood) to resolve the dependencies (although there may be edge cases), so differences might come down to whether each tool has cached the required downloads already.

Again, I'm happy to take a look if you make the environment.yml available somewhere. If you've found a problematic solve issue, then I'm sure the pixi devs would appreciate knowing about it and I'd raise it in a separate issue.

Development

Successfully merging this pull request may close these issues.

Add pixi list --explict
8 participants