Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply Absorption Law when normalizing markers #4536

Closed
charliermarsh opened this issue Jun 26, 2024 · 11 comments · Fixed by #4639
Closed

Apply Absorption Law when normalizing markers #4536

charliermarsh opened this issue Jun 26, 2024 · 11 comments · Fixed by #4639
Labels
enhancement New feature or improvement to existing functionality

Comments

@charliermarsh
Copy link
Member

E.g., python_version < '3.11' or (python_version < '3.11' and implementation_name == 'cpython') should be reduced to python_version < '3.11'.

@charliermarsh charliermarsh added the enhancement New feature or improvement to existing functionality label Jun 26, 2024
@charliermarsh
Copy link
Member Author

@charliermarsh
Copy link
Member Author

Alternatively, maybe we can just apply this in the or and and methods (make them smart constructors)?

@BurntSushi
Copy link
Member

BurntSushi commented Jun 26, 2024

I think smart constructors are the way to go. To make smart constructors work, you have to make sure they are the only things capable of constructing the internal representation. You also need to make all normalization steps inductive. On top of this, I think the normalization logic (which would need to be ported to smart constructors in pep508_rs) currently lives in uv-resolver because it relies on a pubgrub type. So that dependency on pubgrub would probably need to be broken.

I would love to do this personally as I love smart constructors and they've worked out great when I used them in the past (like in regex-syntax). But it's a meaty task I think.

@ibraheemdev
Copy link
Member

ibraheemdev commented Jun 26, 2024

I'm not sure the smart constructor helps with the simplification. For example, the expression ((x and y) or z) is fully simplified, but combining that with or y we get ((x and y) or z) or y, which simplifies to y or z. Importantly, the complete simplification cannot be performed inductively, so a SAT solver (such as this) would need to be run after the entire marker tree has been constructed.

I think it would be helpful to know what kind of expressions the lockfile generates to see how far we need to take this. We can solve this issue using our current recursive approach and see whether more complex cases come up. It's totally possible that an inductive approach that only looks one layer deep is enough for us, but I would hesitate to add smart constructors and them end up not being useful because we need to do a recursive solve anyways.

konstin added a commit that referenced this issue Jun 28, 2024
This includes a functional change, we now skip the forked state pop/push
if we didn't fork.

From transformers:

```
DEBUG Pre-fork split universal took 0.036s
DEBUG Split python_version >= '3.10' and python_version >= '3.10' and platform_system == 'Darwin' and python_version >= '3.11' and python_version >= '3.12' and python_version >= '3.6' and platform_system == 'Linux' and platform_machine == 'aarch64' took 0.048s
DEBUG Split python_version <= '3.9' and platform_system == 'Darwin' and platform_machine == 'arm64' and python_version >= '3.7' and python_version >= '3.8' and python_version >= '3.9' took 0.038s
```

The messages could use simplification from
#4536

We can consider nested spans in the future but this works nicely for
now.
@notatallshaw-gts
Copy link

FYI I came across this today with the following example:

$ echo "pylint" | uv pip compile - --annotation-style line --python-version 3.10 --universal 2>/dev/null | grep dill
dill==0.3.8 ; python_version < '3.11' or python_version >= '3.11'  # via pylint

@charliermarsh
Copy link
Member Author

Confusingly there are Python versions that do not match either of those bounds. For example, 3.11.0a0 would not match either.

@notatallshaw-gts
Copy link

notatallshaw-gts commented Jul 1, 2024

Confusingly there are Python versions that do not match either of those bounds. For example, 3.11.0a0 would not match either.

I don't feel like that's a correct interpretation of Python version numbers.

When you are installing into a Python installation there is only one specific version number available for that Python. So if you are installing into Python 3.11.0a0 then all versions of Python available (i.e. the one you are installing into) are pre-releases and therefore python_version>= 3.11 matches 3.11.0a0.

But that's just me thinking out loud, when I get a moment I'll check what pip does and read through the spec.

@charliermarsh
Copy link
Member Author

I think per the spec it's correct. I actually agree that we should probably simplify that, but I think that's why it doesn't get simplified today.

python_version>= 3.11 is implicitly python_version>= 3.11.0, and 3.11.0a0 is not >= 3.11.0, right?

@notatallshaw-gts
Copy link

and 3.11.0a0 is not >= 3.11.0, right

My understanding is per PEP 440 3.11.0a0 matches >= 3.11.0, when 3.11.0a0 is the only version available because then pre-releases are implied. So my logic was that when you are installing into Python it's version is the only version available, and therefore you can always take pre-releases to be applied.

But, this was just all off the top of my head, I should of really waited to get home and test it out. I will do so later.

@notatallshaw
Copy link
Contributor

notatallshaw commented Jul 1, 2024

Got home and ran some tests and read some specs and I still think uv isn't correct here, I created a seperate issue: #4714

@charliermarsh
Copy link
Member Author

I filed a separate issue for that normalization here: #4719. (It's different than what's described in this issue.)

ibraheemdev added a commit that referenced this issue Jul 8, 2024
## Summary

More marker simplification:
- Filters out redundant subtrees based on outer expressions, e.g. `a and (a or
b)` simplifies to `a`.
- Flattens nested trees internally, e.g. `(a and b) and c`

Resolves #4536.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or improvement to existing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants