Decouple tuple #35812

Merged
merged 8 commits into from
Jul 9, 2023

Conversation

deinst
Contributor

@deinst deinst commented Jun 22, 2023

Change sage/combinat/tuple.py to use itertools instead of GAP.

📚 Description

This fixes #35784.
Iterating Tuples delegates to itertools.product
Iterating UnorderedTuples delegates to itertools.combinations_with_replacement
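
The delegation described above can be sketched in plain Python. This is only an illustration of the two itertools calls named in the description; the ground set `S` and length `k` are made-up values, and the sketch does not attempt to reproduce the output order or element mapping of the actual Sage classes.

```python
from itertools import combinations_with_replacement, product

S = ["a", "b", "c"]  # illustrative ground set, not taken from the PR
k = 2

# Ordered selections with repetition: len(S) ** k results.
ordered = [list(t) for t in product(S, repeat=k)]

# Unordered selections with repetition: each result lists its entries
# in the order in which they appear in S.
unordered = [list(t) for t in combinations_with_replacement(S, k)]

print(len(ordered), len(unordered))
```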

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation accordingly.

⌛ Dependencies

The generators for the classes Tuples and UnorderedTuples used
GAP to generate tuples.  This can be done much more simply
using itertools.

Also fixed a bug where if the underlying list had multiple identical
elements then redundant tuples were generated.
This commit is just a cosmetic change from the last (deleting a
commented out line).  This should finish bug sagemath#35784.
@mkoeppe
Member

mkoeppe commented Jun 23, 2023

Could you add a test that illustrates the fixed bug?

Also removed a blank line that irritated the linter.
@deinst
Contributor Author

deinst commented Jun 23, 2023

This adds checks for duplicates in the underlying lists. It also removes a blank line that annoyed the linter.
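
A check of this kind could look roughly like the following plain-Python sketch. It is not the doctest added in the PR; the helper `distinct_indices` and the input `S` are hypothetical, and the expected counts are the standard formulas for selections with repetition from `n` distinct elements.

```python
from itertools import combinations_with_replacement, product
from math import comb

def distinct_indices(S):
    """First-occurrence index of each distinct element of S, in order."""
    return list(dict.fromkeys(S.index(s) for s in S))

S = [1, 1, 2]  # hypothetical input with a duplicated element
k = 2
idx = distinct_indices(S)
n = len(idx)  # number of distinct elements

tuples = [tuple(S[i] for i in t) for t in product(idx, repeat=k)]
unordered = [tuple(S[i] for i in t)
             for t in combinations_with_replacement(idx, k)]

# No result is repeated, and the counts match the closed formulas.
assert len(tuples) == len(set(tuples)) == n ** k
assert len(unordered) == len(set(unordered)) == comb(n + k - 1, k)
```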

@@ -75,7 +75,7 @@ def __init__(self, S, k):
         """
         self.S = S
         self.k = k
-        self._index_list = [S.index(s) for s in S]
+        self._index_list = list(dict.fromkeys(S.index(s) for s in S))
Contributor

Why not simply use self._index_list = list(range(len(S)))?

Contributor

Maybe self._index_list = range(len(S)) is enough.

Contributor Author

We need to remove duplicates from the list (that is the bug noted in the conversation in the bug report). The reason that I use the dict.fromkeys dance instead of set is so that the order of elements is not changed, leaving the order of the generated tuples the same as before. (The same is true of the seemingly gratuitous reverse in the __iter__)

The reason for the _index_list is to remove duplicates from a list that has non-hashable elements.

range(len(S)) absolutely won't work.
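
To make the point above concrete, here is a small sketch with non-hashable elements. The input `S` is a hypothetical example, and the code only mirrors the idea of the `_index_list` attribute rather than the Sage class itself.

```python
from itertools import product

# Lists are unhashable, so set(S) would raise TypeError here.
S = [[1], [2], [1]]  # hypothetical input with a repeated element

# S.index(s) returns the position of the first element equal to s,
# so equal elements collapse onto the same index.
raw_indices = [S.index(s) for s in S]
index_list = list(dict.fromkeys(raw_indices))

# Iterate over the distinct indices, then map back to the elements.
pairs = [[S[i] for i in idx] for idx in product(index_list, repeat=2)]

# range(len(S)) keeps the duplicate at position 2, so repeats appear.
naive = [[S[i] for i in idx] for idx in product(range(len(S)), repeat=2)]

print(len(pairs), len(naive))
```

With this input the deduplicated index list yields 4 pairs, while `range(len(S))` yields 9, several of which are equal to one another.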

Contributor

I see. Thanks for the precise answer.

Collaborator

I think it is very bad to rely on the dict order behavior (which could change without notice) and it is not clear that this is what the code is trying to do. IMO it would be much better to then run sorted(set(self._index_list)) to get the unique elements.

Collaborator

@tscrim tscrim Jun 27, 2023

They are just ints. Please read what I actually wrote more carefully.

Edit: Perhaps the fact that I was basing everything off the code before the change was unclear; so self._index_list = [S.index(s) for s in S].

Collaborator

I know the order is currently guaranteed, but that is basically an implementation detail that could change (as it is not the fundamental programming model for a dict). It also makes for very brittle code: a change to the insertion order means everything breaks.

Member

The order is guaranteed precisely so that one can rely on it in code.

Collaborator

Even if there is now a strong guarantee that it won't change (which I would not say there is, considering Python has shown it is okay with breaking backwards compatibility), it is still brittle and obfuscated code. Having an explicit sorted makes the intent of the code clear.

Member

> considering Python has shown it is okay with breaking backwards compatibility

If you're referring to the Python 2 -> Python 3 transition, well that won't happen again.
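
For reference, the two spellings debated in this thread give the same list when applied to the pre-change index list. The sketch below uses a hypothetical input `S`. The reason is that a first-occurrence index `v` first shows up in `[S.index(s) for s in S]` at position `v`, so the distinct values already arrive in increasing order.

```python
S = ["b", "a", "b", "c", "a"]  # hypothetical input with repeats

# Pre-change index list: each element mapped to its first occurrence.
raw_indices = [S.index(s) for s in S]

# Order-preserving deduplication (dict keeps insertion order).
by_insertion = list(dict.fromkeys(raw_indices))

# Deduplication with an explicit sort.
by_sorting = sorted(set(raw_indices))

assert by_insertion == by_sorting
```

Under that observation the choice between the two is about readability rather than behavior.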

@@ -178,7 +167,7 @@ def __init__(self, S, k):
         """
         self.S = S
         self.k = k
-        self._index_list = [S.index(s) for s in S]
+        self._index_list = list(dict.fromkeys(S.index(s) for s in S))
Contributor

same here.

@dcoudert
Contributor

It is surprising to have Tuples and UnorderedTuples yielding lists and not tuples, but it might be a significant amount of work to change.

@deinst
Contributor Author

deinst commented Jun 25, 2023

It is no problem to have it return tuples. The time-consuming part is rerunning all the tests. I'll update the pull request as soon as the testing is done (assuming nothing that uses the module breaks).

@deinst
Contributor Author

deinst commented Jun 25, 2023

schemes/projective/projective_space.py expects Tuples to return a list. This appears to be the only place that this assumption is made, and it appears easy to fix (just convert each tuple to a list as we use it), though I am reluctant to poke my fingers in code that I do not completely understand.

@dcoudert
Contributor

I had a look at schemes/projective/projective_space.py and the only required modification is:

-                    if gcd([ai] + tup) == 1:   # when Tuples returns lists
+                    if gcd((ai,) + tup) == 1:  # when Tuples returns tuples

However, it's best to make a specific issue / PR if we change the return type from list to tuple. We can put this in the wishlist for the moment.

Contributor

@dcoudert dcoudert left a comment

LGTM.

This has Tuples and UnorderedTuples return tuples.
Need to fix up schemes/projective/projective_space.py as it assumes the
tuples are lists.
@@ -2374,7 +2374,7 @@ def rational_points(self, bound=0):
             for ai in R:
                 P[i] = ai
                 for tup in S[i - 1]:
-                    if gcd([ai] + tup) == 1:
+                    if gcd([ai] + list(tup)) == 1:
Contributor

Here, gcd((ai,) + tup) is more appropriate.

sage: def A(a, tup):
....:     return [a] + list(tup)
....: 
sage: def B(a, tup):
....:     return (a,) + tup
....: 
sage: tup = tuple(range(10))
sage: %timeit A(100, tup)
134 ns ± 0.0589 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
sage: %timeit B(100, tup)
84.8 ns ± 0.0729 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
sage: tup = tuple(range(4))
sage: %timeit A(100, tup)
123 ns ± 1.49 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
sage: %timeit B(100, tup)
79.1 ns ± 0.0314 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
sage: %timeit gcd(A(100, tup))
7.03 µs ± 4.32 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
sage: %timeit gcd(B(100, tup))
6.95 µs ± 11.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Collaborator

@tscrim tscrim Jun 26, 2023

That's not a fair comparison because of the extra (unnecessary) list cast, which creates a copy. (Recall the old code was returning a list.)

Contributor

> That's not a fair comparison because of the extra (unnecessary) list cast, which creates a copy. (Recall the old code was returning a list.)

I agree. Since the code now returns a tuple, it's better to use (ai,) + tup than casting to list.
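
The tuple-concatenation form agreed on here can be tried outside Sage with a small stand-in. In the sketch below, `math.gcd` folded with `functools.reduce` is an assumption standing in for Sage's `gcd`, which accepts a sequence directly, and `coprime_with` is a hypothetical helper name.

```python
from functools import reduce
from math import gcd  # stand-in for Sage's gcd, which takes a sequence

def coprime_with(ai, tup):
    """Return True if ai and the entries of the tuple tup have gcd 1."""
    # (ai,) + tup builds one new tuple without converting tup to a list.
    return reduce(gcd, (ai,) + tup) == 1

print(coprime_with(6, (10, 15)))  # gcd(6, 10, 15) is 1
print(coprime_with(6, (10, 4)))   # gcd(6, 10, 4) is 2
```

Once the iterator yields tuples, the old spelling `[ai] + tup` raises `TypeError`, because a list and a tuple cannot be concatenated; that is why the call site in projective_space.py had to change at all.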

Part of the ongoing project to free Tuple of GAP.
Contributor

@dcoudert dcoudert left a comment

LGTM.

The use of the python dict guarantee of insertion order of keys was
confusing.  There was no compelling reason (other than an aversion to
change) for keeping insertion order.
@mkoeppe
Member

mkoeppe commented Jun 29, 2023

Is there a remaining work item, or is it ready for review again?

@deinst
Contributor Author

deinst commented Jun 29, 2023

It should be good to go.

Collaborator

@tscrim tscrim left a comment

LGTM. Thanks.

@vbraun
Member

vbraun commented Jul 2, 2023

Merge conflict

SageMath version 10.1.beta5, Release Date: 2023-07-01
@github-actions

github-actions bot commented Jul 2, 2023

Documentation preview for this PR (built with commit f79c0d0; changes) is ready! 🎉

@deinst
Contributor Author

deinst commented Jul 2, 2023

Thanks @mkoeppe. Just for future reference, is there some way I could have foreseen this and saved everyone the extra work?

@mkoeppe
Member

mkoeppe commented Jul 3, 2023

Not really; I'm doing bulk changes at the moment, which often create (easy to resolve) conflicts.

Contributor

@dcoudert dcoudert left a comment

LGTM.

@vbraun vbraun merged commit 9119b4d into sagemath:develop Jul 9, 2023
14 of 15 checks passed
@mkoeppe mkoeppe added this to the sage-10.1 milestone Jul 9, 2023
This pull request was closed.
Development

Successfully merging this pull request may close these issues.

Fix enumeration/cardinality of UnorderedTuples, remove dependency on GAP (for modularization)
5 participants