Iterable vs Iterator distinction #2942

Closed
pkch opened this issue Mar 2, 2017 · 4 comments

@pkch
Contributor

pkch commented Mar 2, 2017

I think many Python programmers think of an iterable as a container of items that can be traversed several times. In other words, they would expect the following code to be correct:

def count_max_values(iterable: Iterable) -> int:
    """Count the number of times the maximum value appears in iterable"""
    max_value = max(iterable, default=None)
    counter = 0
    for item in iterable:
        if item == max_value:
            counter += 1
    return counter

Obviously, if the argument were annotated as an Iterator, there would be no doubt that the above implementation is incorrect, because an iterator can only be consumed once.

However, at the moment, the following code would pass the type check:

iter = (i for i in range(5))
count_max_values(iter)

The reason is that the definition of Iterable is currently only concerned with the presence of the __getitem__() or __iter__() methods, and so every Iterator (which must define __iter__) is automatically an Iterable.
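To make the runtime consequence concrete, here is a small sketch that runs the function from above on a list and on a generator expression. The sample values and the variable names from_list and from_generator are illustrative only and are not part of the original report.

```python
from typing import Iterable


def count_max_values(iterable: Iterable) -> int:
    """Count the number of times the maximum value appears in iterable."""
    # First pass: max() consumes a one-shot iterator completely.
    max_value = max(iterable, default=None)
    counter = 0
    # Second pass: yields nothing if the first pass exhausted the argument.
    for item in iterable:
        if item == max_value:
            counter += 1
    return counter


# A list can be traversed twice, so the count is what one would expect.
from_list = count_max_values([1, 3, 3])

# A generator is exhausted by max(), so the loop body never runs.
from_generator = count_max_values(i for i in [1, 3, 3])

print(from_list, from_generator)
```

No exception is raised in either call. The generator case simply returns a wrong count, which is why the problem is easy to miss.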

Would it be worth redefining Iterator and Iterable? For example, the rule could be that if an object defines both __iter__ and __next__ methods, it is not an Iterable (since it's very weird for an iterable to have a __next__ method); otherwise, if it defines __iter__ or __getitem__, it is an Iterable. If necessary, an option could be given to the programmer to explicitly override this rule (marking as Iterable an object with both __iter__ and __next__; or as not Iterable an object with __iter__ and without __next__).
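The rule proposed above can be sketched as a runtime predicate. This is only an illustration of the proposal: the function name looks_like_multipass_iterable is hypothetical, and a real implementation would live in the static type checker rather than in a runtime helper like this one.

```python
def looks_like_multipass_iterable(obj: object) -> bool:
    """Apply the proposed rule to the type of obj (hypothetical helper)."""
    cls = type(obj)
    has_iter = hasattr(cls, "__iter__")
    has_next = hasattr(cls, "__next__")
    has_getitem = hasattr(cls, "__getitem__")

    # Proposed rule, part 1: __iter__ together with __next__ means "iterator",
    # so the object is not treated as an Iterable.
    if has_iter and has_next:
        return False

    # Proposed rule, part 2: otherwise __iter__ or __getitem__ is enough.
    return has_iter or has_getitem


class Squares:
    """Iterable only through the legacy __getitem__ protocol."""

    def __getitem__(self, index: int) -> int:
        if index >= 3:
            raise IndexError(index)
        return index * index


print(looks_like_multipass_iterable([1, 2, 3]))               # list
print(looks_like_multipass_iterable(i for i in range(3)))     # generator
print(looks_like_multipass_iterable(Squares()))               # __getitem__ only
```

The sketch omits the explicit override mentioned in the proposal, since the issue does not say what form it would take.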

@ilevkivskyi
Member

> Would it be worth redefining Iterator and Iterable?

I think not, since the current definition reflects Python's runtime semantics. Your code correctly passes mypy because it runs without errors.

The problem with your code is a design/behavior error, not a type error: the generator expression is already exhausted before the for loop starts. Your code implies that you need an "immutable" iterable, so you could wrap the argument in a tuple at the start of the function: iterable = tuple(iterable). This is something quite difficult to catch statically.
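A minimal sketch of the tuple-wrapping suggestion, applied to the function from the original report. The local name items is an assumption made here for readability; the comment above reuses the name iterable instead.

```python
from typing import Iterable


def count_max_values(iterable: Iterable) -> int:
    """Count the number of times the maximum value appears in iterable."""
    # Materialize the argument once so that it can be traversed twice,
    # even when the caller passes a one-shot iterator.
    items = tuple(iterable)
    max_value = max(items, default=None)
    counter = 0
    for item in items:
        if item == max_value:
            counter += 1
    return counter


print(count_max_values(i for i in range(5)))
```

The trade-off is that the whole input is copied into memory, which may be undesirable for very large or lazily produced inputs.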

@gvanrossum
Member

We are not going to change this, but you can use the type Container for iterables that can be iterated repeatedly.

@pkch
Contributor Author

pkch commented Mar 3, 2017

@gvanrossum You meant Collection, not Container; Container would cause mypy to reject the function as implemented because Container doesn't guarantee the presence of __iter__, which is implicitly used inside the function.
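The difference between Container and Collection can be illustrated with the runtime ABCs in collections.abc. This is only a sketch: the class OnlyContains is hypothetical, and isinstance checks against the ABCs approximate, but are not the same as, what mypy verifies statically.

```python
from collections.abc import Collection, Container, Iterable


class OnlyContains:
    """Supports `in` but cannot be iterated (hypothetical example class)."""

    def __contains__(self, item: object) -> bool:
        return item == 42


box = OnlyContains()
gen = (i for i in range(5))

# Container only promises __contains__, so it says nothing about iteration.
print(isinstance(box, Container), isinstance(box, Iterable))

# Collection requires __len__, __iter__ and __contains__ together.
print(isinstance([1, 2, 3], Collection), isinstance(box, Collection))

# A generator is Iterable, but it has neither __len__ nor __contains__.
print(isinstance(gen, Iterable), isinstance(gen, Collection))
```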

@ilevkivskyi I also thought it would be difficult, but it appears this problem was recently solved in Python 3.6 with the addition of typing.Collection, which works perfectly:

from typing import Collection


def count_max_values(iterable: Collection) -> int:
    """Count the number of times the maximum value appears in iterable"""
    max_value = max(iterable, default=None)
    counter = 0
    for item in iterable:
        if item == max_value:
            counter += 1
    return counter

Now an attempt to call count_max_values(i for i in range(5)) will be statically rejected by mypy.

@gvanrossum
Member

gvanrossum commented Mar 3, 2017 via email
