RFC: generator expressions #14848
+1 to the approach – allows getting the feature onto master quickly and with minimal disruption.
I see how the product only needs a hidden
This is fantastic ...but greedy for predicates! 🚶 :)
I'm thinking about how N-d generators should work, in particular whether the generator or the inner iterator should determine the result shape, and how this affects syntax. There are 4 cases, based on whether the desired result is 1-d or n-d, and whether you want to use N variables for N dimensions, or refer to the index tuple as a whole. Here's what the 4 cases look like if the type of generator determines the result shape. I use the following definitions:
We don't have the syntax in the 3rd row, and it seems a bit doubtful. Here's how the last 2 rows change if iterator type determines shape:
This seems to be much more symmetrical, so I conclude that the inside iterator needs to determine the result shape, and that we want only one type of cc @timholy
Also, that analysis implies that something like
Yes, that seems to be the natural choice for a language in which arrays are first-class citizens.
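The four cases discussed above (1-d or N-d result, N variables or a whole index tuple) can be sketched as follows. This is only an illustration under stated assumptions: it uses `Iterators.product` from present-day Julia in place of the `cartesian` placeholder used in this thread, and it assumes the behaviour of a recent Julia release rather than the branch under review.

```julia
# Assumption: a recent Julia release; `Iterators.product` stands in for the
# `cartesian` placeholder used in the discussion.
X, Y = 1:2, 1:3

# N-d result, one variable per dimension
nd_vars = collect(i + j for i in X, j in Y)

# N-d result, index tuple as a whole: the shape comes from the inner iterator
nd_tuple = collect(sum(I) for I in Iterators.product(X, Y))

# 1-d results: flatten the inner iterator first, so the generator sees a vector
flat_pairs = vec(collect(Iterators.product(X, Y)))
flat_vars  = collect(i + j for (i, j) in flat_pairs)
flat_tuple = collect(sum(I) for I in flat_pairs)
```

In each case the generator itself is the same kind of object; only the iterator it wraps decides whether the collected result is a matrix or a vector.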
I also think the iterator type should determine the shape, and I like the proposed (breaking) change. I haven't followed recent changes as carefully as I'd like, but what's the
julia> reshape(collect(Base.product(1:3,1:4,1:3)), (12,3))
12x3 Array{Tuple{Int64,Int64,Int64},2}:
(1,1,1) (1,1,2) (1,1,3)
(2,1,1) (2,1,2) (2,1,3)
(3,1,1) (3,1,2) (3,1,3)
(1,2,1) (1,2,2) (1,2,3)
(2,2,1) (2,2,2) (2,2,3)
(3,2,1) (3,2,2) (3,2,3)
(1,3,1) (1,3,2) (1,3,3)
(2,3,1) (2,3,2) (2,3,3)
(3,3,1) (3,3,2) (3,3,3)
(1,4,1) (1,4,2) (1,4,3)
(2,4,1) (2,4,2) (2,4,3)
(3,4,1) (3,4,2) (3,4,3)
With this line of thinking, your first two cases could be written
_ for I in vec(cartesian(X,Y))
_ for (i,j) in vec(cartesian(X,Y))
The two perspectives can be unified if the shape of the iterator only materializes within
If Generators are iterators with a shape… and that shape is respected by concatenations and comprehensions, etc., then it makes sense that an iterator's shape should be respected by generators themselves, too. Are there other functions or syntaxes that should respect iterator shape? Also… in a somewhat frustrating twist… the pending concatenation change makes the generator comprehension syntax not quite as clear cut.
g = (i for i in 1:10) # a ten element generator
v = [i for i in 1:10] # a ten element vector
x = [g] # a one element vector of generators?
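For comparison, here is how those three lines behave in a recent Julia release. This is an assumption about current behaviour, not a statement about the branch under review or about what was decided in this thread.

```julia
# Assumption: a recent Julia release.
g = (i for i in 1:10)   # lazy generator; nothing is computed yet
v = [i for i in 1:10]   # comprehension; materializes a ten-element vector
x = [g]                 # vector literal holding the generator object itself

n_wrapped = length(x)   # number of elements in the wrapper vector
same_obj  = x[1] === g  # the element is the generator, not its contents
```

Getting the ten values out of `g` takes an explicit `collect(g)`.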
@mbauman If we take a clue from Python, |
I talked earlier about a sister function to collect which could be named |
@timholy Yes, I was thinking of |
Have added NEWS and docs. So far, performance is a bit sub-par and we seem to need more inlining.
I propose merging this. Any final suggestions or objections?
The syntax `f(x) for x in iter` is syntax for constructing an instance of this
type.
"""
immutable Generator{I,F}
Any particular reason you chose to go with `{I,F}`, but `::F, ::I` (for the order)?
Yes, I think `Generator(func, range)` is a bit more natural since it matches the order of comprehensions, but I suspect dispatching on the kind of iterator will be much more common than dispatching on the type of function. For example below I dispatch on `Generator{IteratorND}`.
`Generator(func, range)` also allows `Generator(range) do ... end` syntax.
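A minimal sketch of both points: function-first construction (including the `do` form) and dispatch on the iterator parameter. It assumes a recent Julia where the type is reachable as `Base.Generator` with the parameter order `{I,F}`; `iterkind` is a hypothetical helper introduced only for this illustration.

```julia
# Assumption: a recent Julia where the type is available as Base.Generator.
squares = Base.Generator(x -> x^2, 1:3)   # function first, iterator second

squares_do = Base.Generator(1:3) do x     # the do-block supplies the function
    x^2
end

# Hypothetical helper: the iterator type is the first parameter, so a method
# can dispatch on it without mentioning the function type at all.
iterkind(::Base.Generator{<:AbstractRange}) = "range"
iterkind(::Base.Generator) = "other"

kinds = (iterkind(squares), iterkind(Base.Generator(identity, [1, 2])))
```

Putting the iterator parameter first keeps signatures like `Generator{<:AbstractRange}` short, which matches the reasoning given above for the `{I,F}` order.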
This is going to be so nice. Big 👍
enclosed in parentheses to avoid ambiguity::

    julia> collect(1/(i+j) for i=1:2, j=1:2)
    ERROR: function collect does not accept keyword arguments
This is a bit unfortunate. How about parsing this as a generator instead and using `;` to give keywords in such cases:
collect(1/(i+j) for i=1:2, j=1:2) # 2d generator
collect(1/(i+j) for i=1:2; j=1:2) # 1d generator and keyword argument `j`
collect(j=1:2, 1/(i+j) for i=1:2) # ditto
I would consider giving a syntax error since this is so ambiguous.
I think the presence of the `for` makes this quite unambiguous in the programmer's mind. You wouldn't write a generator and then a keyword argument without adding parentheses around the former.
It's ambiguous for the parser, but as @StefanKarpinski noted, `;` can be used to pass keyword arguments.
You could also write this:
collect((1/(i+j) for i=1:2), j=1:2) # 1d generator and keyword argument `j`
Given that there are so many other ways to express the 1d generator + keyword arg and this particular one looks so much like it should be a 2d generator, it seems to me that it's a bit of a shame not to parse it as such. If you're not convinced by that, at least make it an error so that we can change our minds later without breaking anyone's code.
Julia's parser is "eager" on many occasions: as long as the next token can be part of the current expression, it is. Given all the ambiguity with operators, macro calls, one-line if statements etc., that's an important rule.
Thus this should be a generator, not a keyword. (Or an error, of course.)
I'm in favor of, after a `for i=1:2` in parens, having both `, j=1:2` and `, j in 1:2` be interpreted as more of the generator. If not that, then an error. Another note is that if we eliminated the comma from the comprehension syntax, then I believe this issue would really go away:
[1/(i+j) for i=1:2 j=1:2]
collect(1/(i+j) for i=1:2 j=1:2)
This is currently a syntax error.
What @toivoh said. It might be briefly painful but I think we could consider being more picky about semicolons vs commas for kwargs at call sites.
Haha. Fair enough. I retract the space-separated proposal :-P
I think it's worth pointing out that kwargs aren't the only ambiguous case for `,`. Other comma-sensitive contexts include tuples and array literals as well as assignment/let/return expressions (although assignment in those other contexts is mostly a syntax error).
How about allowing to specify several indices parenthesized after
Or perhaps you might as well just put the parentheses around the whole generator expression.
Ok, I have changed it to make all trailing comma-separated expressions part of the generator.
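A short sketch of what that rule means at a call site, assuming a recent Julia release (the `init` keyword of `sum` is also an assumption about the installed version and is not part of this PR): trailing comma-separated ranges extend the generator, and keyword arguments go after a `;`, here with the generator parenthesized to keep the two visually distinct.

```julia
# Assumption: a recent Julia release where `sum` accepts the `init` keyword.

# Both ranges belong to the generator, so this collects a 2x2 matrix.
A = collect(1/(i+j) for i=1:2, j=1:2)

# Keyword arguments are passed after `;`, with the generator in parentheses.
total = sum((i*j for i=1:2, j=1:3); init=0)
```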
Nice.
If we don't want to include the parsing of
Conflicts: src/julia-parser.scm
Doctests added.
I vote for merging, but only after Tim's inline comments are addressed. They look like pretty minor details.
function (::Type{IteratorND}){I,N}(iter::I, shape::NTuple{N,Integer})
    if length(iter) != prod(shape)
        throw(DimensionMismatch("dimensions $shape must be consistent with iterator length $(iter(a))"))
    end
ERROR: UndefVarError: a not defined
Good catch, thanks.
This introduces the types `Generator`, which maps a function over an iterator, and `IteratorND`, which wraps an iterator with a shape tuple.
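To make the shape-wrapping idea concrete, here is a hedged sketch in present-day Julia syntax. `ShapedIter` is a hypothetical stand-in, not the `IteratorND` type from this PR; it assumes the current iteration interface (`Base.iterate`, `Base.IteratorSize`, `Base.HasShape`), and its error message interpolates `length(iter)` rather than the undefined `a` flagged in the review above.

```julia
# Hypothetical stand-in for the PR's IteratorND, written against the
# iteration interface of a recent Julia release.
struct ShapedIter{I,N}
    iter::I
    dims::NTuple{N,Int}
    function ShapedIter(iter::I, dims::NTuple{N,Int}) where {I,N}
        if length(iter) != prod(dims)
            throw(DimensionMismatch(
                "dimensions $dims must be consistent with iterator length $(length(iter))"))
        end
        return new{I,N}(iter, dims)
    end
end

# Iteration is forwarded to the wrapped iterator; only the shape is new.
Base.iterate(s::ShapedIter, state...) = iterate(s.iter, state...)
Base.IteratorSize(::Type{ShapedIter{I,N}}) where {I,N} = Base.HasShape{N}()
Base.eltype(::Type{ShapedIter{I,N}}) where {I,N} = eltype(I)
Base.size(s::ShapedIter) = s.dims
Base.axes(s::ShapedIter) = map(Base.OneTo, s.dims)
Base.length(s::ShapedIter) = prod(s.dims)

# collect respects the declared shape and fills in column-major order
M = collect(ShapedIter(1:6, (2, 3)))

# A mismatched shape is rejected at construction time
mismatch = try
    ShapedIter(1:5, (2, 3))
    nothing
catch err
    err
end
```

Because a generator forwards the size of whatever it wraps, mapping a function over such a wrapper would also collect into an array of the same shape.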
I realized a good approach to comprehensions would be to add the generalized syntax (#4470) first, and once we are happy with how it works then use it to implement comprehensions.
Example: