Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change syntax of azip! to be similar to for loops #626

Merged
merged 1 commit into from
Sep 9, 2019

Conversation

jturner314
Copy link
Member

@jturner314 jturner314 commented May 1, 2019

The azip! macro is a common source of confusion for both new and experienced users. I think this is largely due to its unique and somewhat non-intuitive dereferencing behavior. (For example, it's inconsistent that the pattern corresponding to b is &b, but the pattern corresponding to mut a is mut a. You can see why this is the case (you generally want to dereference in the immutable case but not the mutable case), but it's confusing.)

As I see it, the azip! macro is useful for two reasons:

  1. It's more concise than Zip::from followed by .and and .apply.
  2. It keeps the patterns and producer-generating expressions next to each other so that it's easy to see which parameter name corresponds to which producer.

This PR proposes a new syntax for azip!, based on standard for loops, that still has both of these qualities:

azip!((pat1 in expr1, pat2 in expr2, ...) body_expr)

So, for example, you would write

fn zip_ref_ref<Sa, Sb, D: Dimension>(a: &mut ArrayBase<Sa, D>, b: &ArrayBase<Sb, D>)
where Sa: DataMut<Elem = f64>,
      Sb: Data<Elem = f64>
{
    azip!((a in a, &b in b) *a = 2.0*b);
}

instead of

fn zip_ref_ref<Sa, Sb, D: Dimension>(a: &mut ArrayBase<Sa, D>, b: &ArrayBase<Sb, D>)
where Sa: DataMut<Elem = f64>,
      Sb: Data<Elem = f64>
{
    azip!(mut a (a), b (b) in { *a = 2.0*b });
}

(This example comes from #625.)

The goal of this new syntax is to match the behavior of for loops, which Rust users should be familiar with. The pat and expr in each pat in expr correspond directly to the pat and expr in for pat in expr: pat is the pattern of the item, and expr is the expression creating the producer/iterator.

The special index pattern is handled similarly to how it is now, for example

azip!((index (i, j), &b in &b, &c in &c) {
    a[[i, j]] = b - c;
});

The example above does show one disadvantage of this approach compared to the current version of azip: it's necessary to write &b in &b, while in the current version of azip!, this would simply be b. If desired, we could interpret &b as equivalent to &b in &b. Unfortunately, that wouldn't help with mutable references (because you don't want to dereference the item in the pattern). IMO, the additional clarity and consistency is worth the additional verbosity in that occurs some cases with this PR.

Please let me know what you think!

TODO: Change par_azip too.

Closes #625.

@termoshtt
Copy link
Member

IMO, the additional clarity and consistency is worth the additional verbosity in that occurs some cases with this PR.

I prefer to this verbosity.

it's unique and somewhat non-intuitive dereferencing behavior

I think this confusion is mainly due to () notation in azip!(mut a (a), b (b) in { *a = 2.0*b });. We need identifiers for both iterator and iteratee in azip! use case, and the abbreviation b for &b in &b seems to be overdone.

@LukeMathWalker
Copy link
Member

I have struggled with azip in the past and this definitely looks like a step in the right direction.

@jturner314 jturner314 mentioned this pull request Jul 25, 2019
src/zip/zipmacro.rs Outdated Show resolved Hide resolved
@bluss
Copy link
Member

bluss commented Sep 3, 2019

Great idea.

Old azip works well in the synthetic universe, where arrays are named a, b and c. 😅

The new syntax should work just as well for real world arrays that have longer names or expressions to specify themselves.

I wonder if we could drop the parenthesis around the patterns and trade that for a mandatory block? Would that work out, like this?

azip!(a in array1, &b in array2 { *a = 2.0 * b });

Edit Oh right, I guess that's why we have the in "keyword" in there, and it doesn't work.

With this syntax I'm almost missing that the name of the macro does not use "for" somewhere, it would make it easier to read, like "array for a in array1, ..."?

@jturner314
Copy link
Member Author

Here are a few options:

azip!((a in array1, &b in array2) *a = 2.0 * b);  // a
azip!(for (a in array1, &b in array2) *a = 2.0 * b);  // b
azip!(a in array1, &b in array2 => *a = 2.0 * b);  // c
azip!(for a in array1, &b in array2 => *a = 2.0 * b);  // d

// Require braces around body:
azip!((a in array1, &b in array2) { *a = 2.0 * b });  // e
azip!(for (a in array1, &b in array2) { *a = 2.0 * b });  // f
azip!(a in array1, &b in array2 => { *a = 2.0 * b });  // g
azip!(for a in array1, &b in array2 => { *a = 2.0 * b });  // h

// Move body before loop specification, braces required:
azip!({ *a = 2.0 * b } for a in array1, &b in array2);  // i

I like all of these except for i, which I find confusing because loops in Rust (for, while, while let) are always specified with the looping information before the body.

@bluss
Copy link
Member

bluss commented Sep 4, 2019

My thinking with syntax has generally been that I want to avoid "noise", especially punctuation.

In this:

Zip::from(&mut a).and(&b).and(&c).apply(|a, &b, &c| *a = b + c);

Every symbol that is punctuation - : & ( ) | ; is noise. Some of it necessary, but still noise.
This is the noise:

   ::    (&     ).   (& ).   (& ).     (| , & , & |           );

This is what we want to say:

Zip  from  mut a  and  b  and  c  apply  a   b   c  *a = b + c  

And now you know why I picked some obscure shorthands for azip.

However, I didn't exactly like how azip turned out, and I started preferring plain Zip, it's easier to read even with the noise. For this reason, it's great to have a second go at azip and try to improve it. Some of the shorthands in azip also turned out to be wrong-headed when we expanded with more kinds of NdProducers.

For the alternatives, a and e seem to be the best. (The arrow versions look neat but less so when we have a multiline loop body, maybe you agree=)

Any one of them that are particularly natural to understand for the user? I think all of them are easy to read, I guess it could be tricky to remember how to write all of them. I'm fine with leaving it up to you to pick any variant you'd prefer @jturner314.

By virtue of having patterns exposed like in a for loop and without the pipe symbols for the closure, we can say that we reach some level of noise reduction with azip, hopefully to make it worthwhile above Zip. If we don't do better than Zip (all metrics considered), it doesn't have a point of course.

@jturner314
Copy link
Member Author

It seems like there is consensus that the approach in this PR is good. I've rebased on the latest master and made a small fix to the azip docs; this is ready to merge once CI passes.

I've chosen not to require curly braces around the body expression. The user can always use braces if they prefer, but in simple cases, it's nice not to require the extra noise.

@jturner314 jturner314 merged commit 7d04eb7 into rust-ndarray:master Sep 9, 2019
@jturner314 jturner314 deleted the new-azip branch September 9, 2019 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants