Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move v0 to an init keyword argument for reduce, etc #27711

Merged
merged 2 commits into from
Jun 28, 2018

Conversation

andyferris
Copy link
Member

@andyferris andyferris commented Jun 21, 2018

The initial value v0 has been moved to a keyword argument init in reduce, mapreduce, foldl, mapfoldl, foldr and mapfoldr.

See related discussions at #27704 and refs within. In particular, this will allow the future addition of multiple input iterators to mapreduce just like we allow for map. We will need to let the deprecations expire before implementing all of #27704. We wouldn't have considered this as practical before because keyword arguments weren't type stable.

To facilitate the optional keyword argument, I introduced a new singleton called NoInit. It seemed unsafe to use nothing as the default value to indicate no specified value, because nothing is beginning to mean something in certain situations and I wanted this to be as generic as possible.

@andyferris andyferris added collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments) triage This should be discussed on a triage call needs nanosoldier run This PR should have benchmarks run on it labels Jun 21, 2018
@@ -59,7 +59,7 @@ julia> a
Set([7, 4, 3, 5, 1])
```
"""
union!(s::AbstractSet, sets...) = foldl(union!, s, sets)
union!(s::AbstractSet, sets...) = foldl(union!, sets; init = s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we don't use spaces around = for keyword arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

base/reduce.jl Outdated
Singleton used for the purpose of indicating to the [`reduce`](@ref) family of functions
that no initial value has been specified for the reduction.
"""
struct NoInit; end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must say I'm not a fan of this. This feels like a slippery slope to singletons for every possible situation. Instead I think it makes sense to use nothing here, which has become pretty conventional. If someone wants to actually have nothing as their initial value, they should use Some(nothing).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I'm definitely in two minds about it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, were you suggesting we unwrap Something here?

Copy link
Member

@ararslan ararslan Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. So init=nothing would mean "no initial value provided," whereas init=Some(nothing) would mean "the initial value is provided and is nothing." That's pretty much the use case for Some. 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - bear with me, I'm trying to get my head around this pattern. If init = Some(1) do we use 1 or Some(1) for the inital value? How do I provide Some(nothing) as the initial element if I need to do that? I'll also need to find the best place to handle the unwrapping elegantly. Do we have a convenience function for unwrapping zero or one Somes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You really only ever need to worry about wrapping with Some if you're dealing explicitly with nothings, which should be an incredibly uncommon case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm not going to express any preference towards using kwargs or not for this, if we do this, let's do without any special magic constants. Here's pseudo code for it (assuming I approximately get the definition of mapfoldl right):

function mapfoldl(f, op, itr; init...)
    if length(init) == 1
        v = init.init
        init = NamedTuple()
    end
    isempty(init) || throw(ArgumentError("incorrect kwarg")) # MethodError?
    for x in itr
        x = f(x)
        if @isdefined v
            v = op(v, x)
        else
            v = x
        end
    end
    return v
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jameson, I like it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of weird that we can do this variadic keyword args but not standard ones. Maybe that's fine but it feels kind of messy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will be a slightly messier implementation but it’s simpler for end users, which is what gains my vote here.

It does make me wonder if the “pattern” could be supported in a more straightforward way by the language.

test/reduce.jl Outdated
@@ -118,12 +118,12 @@ sum2(itr) = invoke(sum, Tuple{Any}, itr)
plus(x,y) = x + y
sum3(A) = reduce(plus, A)
sum4(itr) = invoke(reduce, Tuple{Function, Any}, plus, itr)
sum5(A) = reduce(plus, 0, A)
sum5(A) = reduce(plus, A; int=0)
sum6(itr) = invoke(reduce, Tuple{Function, Int, Any}, plus, 0, itr)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone help me with this? What's the form for invoke with keyword arguments?

Copy link
Member

@ararslan ararslan Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you have to use the gensym'd name, i.e. Symbol("#kw##reduce"), then specify the keyword arguments' types with a NamedTuple type. See for example how functions with keyword arguments are precompiled in base/precompile.jl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it took me a while to grok but I think I've got this sorted now. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not work? invoke(reduce, Tuple{Function, Any}, plus, itr, init=0)

test/reduce.jl Outdated
sum9(A) = mapreduce(x->x, plus, 0, A)
sum10(itr) = invoke(mapreduce, Tuple{Function, Function, Int, Any}, x->x,plus,0,itr)
sum9(A) = mapreduce(x->x, plus, A; init=0)
sum10(itr) = invoke(mapreduce, Tuple{Function, Function, Int, Any}, x->x,plus, 0, itr)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, invoke with keywords?

@andyferris andyferris force-pushed the ajf/reduce-init-keyword branch 2 times, most recently from 96dda6f to 8598a5e Compare June 24, 2018 23:58
@andyferris
Copy link
Member Author

andyferris commented Jun 25, 2018

OK, I've fixed some tests in the first commit.

In the second commit I broke everything again. Here I'm trying to get rid of NoInit using a varargs form of keyword arguments. I've just realized that my implementation is busted because I forgot about Pairs being inserted (I really still don't like this decision).

Like [`reduce`](@ref), but with guaranteed left associativity. `v0` will be used
exactly once.
"""
mapfoldl(f, op, itr; [init])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we use brackets to signal that keyword arguments are optional, see e.g. sum. I'm not sure whether we should or not, but better be consistent about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the examples in titlecase, replace, replace! (and I think some others). Let's discuss and deal with whatever is the best documentation pattern here separately.

@martinholters
Copy link
Member

Would it make sense to also add an init keyword to the specialized reductions, e.g. sum and prod? (Maybe not so much for giving another value than 0 or 1, respectively, but giving that value with the desired type for the empty case, ref. #27766 (comment)).

@andyferris
Copy link
Member Author

@StefanKarpinski is it likely we will be able to get this into v0.7-final? (Otherwise it's v2.0)

@andyferris
Copy link
Member Author

Would it make sense to also add an init keyword to the specialized reductions, e.g. sum and prod?

Seems like a decent idea. Though, lets see if we can merge this quickly and go from there (also, that's potential 1.x material).

@StefanKarpinski
Copy link
Member

Seems like a good idea to me.

@StefanKarpinski StefanKarpinski added this to the 0.7 milestone Jun 26, 2018
@KristofferC
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@ararslan
Copy link
Member

Had to restart the server. @nanosoldier runbenchmarks(ALL, vs=":master")

@andyferris andyferris removed the triage This should be discussed on a triage call label Jun 27, 2018
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@andyferris
Copy link
Member Author

The large regressions shown by nanosoldier are all caused by deprecations affecting BaseBenchmarks.jl. We can fix that separately, but I'm not sure how to use two different versions of BaseBenchmarks on the one nanosoldier comparison job? I have checked performance of mapreduce before and after this PR and I did't observe any regression if you use the new syntax.

(Actually, to me that highlights a design issue with our CI... if the tests we use to check PRs aren't in the same repo (or a git submodule) then how can a single github PR possibly be self-contained?)

The initial value `v0` has been moved to a keyword argument `init` in
`reduce`, `mapreduce`, `foldl`, `mapfoldl`, `foldr` and `mapfoldr`.

(This will allow the future addition of multiple input iterators like
we allow for `map`).
@andyferris
Copy link
Member Author

Fingers crossed for CI this time.

Hack utilizes varargs form of keyword arguments.
@andyferris
Copy link
Member Author

All checks passed. I will merge.

We'll need to update BaseBenchmarks.jl to reflect these deprecations.

@andyferris andyferris merged commit c670739 into master Jun 28, 2018
@andyferris andyferris deleted the ajf/reduce-init-keyword branch June 28, 2018 22:40
@andyferris
Copy link
Member Author

I forgot about accumulate... (I haven't used v0 there before). I guess that'll be another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments) needs nanosoldier run This PR should have benchmarks run on it
Projects
None yet
Development

Successfully merging this pull request may close these issues.