Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document that 2-arg mul! methods trashes all arguments #16772

Closed
tkelman opened this issue Jun 5, 2016 · 88 comments
Closed

document that 2-arg mul! methods trashes all arguments #16772

tkelman opened this issue Jun 5, 2016 · 88 comments
Labels
docs This change adds or pertains to documentation linear algebra Linear algebra stdlib Julia's standard library

Comments

@tkelman
Copy link
Contributor

tkelman commented Jun 5, 2016

Example methods are here for Triangular, or here for Diagonal where these follow the BLAS trmm convention of mutating whichever argument is not the type in question. So A_mul_B!(A, B) might mutate A, or it might mutate B, depending on their types. This is the case for several other combinations of input types as well, and I think it's bad for generic programming. Ref collapsed discussion right below #16615 (comment), and #16577 and #16562.

If we can't apply a consistent rule here, "always mutates the second argument" (or, if we apply it consistently everywhere, the first?), then we shouldn't define this method. Better to favor the more explicit A_mul_B!(C, A, B) method where the output is a separate argument. Some combinations of types can allow one of A_mul_B!(A, A, B) or A_mul_B!(B, A, B) to work, but writing code that does that will be inherently non-generic. The individual methods of A_mul_B! should be responsible for checking whether the inputs are === (and possibly more complicated forms of alias checks if we have more view types in common use) and only allowing the specific combinations that are expected to work.

This issue is still present for all the in-place multiplication methods even if #6837 renames all 7 (or 9 if you count possible combinations we don't define right now, ref #5332) variants to methods of mul!. I think this is an explicitly post-0.5 cleanup to make though since we don't have time to deal with it right now.

@tkelman tkelman added linear algebra Linear algebra design Design of APIs or of the language itself labels Jun 5, 2016
@simonbyrne
Copy link
Contributor

simonbyrne commented Jun 5, 2016

I know I've mentioned it before, but I would like to plug the Inplace{N} type I use in InplaceOps.jl.

@stevengj
Copy link
Member

stevengj commented Jun 6, 2016

See also #16702

@tkelman
Copy link
Contributor Author

tkelman commented Aug 1, 2016

Adding to this list the 2-argument scale!(A, b) vs scale!(b, A) from #14425 (and #17739). The ambiguity in which argument gets mutated strikes me as bad for the same reasons as in A_mul_B!, so we should probably require an explicit output argument if you need to be picky about left multiplication vs right-multiplication.

@Keno
Copy link
Member

Keno commented Aug 1, 2016

I don't think scale! is problematic, because in general one of them is a scalar, unlike A_mul_B! which are both arrays.

@tkelman
Copy link
Contributor Author

tkelman commented Aug 1, 2016

There are a few methods where that's not the case:

scale!(A::AbstractArray{T<:Any,2}, b::AbstractArray{T<:Any,1}) at linalg/generic.jl:477
scale!(b::AbstractArray{T<:Any,1}, A::AbstractArray{T<:Any,2}) at linalg/generic.jl:478

If you write code where some inputs may need to have varying dimensionality of scalar, 1, or 2, there's an ambiguity here.

@Keno
Copy link
Member

Keno commented Aug 1, 2016

Arguably those shouldn't exist. I had no idea what they were going to do until I looked at the code. I think that should be explicitly written as

A*Diagonal(b)

(or rather whatever mutating equivalent we have).

@andreasnoack
Copy link
Member

scale! is the mutating equivalent and, as far as I'm aware, nobody who has been using these methods has had problems. With a language that supports dispatch, I think it fine to have scale!(Vector, Matrix) and scale!(Matrix,Vector) instead of scale_matrix_with_vector_left!(Matrix,Vector) and scale_matrix_with_vector_right!(Matrix,Vector).

@tkelman
Copy link
Contributor Author

tkelman commented Aug 1, 2016

What does

function foo(a, b)
    scale!(a, b)
    return a
end

do then? If we can't give a consistent definition for what the generic function verb scale! means in terms of its side effects, that's bad.

@andreasnoack
Copy link
Member

I think we should use types in the definitions. That was the point I tried to make in the last comment.

@Keno
Copy link
Member

Keno commented Aug 1, 2016

I don't have a particularly strong opinion on the topic, but it feels like a pun to me.

@tkelman
Copy link
Contributor Author

tkelman commented Aug 1, 2016

I think we should use types in the definitions

That defeats the purpose of generic programming. I think the non-commutative case would be better served by a scaleleft!(A, b) and scaleright!(A, b) where the mutated argument is the first input, as is the convention in most other ! functions.

edit: or mul!(A, A, b) for right-multiplication by a scalar and mul!(A, b, A) for left-multiplication by a scalar.

@Jutho
Copy link
Contributor

Jutho commented Aug 2, 2016

I am also in favor of only having the 3-argument mul!, where the only difference for the special types (scaling with a vector, or multiplying with a triangular matrix) is that you can use the same output argument as the (other) input argument, whereas you can't for a general matrix. That might still be hard to explain, but a simple detection in the generic matrix multiplication code (which is already present) suffices to catch this.

@StefanKarpinski StefanKarpinski added this to the 0.6.0 milestone Aug 2, 2016
@Sacha0
Copy link
Member

Sacha0 commented Dec 22, 2016

Unlikely to receive attention prior to 0.6. Best!

@StefanKarpinski
Copy link
Sponsor Member

Bump, if any linalg people are looking for things to do.

@StefanKarpinski
Copy link
Sponsor Member

Seems like the two-argument mul! should be documented to return the product and potentially trash either or both of its arguments and that it may return one of its arguments or a new object.

@JeffBezanson JeffBezanson added the docs This change adds or pertains to documentation label Jul 27, 2017
@JeffBezanson
Copy link
Sponsor Member

A majority of triage'ers think this behavior is fine if it's clearly documented, warning people that you can't predict which argument will be mutated.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Jul 27, 2017

The three-argument mul! should be documented to only mutate the output, so if you want to mutate only the left argument of the product, then write mul!(A, A, B); if you want to mutate only the right argument of the product, write mul!(B, A, B); if you want to leave them both intact, write mul!(A, B, C). Without the two-argument form mul!(A, B) there's no way to express that you're ok with either argument being trashed as long as you get the product as a result – this is far less constrained than any of the above three calls since it could return either argument or neither and completely independently it could trash either, both or neither of the inputs depending on what's efficient. It should be clearly documented that this is expert functionality to be used with caution.

@tkelman
Copy link
Contributor Author

tkelman commented Jul 27, 2017

In practice in current package code the two-argument versions are very rarely used, and when they are they are virtually always used for their side effects rather than assigning their output and using that. Given that usage, these are more harmful than helpful since their side effects aren't predictable in a generic way.

@simonbyrne
Copy link
Contributor

simonbyrne commented Jul 27, 2017

A lot of the 3-argument versions will fail or give incorrect results if you reuse input and output arguments:

julia> A = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> B = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> Base.A_mul_B!(A,A,B)
ERROR: ArgumentError: output matrix must not be aliased with input matrix
Stacktrace:
 [1] gemm_wrapper!(::Array{Float64,2}, ::Char, ::Char, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:349
 [2] A_mul_B!(::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:148


julia> A = ones(BigInt, 4, 4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> B = ones(BigInt,4,4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> A_mul_B!(A,A,B)
4×4 Array{BigInt,2}:
 4  7  13  25
 4  7  13  25
 4  7  13  25
 4  7  13  25

@tkelman
Copy link
Contributor Author

tkelman commented Jul 27, 2017

Looks like the BigInt method is skipping the aliasing checks that the Float64 method caught, that should be fixed. 2-arg methods wouldn't be able to work in any situations where 3-arg with the equivalent aliased input/output arguments doesn't.

@StefanKarpinski
Copy link
Sponsor Member

The three-argument methods being broken is a separate issue (which should be fixed, of course). I think the fact that 2-arg mul! is not sufficiently documented and therefore misused is perfectly valid, but not really an argument for deleting it.

@StefanKarpinski StefanKarpinski removed the design Design of APIs or of the language itself label Jul 28, 2017
@simonbyrne
Copy link
Contributor

That problem also arises with the mul2! method suggestion.

You specify it via the function: mul!1 will overwrite the first one, mul!2 the second.

@Jutho
Copy link
Contributor

Jutho commented Nov 21, 2017

Ah, I thought the 2 was just for 2-argument version :-).

At this point, it's probably personal preference. I think the fear for misuse is purely hyptothetical, especially if a more elaborate aliasing checking mechanism would be put in place. But even without, similar to how accidental aliassing does not seem to appear often or cause many problems currently.

Personally, I find mul!(B,A,B) as clear (if not more) as mul!2(A,B), without requiring new syntax conventions (specifiers after the exclamation mark), but I don't use this sufficiently to have any strong opinion on the final decision.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Nov 22, 2017

I'm not even convinced that such checking is even possible in all cases.

Precise checking is probably not possible, but checking if the underlying "buffer" array is the same is possible and we can fall back to a slow, alias-safe implementation in those cases. Of course, we'd also want tooling to help detect such situations and avoid it when it's a performance problem.

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Nov 22, 2017

Nitpick: the ! should come last.

@JeffBezanson
Copy link
Sponsor Member

mul!(A, B) is documented as overwriting one of its arguments, plus this is not even exported, so closing.

@tkelman
Copy link
Contributor Author

tkelman commented Jan 3, 2018

This really shouldn't be a doc issue, "overwrites one argument but which one depends on input types" isn't a useful thing for this function to do.

@JeffBezanson
Copy link
Sponsor Member

Ok, we can reopen it if you want but I don't think it belongs on the 1.0 milestone since the function is not exported.

@JeffBezanson JeffBezanson removed this from the 1.0 milestone Jan 3, 2018
@JeffBezanson
Copy link
Sponsor Member

Some aspects of this are still not totally clear to me. I can imagine two reasons for an API like the current mul!(A, B):

  1. I know something about the types of A and B, and I want the function to use that information to figure out which argument I expect it to mutate, so I don't have to specify e.g. mul2!(A,B) or mul!(B,A,B).
  2. I don't care at all which of A or B is mutated, the function can trash both for all I care, I just want it to do the fastest thing.

These are very different styles of use. The cases @andreasnoack is referring to seem to be (1). Are there any examples of (2)?

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B). Existing 2-argument methods can be mechanically converted to 3-argument methods by checking that the output is === to the expected input and throwing an error if not.

@JeffBezanson JeffBezanson reopened this Jan 3, 2018
@tkelman
Copy link
Contributor Author

tkelman commented Jan 3, 2018

When I last looked at the uses of 2-arg in place multiplication across packages over the summer, hardly any used and referred to the return value, which is the only sane way you might use (2) to do something worthwhile

@Jutho
Copy link
Contributor

Jutho commented Jan 3, 2018

@JeffBezanson , mul!(B,A,B) + checking has been my proposal all along. Furthermore, this resolves the ambiguity when either of the two arguments could in fact be mutated (e.g. Diagonal times Diagonal). I think there was a case of such ambiguity that @Sacha0 bumped into at some point during his amazing work on the mul! transition.

@andreasnoack
Copy link
Member

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B)

I agree with @simonbyrne's criticism of this proposal in #16772 (comment). You don't win much in terms of generic programming when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix.

I still don't get why it is so important to change the name or signature of methods that no users have reported issues with. The methods multiply matrices in-place so the name mul! seems pretty reasonable but if the consensus is that generic programming considerations prohibit the use of the most obvious name for this operation then I'd prefer renaming over the three-argument version.

@JeffBezanson
Copy link
Sponsor Member

when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

@simonbyrne
Copy link
Contributor

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

Isn't it the other way around? At the moment you get a MethodError, if we change it to mul!(B,A,B) then you will either get silent mutation or an error if the method manually checks.

@simonbyrne
Copy link
Contributor

Also, I should point out that the two separate mul1!/mul2! method deal with the Diagonal ambiguity case, as well as a bunch of other ones (I managed to get rid of quite a few definitions). We could replace scale! as well.

@JeffBezanson
Copy link
Sponsor Member

Isn't it the other way around?

Not the way I'm thinking about it. Currently I can write mul!(A, B), expecting B to be mutated. If the types change, this can call a different method that mutates A instead. But if I write mul!(B,A,B) then it has to mutate B, and if the types of A and B make that impossible you get an error. The 3 argument form can't do what I'm calling "silent" mutation, because it says explicitly which value to mutate.

@simonbyrne
Copy link
Contributor

I've updated my previous PR at #25421.

I did have a quick stab at trying the 3-element version, but it was far from clear how to go about it.

@Jutho
Copy link
Contributor

Jutho commented Jan 5, 2018

Thanks for the nice work.

For the alternative proposal of the 3-argument version, what is wrong with replacing
every instance of function f1(A,B)
by

function f(C,A,B)
C === A || copy!(C,A)
...

and similar for f2 type of methods.

Sure, this is a problem if C and A differ but alias, but that is the same now with all mutating methods, and the advanced user that wants to use these methods needs to know about this. If there is at some point a method to check for aliasing that's great, but I don't see this as a breaking point right now.

@simonbyrne
Copy link
Contributor

It's not so straightforward: for some it should throw an error, for others that wouldn't actually work at all (e.g. some of the triangular types return a value of a different type that reuses the underlying array).

If we still wanted to do it, we could implement that on top of #25421.

@Jutho
Copy link
Contributor

Jutho commented Jan 6, 2018

I trust your assessment as I am not too familiar with all the special array types. However, I do find that statement confusing, as above it was mentioned that almost all use-cases of the 2-argument version were not using the return type explicitly, but where just called as mul!(A,B) without assigning the return value.

@Sacha0
Copy link
Member

Sacha0 commented Jan 6, 2018

@simonbyrne,

(e.g. some of the triangular types return a value of a different type that reuses the underlying array)

sounded familiar, so I skimmed base/linalg/triangular.jl to refresh my memory but found no examples. I encountered only the following somewhat idiosyncratic (perhaps accidental?) definition.

mul!(A::Tridiagonal, B::AbstractTriangular) = A*full!(B)

Could you perhaps link a relevant definition or two for consideration? Thanks! :)

@simonbyrne
Copy link
Contributor

@Sacha0
Copy link
Member

Sacha0 commented Jan 12, 2018

Ah, XUnitTriangular -> XTriangular, I see! Thank you! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation linear algebra Linear algebra stdlib Julia's standard library
Projects
None yet
Development

No branches or pull requests