document that 2-arg mul! methods trashes all arguments #338

tkelman · 2016-06-05T15:07:12Z

Example methods are here for Triangular, or here for Diagonal where these follow the BLAS trmm convention of mutating whichever argument is not the type in question. So A_mul_B!(A, B) might mutate A, or it might mutate B, depending on their types. This is the case for several other combinations of input types as well, and I think it's bad for generic programming. Ref collapsed discussion right below JuliaLang/julia#16615 (comment), and JuliaLang/julia#16577 and JuliaLang/julia#16562.

If we can't apply a consistent rule here, "always mutates the second argument" (or, if we apply it consistently everywhere, the first?), then we shouldn't define this method. Better to favor the more explicit A_mul_B!(C, A, B) method where the output is a separate argument. Some combinations of types can allow one of A_mul_B!(A, A, B) or A_mul_B!(B, A, B) to work, but writing code that does that will be inherently non-generic. The individual methods of A_mul_B! should be responsible for checking whether the inputs are === (and possibly more complicated forms of alias checks if we have more view types in common use) and only allowing the specific combinations that are expected to work.

This issue is still present for all the in-place multiplication methods even if JuliaLang/julia#6837 renames all 7 (or 9 if you count possible combinations we don't define right now, ref #57) variants to methods of mul!. I think this is an explicitly post-0.5 cleanup to make though since we don't have time to deal with it right now.

The text was updated successfully, but these errors were encountered:

simonbyrne · 2016-06-05T20:04:36Z

I know I've mentioned it before, but I would like to plug the Inplace{N} type I use in InplaceOps.jl.

stevengj · 2016-06-06T17:07:12Z

See also JuliaLang/julia#16702

tkelman · 2016-08-01T21:18:48Z

Adding to this list the 2-argument scale!(A, b) vs scale!(b, A) from JuliaLang/julia#14425 (and JuliaLang/julia#17739). The ambiguity in which argument gets mutated strikes me as bad for the same reasons as in A_mul_B!, so we should probably require an explicit output argument if you need to be picky about left multiplication vs right-multiplication.

Keno · 2016-08-01T21:20:05Z

I don't think scale! is problematic, because in general one of them is a scalar, unlike A_mul_B! which are both arrays.

tkelman · 2016-08-01T21:25:25Z

There are a few methods where that's not the case:

scale!(A::AbstractArray{T<:Any,2}, b::AbstractArray{T<:Any,1}) at linalg/generic.jl:477
scale!(b::AbstractArray{T<:Any,1}, A::AbstractArray{T<:Any,2}) at linalg/generic.jl:478

If you write code where some inputs may need to have varying dimensionality of scalar, 1, or 2, there's an ambiguity here.

Keno · 2016-08-01T22:14:38Z

Arguably those shouldn't exist. I had no idea what they were going to do until I looked at the code. I think that should be explicitly written as

A*Diagonal(b)

(or rather whatever mutating equivalent we have).

andreasnoack · 2016-08-01T22:25:09Z

scale! is the mutating equivalent and, as far as I'm aware, nobody who has been using these methods has had problems. With a language that supports dispatch, I think it fine to have scale!(Vector, Matrix) and scale!(Matrix,Vector) instead of scale_matrix_with_vector_left!(Matrix,Vector) and scale_matrix_with_vector_right!(Matrix,Vector).

tkelman · 2016-08-01T22:26:54Z

What does

function foo(a, b)
    scale!(a, b)
    return a
end

do then? If we can't give a consistent definition for what the generic function verb scale! means in terms of its side effects, that's bad.

andreasnoack · 2016-08-01T22:29:58Z

I think we should use types in the definitions. That was the point I tried to make in the last comment.

Keno · 2016-08-01T22:35:19Z

I don't have a particularly strong opinion on the topic, but it feels like a pun to me.

tkelman · 2016-08-01T23:28:39Z

I think we should use types in the definitions

That defeats the purpose of generic programming. I think the non-commutative case would be better served by a scaleleft!(A, b) and scaleright!(A, b) where the mutated argument is the first input, as is the convention in most other ! functions.

edit: or mul!(A, A, b) for right-multiplication by a scalar and mul!(A, b, A) for left-multiplication by a scalar.

Jutho · 2016-08-02T04:28:51Z

I am also in favor of only having the 3-argument mul!, where the only difference for the special types (scaling with a vector, or multiplying with a triangular matrix) is that you can use the same output argument as the (other) input argument, whereas you can't for a general matrix. That might still be hard to explain, but a simple detection in the generic matrix multiplication code (which is already present) suffices to catch this.

Sacha0 · 2016-12-22T19:05:38Z

Unlikely to receive attention prior to 0.6. Best!

StefanKarpinski · 2017-07-20T20:27:00Z

Bump, if any linalg people are looking for things to do.

StefanKarpinski · 2017-07-27T18:46:04Z

Seems like the two-argument mul! should be documented to return the product and potentially trash either or both of its arguments and that it may return one of its arguments or a new object.

JeffBezanson · 2017-07-27T18:49:54Z

A majority of triage'ers think this behavior is fine if it's clearly documented, warning people that you can't predict which argument will be mutated.

StefanKarpinski · 2017-07-27T19:02:14Z

The three-argument mul! should be documented to only mutate the output, so if you want to mutate only the left argument of the product, then write mul!(A, A, B); if you want to mutate only the right argument of the product, write mul!(B, A, B); if you want to leave them both intact, write mul!(A, B, C). Without the two-argument form mul!(A, B) there's no way to express that you're ok with either argument being trashed as long as you get the product as a result – this is far less constrained than any of the above three calls since it could return either argument or neither and completely independently it could trash either, both or neither of the inputs depending on what's efficient. It should be clearly documented that this is expert functionality to be used with caution.

tkelman · 2017-07-27T19:04:59Z

In practice in current package code the two-argument versions are very rarely used, and when they are they are virtually always used for their side effects rather than assigning their output and using that. Given that usage, these are more harmful than helpful since their side effects aren't predictable in a generic way.

simonbyrne · 2017-07-27T22:17:57Z

A lot of the 3-argument versions will fail or give incorrect results if you reuse input and output arguments:

julia> A = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> B = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> Base.A_mul_B!(A,A,B)
ERROR: ArgumentError: output matrix must not be aliased with input matrix
Stacktrace:
 [1] gemm_wrapper!(::Array{Float64,2}, ::Char, ::Char, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:349
 [2] A_mul_B!(::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:148


julia> A = ones(BigInt, 4, 4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> B = ones(BigInt,4,4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> A_mul_B!(A,A,B)
4×4 Array{BigInt,2}:
 4  7  13  25
 4  7  13  25
 4  7  13  25
 4  7  13  25

tkelman · 2017-07-27T22:27:56Z

Looks like the BigInt method is skipping the aliasing checks that the Float64 method caught, that should be fixed. 2-arg methods wouldn't be able to work in any situations where 3-arg with the equivalent aliased input/output arguments doesn't.

StefanKarpinski · 2017-07-28T13:30:42Z

The three-argument methods being broken is a separate issue (which should be fixed, of course). I think the fact that 2-arg mul! is not sufficiently documented and therefore misused is perfectly valid, but not really an argument for deleting it.

Jutho · 2017-07-28T13:35:12Z

At some point, I tried implementing a subset of Lapack in pure Julia, and for those case where in place matrix multiplication and division are possible and useful (e.g. triangular matrices), I just still had a three argument version mul!(C,A,B), but where you could just choose C equal to A or B (depending on which of the two is e.g. triangular). That is, there was no two-argument mul!, but you could access it with mul!(B,A,B), if e.g. A is Diagonal or Triangular.

Jutho · 2017-07-28T13:39:25Z

I found this more intuitive, i.e. it is explicitly clear which argument is overwritten. Like the implementation of generic matrix multiplication has to check against aliasing, in that case the implementation would look like

function mul!(C::Matrix,A::Triangular,B::Matrix)
    C !== B && copy!(C,B)
    # ... in place multiplication of A onto C
end

simonbyrne · 2017-11-21T21:25:30Z

That problem also arises with the mul2! method suggestion.

You specify it via the function: mul!1 will overwrite the first one, mul!2 the second.

Jutho · 2017-11-21T22:17:55Z

Ah, I thought the 2 was just for 2-argument version :-).

At this point, it's probably personal preference. I think the fear for misuse is purely hyptothetical, especially if a more elaborate aliasing checking mechanism would be put in place. But even without, similar to how accidental aliassing does not seem to appear often or cause many problems currently.

Personally, I find mul!(B,A,B) as clear (if not more) as mul!2(A,B), without requiring new syntax conventions (specifiers after the exclamation mark), but I don't use this sufficiently to have any strong opinion on the final decision.

StefanKarpinski · 2017-11-22T14:55:40Z

I'm not even convinced that such checking is even possible in all cases.

Precise checking is probably not possible, but checking if the underlying "buffer" array is the same is possible and we can fall back to a slow, alias-safe implementation in those cases. Of course, we'd also want tooling to help detect such situations and avoid it when it's a performance problem.

JeffBezanson · 2017-11-22T15:25:06Z

Nitpick: the ! should come last.

JeffBezanson · 2018-01-03T02:23:11Z

mul!(A, B) is documented as overwriting one of its arguments, plus this is not even exported, so closing.

tkelman · 2018-01-03T02:50:56Z

This really shouldn't be a doc issue, "overwrites one argument but which one depends on input types" isn't a useful thing for this function to do.

JeffBezanson · 2018-01-03T02:55:20Z

Ok, we can reopen it if you want but I don't think it belongs on the 1.0 milestone since the function is not exported.

JeffBezanson · 2018-01-03T03:31:40Z

Some aspects of this are still not totally clear to me. I can imagine two reasons for an API like the current mul!(A, B):

I know something about the types of A and B, and I want the function to use that information to figure out which argument I expect it to mutate, so I don't have to specify e.g. mul2!(A,B) or mul!(B,A,B).
I don't care at all which of A or B is mutated, the function can trash both for all I care, I just want it to do the fastest thing.

These are very different styles of use. The cases @andreasnoack is referring to seem to be (1). Are there any examples of (2)?

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B). Existing 2-argument methods can be mechanically converted to 3-argument methods by checking that the output is === to the expected input and throwing an error if not.

tkelman · 2018-01-03T03:42:15Z

When I last looked at the uses of 2-arg in place multiplication across packages over the summer, hardly any used and referred to the return value, which is the only sane way you might use (2) to do something worthwhile

Jutho · 2018-01-03T05:39:26Z

@JeffBezanson , mul!(B,A,B) + checking has been my proposal all along. Furthermore, this resolves the ambiguity when either of the two arguments could in fact be mutated (e.g. Diagonal times Diagonal). I think there was a case of such ambiguity that @Sacha0 bumped into at some point during his amazing work on the mul! transition.

andreasnoack · 2018-01-03T09:17:13Z

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B)

I agree with @simonbyrne's criticism of this proposal in #338. You don't win much in terms of generic programming when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix.

I still don't get why it is so important to change the name or signature of methods that no users have reported issues with. The methods multiply matrices in-place so the name mul! seems pretty reasonable but if the consensus is that generic programming considerations prohibit the use of the most obvious name for this operation then I'd prefer renaming over the three-argument version.

JeffBezanson · 2018-01-03T15:01:55Z

when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

simonbyrne · 2018-01-03T22:07:51Z

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

Isn't it the other way around? At the moment you get a MethodError, if we change it to mul!(B,A,B) then you will either get silent mutation or an error if the method manually checks.

simonbyrne · 2018-01-03T22:12:52Z

Also, I should point out that the two separate mul1!/mul2! method deal with the Diagonal ambiguity case, as well as a bunch of other ones (I managed to get rid of quite a few definitions). We could replace scale! as well.

JeffBezanson · 2018-01-03T22:59:08Z

Isn't it the other way around?

Not the way I'm thinking about it. Currently I can write mul!(A, B), expecting B to be mutated. If the types change, this can call a different method that mutates A instead. But if I write mul!(B,A,B) then it has to mutate B, and if the types of A and B make that impossible you get an error. The 3 argument form can't do what I'm calling "silent" mutation, because it says explicitly which value to mutate.

simonbyrne · 2018-01-05T22:20:16Z

I've updated my previous PR at JuliaLang/julia#25421.

I did have a quick stab at trying the 3-element version, but it was far from clear how to go about it.

Jutho · 2018-01-05T22:26:08Z

Thanks for the nice work.

For the alternative proposal of the 3-argument version, what is wrong with replacing
every instance of function f1(A,B)
by

function f(C,A,B)
C === A || copy!(C,A)
...

and similar for f2 type of methods.

Sure, this is a problem if C and A differ but alias, but that is the same now with all mutating methods, and the advanced user that wants to use these methods needs to know about this. If there is at some point a method to check for aliasing that's great, but I don't see this as a breaking point right now.

simonbyrne · 2018-01-05T22:49:16Z

It's not so straightforward: for some it should throw an error, for others that wouldn't actually work at all (e.g. some of the triangular types return a value of a different type that reuses the underlying array).

If we still wanted to do it, we could implement that on top of JuliaLang/julia#25421.

Jutho · 2018-01-06T01:43:34Z

I trust your assessment as I am not too familiar with all the special array types. However, I do find that statement confusing, as above it was mentioned that almost all use-cases of the 2-argument version were not using the return type explicitly, but where just called as mul!(A,B) without assigning the return value.

Sacha0 · 2018-01-06T18:20:59Z

@simonbyrne,

(e.g. some of the triangular types return a value of a different type that reuses the underlying array)

sounded familiar, so I skimmed base/linalg/triangular.jl to refresh my memory but found no examples. I encountered only the following somewhat idiosyncratic (perhaps accidental?) definition.

https://github.com/JuliaLang/julia/blob/08620e5c1ec64d4961b78d6b038520cc31c0e2fe/base/linalg/triangular.jl#L449

Could you perhaps link a relevant definition or two for consideration? Thanks! :)

simonbyrne · 2018-01-11T21:16:31Z

Most are in triangular.jl, e.g. https://github.com/JuliaLang/julia/pull/25421/files#diff-f3a0525264ed2a88c32e4ec3f72c8fa4R163

Sacha0 · 2018-01-12T01:16:14Z

Ah, XUnitTriangular -> XTriangular, I see! Thank you! :)

tkelman added linear algebra design Design of APIs or of the language itself labels Jun 5, 2016

tkelman mentioned this issue Aug 1, 2016

fix fallback for scaling blas number with array JuliaLang/julia#17739

Merged

Sacha0 mentioned this issue Jan 16, 2017

A[c|t]_mul_B[!] specializations for SparseMatrixCSC-StridedVecOrMat, less generalized linear indexing and meta-fu, take 2 JuliaLang/julia#20053

Closed

JeffBezanson added the docs This change adds or pertains to documentation label Jul 27, 2017

StefanKarpinski removed the design Design of APIs or of the language itself label Jul 28, 2017

StefanKarpinski changed the title ~~2-arg A_mul_B! should be consistent, or not exist~~ document that 2-arg mul! methods trashes all arguments Jul 28, 2017

simonbyrne mentioned this issue Nov 22, 2017

Clarify which are is modified in mul/div. JuliaLang/julia#24698

Closed

Sacha0 mentioned this issue Dec 11, 2017

lazier, less-jazzy linalg internals JuliaLang/julia#24969

Merged

Jutho mentioned this issue Dec 21, 2017

sunset linalg jazz JuliaLang/julia#25217

Merged

JeffBezanson closed this as completed Jan 3, 2018

JeffBezanson reopened this Jan 3, 2018

simonbyrne mentioned this issue Jan 5, 2018

Clarify which arg is modified in mul/div. JuliaLang/julia#25421

Merged

JeffBezanson closed this as completed Jan 19, 2018

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

document that 2-arg mul! methods trashes all arguments #338

document that 2-arg mul! methods trashes all arguments #338

Comments

tkelman commented Jun 5, 2016 • edited Loading

simonbyrne commented Jun 5, 2016 • edited Loading

stevengj commented Jun 6, 2016

tkelman commented Aug 1, 2016

Keno commented Aug 1, 2016

tkelman commented Aug 1, 2016 • edited Loading

Keno commented Aug 1, 2016

andreasnoack commented Aug 1, 2016

tkelman commented Aug 1, 2016

andreasnoack commented Aug 1, 2016

Keno commented Aug 1, 2016

tkelman commented Aug 1, 2016 • edited Loading

Jutho commented Aug 2, 2016

Sacha0 commented Dec 22, 2016

StefanKarpinski commented Jul 20, 2017

StefanKarpinski commented Jul 27, 2017

JeffBezanson commented Jul 27, 2017

StefanKarpinski commented Jul 27, 2017 • edited Loading

tkelman commented Jul 27, 2017

simonbyrne commented Jul 27, 2017 • edited Loading

tkelman commented Jul 27, 2017

StefanKarpinski commented Jul 28, 2017

Jutho commented Jul 28, 2017

Jutho commented Jul 28, 2017 • edited Loading

simonbyrne commented Nov 21, 2017

Jutho commented Nov 21, 2017

StefanKarpinski commented Nov 22, 2017 • edited Loading

JeffBezanson commented Nov 22, 2017 • edited Loading

JeffBezanson commented Jan 3, 2018

tkelman commented Jan 3, 2018

JeffBezanson commented Jan 3, 2018

JeffBezanson commented Jan 3, 2018

tkelman commented Jan 3, 2018

Jutho commented Jan 3, 2018

andreasnoack commented Jan 3, 2018

JeffBezanson commented Jan 3, 2018

simonbyrne commented Jan 3, 2018

simonbyrne commented Jan 3, 2018

JeffBezanson commented Jan 3, 2018

simonbyrne commented Jan 5, 2018

Jutho commented Jan 5, 2018

simonbyrne commented Jan 5, 2018

Jutho commented Jan 6, 2018

Sacha0 commented Jan 6, 2018

simonbyrne commented Jan 11, 2018

Sacha0 commented Jan 12, 2018

tkelman commented Jun 5, 2016 •

edited

Loading

simonbyrne commented Jun 5, 2016 •

edited

Loading

tkelman commented Aug 1, 2016 •

edited

Loading

tkelman commented Aug 1, 2016 •

edited

Loading

StefanKarpinski commented Jul 27, 2017 •

edited

Loading

simonbyrne commented Jul 27, 2017 •

edited

Loading

Jutho commented Jul 28, 2017 •

edited

Loading

StefanKarpinski commented Nov 22, 2017 •

edited

Loading

JeffBezanson commented Nov 22, 2017 •

edited

Loading